-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Execute one regex at a time #6
Conversation
giturlparse/parser.py
Outdated
m2 = re.search(regexp, self._url) | ||
|
||
if m1: | ||
regexp.append(regexp2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer we not have the temporary regexp1 and regexp2 variables and simply assign the regexp directly into the list on regexp definition.
giturlparse/parser.py
Outdated
@@ -70,29 +70,32 @@ def parse(self): | |||
'owner': None, | |||
} | |||
|
|||
regexp = (r'^(https?|git|ssh|rsync)\://' | |||
regexp = [] | |||
regexp1 = (r'^(https?|git|ssh|rsync)\://' | |||
'(?:(.+)@)*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this produces an error in pycodestyle
giturlparse/parser.py
Outdated
'([:/](.+)/(.+).git)') | ||
] | ||
|
||
if re.search(regexes[0], self._url): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason to have an array is so that you can do looping.
you'll need to define capture groups in the two regexes for this to work.
then you have one problem: re.sub(r'^:', '', m2.group(4))
needs to be converted to a single capture group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jayvdb, I am not getting how we are going to do looping. Actually, capture groups in each of the regexes are different.
you'll need to define capture groups in the two regexes for this to work.
Can you explain this line a bit more?
Also, I am thinking of replacing m2 with m4 in my other PR which matches all URLs in m2.
So, maybe we don't have to solve re.sub(r'^:', '', m2.group(4))
problem?
giturlparse/parser.py
Outdated
m = re.search(regex, self._url) | ||
try: | ||
d['pathname'] = m.group('path') | ||
except Exception: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should prob target the proper exception. I believe python throws a AttributeError
when trying to reference a group that is not found vs catching any Exception.
giturlparse/parser.py
Outdated
try: | ||
if re.search(regex, self._url): | ||
m = re.search(regex, self._url) | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use groupdict
to loop over the values of the match object
giturlparse/parser.py
Outdated
d['owner'] = gdict.get('owner', None) | ||
d['name'] = gdict.get('repo', None) | ||
break | ||
except Exception: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catch precise exceptions please.
and limit the try: except:
to only include the code which will cause that exception
giturlparse/parser.py
Outdated
if re.search(regex, self._url): | ||
m = re.search(regex, self._url) | ||
gdict = m.groupdict() | ||
d['pathname'] = gdict.get('pathname', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None
is unnecessary.
giturlparse/parser.py
Outdated
for regex in regexes: | ||
if re.search(regex, self._url): | ||
m = re.search(regex, self._url) | ||
gdict = m.groupdict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d['user'] = gdict.get('user', None)
you are playing with two dicts with the same member names.
That is a dead giveaway that you should be doing a loop, not playing with each of the members by name.
but more importantly, you can do bulk dict
manipulations.
d.update( m.groupdict() )
Then you need to solve the protocol
= ssh
fallback a smarter way.
@retr0h, looks good? :) |
@retr0h ping. |
Closes retr0h/git-url-parse#5