-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Use match statement in checkers (6) #10552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (97.39%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #10552 +/- ##
==========================================
+ Coverage 95.90% 95.93% +0.03%
==========================================
Files 176 176
Lines 19458 19454 -4
==========================================
+ Hits 18661 18663 +2
+ Misses 797 791 -6
🚀 New features to boost your workflow:
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that it impact performance make me cautious and less eager to merge this kind of refactor. Don't get me wrong there's some banger where a structural pattern matching on astroids.Nodes transform 15 lines into 3... But we should definitely not try to force match everywhere and in particular for string matching I see negative value. (I didn't review everything, and stopped at the refactoring checker)
# "if <compare node>" | ||
if not isinstance(node.test, nodes.Compare): | ||
return False | ||
|
||
# Does not have a single statement in the guard's body | ||
if len(node.body) != 1: | ||
return False | ||
|
||
# Look for a single variable assignment on the LHS and a subscript on RHS | ||
stmt = node.body[0] | ||
if not ( | ||
isinstance(stmt, nodes.Assign) | ||
and len(node.body[0].targets) == 1 | ||
and isinstance(node.body[0].targets[0], nodes.AssignName) | ||
and isinstance(stmt.value, nodes.Subscript) | ||
): | ||
return False | ||
match node: | ||
case nodes.If( | ||
test=nodes.Compare() as test, | ||
body=[ | ||
nodes.Assign( | ||
targets=[nodes.AssignName()], | ||
value=nodes.Subscript(slice=slice_value, value=value), | ||
) | ||
], | ||
): | ||
# "if <compare node>" | ||
# Does not have a single statement in the guard's body | ||
# Look for a single variable assignment on the LHS and a subscript on RHS | ||
pass | ||
case _: | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice !
match call_name: | ||
case "any" | "all": | ||
self.add_message( | ||
"use-a-generator", | ||
node=node, | ||
args=(call_name, inside_comp), | ||
) | ||
case _: | ||
self.add_message( | ||
"consider-using-generator", | ||
node=node, | ||
args=(call_name, inside_comp), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matching on string is always going to impact performance and having an if/elif is not worse... In fact I would argue it's better since the indentation is worse for match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This particular example should actually be faster with match since the if
used to create an inline set every time.
if call_name in {"any", "all"}:
...
Yes, performance for match is partially a concern, but it's really limited to the class
and mapping
patterns. All other ones are pretty close to there if
equivalents if not even faster as the match subject just get's copied on the stack and doesn't need to be reloaded every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we should go back to if call_name == "any" or call_name == "all":
(it's not unreadable enough to warrant a decrease in perf) just for performances.
But I don't see benefits in readability of match agaisnt if x == "string" / elif x == "otherstring" to be honest, and there's going to be a lot of churn involved if we do the same thing for the whole codebase,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't see benefits in readability of match agaisnt if x == "string" / elif x == "otherstring" to be honest, and there's going to be a lot of churn involved if we do the same thing for the whole codebase,
It can look quite nice, especially for more than two case, IMO. An example from an earlier (merged) PR, or just search for case "
pylint/pylint/checkers/format.py
Lines 354 to 373 in 47861c3
match token[1]: | |
case ",": | |
# This is a tuple, which is always acceptable. | |
return | |
case "and" | "or": | |
# 'and' and 'or' are the only boolean operators with lower precedence | |
# than 'not', so parens are only required when they are found. | |
found_and_or = True | |
case "yield": | |
# A yield inside an expression must always be in parentheses, | |
# quit early without error. | |
return | |
case "for": | |
# A generator expression always has a 'for' token in it, and | |
# the 'for' token is only legal inside parens when it is in a | |
# generator expression. The parens are necessary here, so bail | |
# without an error. | |
return | |
case "else": | |
# A generator expression can have an 'else' token in it. |
Overall the churn also isn't as bad as one might think. The pattern isn't that common after all.
I do share your concerns. Unfortunately, even with my CPython PRs, I likely won't be able to fix the situation for 3.10-3.13. At best one of them might get backported to 3.14.1. That being said, the more I work with match especially in the context of pylint, the harder it gets to write the corresponding if statements. I just saw that yesterday as I was working on some more match checks. To me that just shows how much more natural and easier to understand they are compared to multiline if statements. What I'm trying to say is that the performance impact is hopefully temporary but the readability and maintainability improvements are definitely real. -- |
I agree with changing to match for astroid nodes. But let's not take a lot of resource to use match on string either. 3.13 is going to last 4 years and there's a lot of things that we can do that are more valuable than rewriting old pylint code (making match faster in cpython, or creating new pylint check using match from the get go being some of them 😄 ). I had a look at your PR in cpython, great stuff ! |
match name: | ||
case "dict": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted the call_name
one but left this in place. With just match name
, case "dict": ...
and case "set": ...
I think match is an improvement.
This comment has been minimized.
This comment has been minimized.
🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉 This comment was generated for commit adbae7b |
No description provided.