Fix kornia table #19
Reviewer's Guide by Sourcery
This pull request refactors the benchmark scripts to improve logging, hardware information gathering, and README updating. It also updates the minimum Python version to 3.12 and fixes some minor issues.
Updated class diagram for benchmark scripts
classDiagram
class argparse.ArgumentParser {
+add_argument(...)
+parse_args()
}
class Path {
+exists()
+write_text(...)
+read_text()
+glob(...)
+stem
+open()
}
class logging {
<<module>>
+basicConfig(...)
+getLogger(...)
+info(...)
+warning(...)
+error(...)
}
class pd.DataFrame {
+index
+columns
+loc[...]
+to_csv(...)
+nlargest(...)
+nsmallest(...)
}
class plt {
<<module>>
+subplots(...)
+text(...)
+savefig(...)
+close()
+hist(...)
+axvline(...)
+barh(...)
+yticks(...)
+figtext(...)
+suptitle(...)
+tight_layout(...)
+subplots_adjust(...)
}
class json {
<<module>>
+load(...)
+dump(...)
}
class re {
<<module>>
+search(...)
}
note for logging "Added logging module for better output"
note for Path "Used for file and directory operations"
note for pd.DataFrame "Used for data manipulation and analysis"
note for plt "Used for plotting and visualization"
note for json "Used for reading and writing JSON files"
note for re "Used for regular expressions"
Hey @ternaus - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a comment to the top of the generated markdown files indicating they are auto-generated.
- The hardware information gathering logic is complex; consider simplifying it or adding more tests.
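For the first suggestion, a minimal sketch of stamping generated Markdown with an auto-generated notice. The helper name `with_autogenerated_header`, the marker wording, and the `source` parameter are hypothetical and not taken from this repository.

```python
# Hypothetical marker text; an HTML comment stays invisible in rendered Markdown.
AUTOGEN_MARKER = "<!-- This file is auto-generated. Do not edit by hand. -->"


def with_autogenerated_header(markdown: str, source: str = "") -> str:
    """Return `markdown` with an auto-generated notice as its first line.

    The call is idempotent: text that already starts with the marker is
    returned unchanged.
    """
    if markdown.startswith(AUTOGEN_MARKER):
        return markdown
    header = AUTOGEN_MARKER
    if source:
        # Optionally record which script produced the file.
        header += f"\n<!-- Generated by {source} -->"
    return f"{header}\n\n{markdown}"
```

Because the function returns the text unchanged when the marker is already present, it can be applied on every regeneration without stacking notices.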
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
| Resize | **3532 ± 67** | 1083 ± 21 | 2995 ± 70 | 645 ± 13 | 260 ± 9 | 1.18× |
| RandomCrop128 | **111859 ± 1374** | 45395 ± 934 | 21408 ± 622 | 2946 ± 42 | 31450 ± 249 | 2.46× |
| RandomResizedCrop | **4347 ± 37** | - | - | 661 ± 16 | 837 ± 37 | 5.19× |
| Resize | **3532 ± 67** | 1083 ± 21 | 2995 ± 70 | 645 ± 13 | 260 ± 9 | 1.18x |
suggestion (typo): Replace '×' with 'x'.
The multiplication symbol should be a lowercase 'x', not '×'.
Suggested implementation:
| Resize | **3532 ± 67** | 1083 ± 21 | 2995 ± 70 | 645 ± 13 | 260 ± 9 | 1.18x |
| RandomCrop128 | **111859 ± 1374** | 45395 ± 934 | 21408 ± 622 | 2946 ± 42 | 31450 ± 249 | 2.46x |
| RandomResizedCrop | **4347 ± 37** | - | - | 661 ± 16 | 837 ± 37 | 5.19x |
    ]

    # Extract GPU information from each library's results
issue (complexity): Consider refactoring the GPU parsing logic into a helper function.
This reduces nesting and removes the duplicated regex code while keeping behavior intact. For example:
def extract_gpu_info(pytorch_settings) -> str:
    import re

    # Handle string representation
    if isinstance(pytorch_settings, str):
        if "gpu_available': True" in pytorch_settings:
            gpu_name_match = re.search(r"gpu_name': '([^']+)'", pytorch_settings)
            if gpu_name_match:
                return gpu_name_match.group(1)
            gpu_device_match = re.search(r"gpu_device': ([^,}]+)", pytorch_settings)
            if gpu_device_match:
                return f"GPU {gpu_device_match.group(1)}"
            return "GPU (details unknown)"
    # Handle dict representation
    elif isinstance(pytorch_settings, dict) and pytorch_settings.get("gpu_available", False):
        gpu_name = pytorch_settings.get("gpu_name")
        return gpu_name or f"GPU {pytorch_settings.get('gpu_device', 'Unknown')}"
    return ""
Then in your system summary function, replace the nested GPU logic with a cleaner call:
for file_path in result_files:
    with file_path.open() as f:
        lib_data = json.load(f)
    library = file_path.stem.replace("_results", "")
    lib_metadata = lib_data.get("metadata", {})
    thread_settings = lib_metadata.get("thread_settings", {})
    pytorch_settings = thread_settings.get("pytorch")
    if pytorch_settings:
        gpu = extract_gpu_info(pytorch_settings)
        if gpu:
            gpu_info[library] = gpu
This separates concerns, reduces deep nesting, and makes the code easier to understand and maintain.
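As a possible alternative to the regex branch, and only under the assumption that the string form is a plain `repr()` of a Python dict, the string can be parsed with `ast.literal_eval` and then handled by the dict path alone. The names `normalize_pytorch_settings` and `gpu_label` are hypothetical. Unlike the regex version, this sketch returns an empty string when the string is not a valid literal, so it is not a drop-in replacement.

```python
import ast
from typing import Any


def normalize_pytorch_settings(pytorch_settings: Any) -> dict:
    """Return the settings as a dict, or {} if they cannot be parsed."""
    if isinstance(pytorch_settings, dict):
        return pytorch_settings
    if isinstance(pytorch_settings, str):
        try:
            # Assumption: the string is the repr() of a dict of literals.
            parsed = ast.literal_eval(pytorch_settings)
        except (ValueError, SyntaxError, TypeError):
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}


def gpu_label(pytorch_settings: Any) -> str:
    """Return a GPU label, or "" when no GPU is reported."""
    settings = normalize_pytorch_settings(pytorch_settings)
    if not settings.get("gpu_available", False):
        return ""
    return settings.get("gpu_name") or f"GPU {settings.get('gpu_device', 'Unknown')}"
```

With a single code path after normalization, the string and dict cases cannot drift apart, at the cost of being stricter about malformed input.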
@@ -294,19 +349,24 @@ def main():
     "max_speedup": speedups[args.reference_library].max() if not speedups.empty else 0,
     "max_speedup_transform": speedups[args.reference_library].idxmax() if not speedups.empty else "N/A",
     "min_speedup": speedups[args.reference_library].min() if not speedups.empty else 0,
-    "min_speedup_transform": speedups[args.reference_library].idxmin() if not speedups.empty else "N/A"
+    "min_speedup_transform": speedups[args.reference_library].idxmin() if not speedups.empty else "N/A",
suggestion (code-quality): Swap if/else branches of if expression to remove negation (swap-if-expression)
-    "min_speedup_transform": speedups[args.reference_library].idxmin() if not speedups.empty else "N/A",
+    "min_speedup_transform": "N/A" if speedups.empty else speedups[args.reference_library].idxmin(),
Explanation
Negated conditions are more difficult to read than positive ones, so it is best to avoid them where we can. By swapping the if and else branches around we can invert the condition and make it positive.
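A minimal sketch of the same swap on a plain dict, so that it runs without pandas. Here `speedups` is a hypothetical `{transform: speedup}` mapping rather than the DataFrame used in the script, and both function names are illustrative.

```python
def min_speedup_transform_negated(speedups: dict[str, float]) -> str:
    """Original shape: the condition is negated."""
    is_empty = len(speedups) == 0
    return min(speedups, key=speedups.__getitem__) if not is_empty else "N/A"


def min_speedup_transform(speedups: dict[str, float]) -> str:
    """Swapped shape: same result, positive condition first."""
    is_empty = len(speedups) == 0
    return "N/A" if is_empty else min(speedups, key=speedups.__getitem__)
```

The two functions return the same value for every input; only the order of the branches changes.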
     if is_max:
-        formatted = f"**{formatted}**"
+        return f"**{formatted}**"

     return formatted
suggestion (code-quality): We've found these issues:
- Lift code into else after jump in control flow (reintroduce-else)
- Replace if statement with if expression (assign-if-exp)
-    if is_max:
-        formatted = f"**{formatted}**"
-        return f"**{formatted}**"
-    return formatted
+    return f"**{formatted}**" if is_max else formatted
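Both rules collapse the branch into a single conditional expression. A minimal runnable sketch follows. `format_cell` and its parameters are hypothetical names and not necessarily the script's actual helper.

```python
def format_cell(value: float, std: float, is_max: bool) -> str:
    """Format a table cell as 'value ± std', bolding the row maximum."""
    formatted = f"{value:.0f} ± {std:.0f}"
    # One expression replaces the if/return/return sequence.
    return f"**{formatted}**" if is_max else formatted
```

For example, `format_cell(3532, 67, True)` yields the bold form used for the best result in each row of the benchmark table.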
"""Format time value with optional standard deviation.""" | ||
     return f"{time_ms:.2f} ± {std:.2f}" if std is not None else f"{time_ms:.2f}"


-def get_hardware_info(results: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
+def get_hardware_info(results: dict[str, dict[str, Any]]) -> dict[str, str]:
issue (code-quality): We've found these issues:
- Use named expression to simplify assignment and conditional [×3] (use-named-expression)
- Merge else clause's nested if statement into elif (merge-else-if-into-elif)
- Low code quality found in get_hardware_info - 11% (low-code-quality)
Explanation
The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this?
It might be worth refactoring this function to make it shorter and more readable.
- Reduce the function length by extracting pieces of functionality out into their own functions. This is the most important thing you can do; ideally a function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts sits together within the function rather than being scattered.
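The advice above (guard clauses plus named expressions) can be sketched as a small extracted helper. This is not the actual body of `get_hardware_info`. The function name `first_gpu_name` is hypothetical, and the `metadata`, `thread_settings`, `pytorch`, and `gpu_name` keys are assumed from the GPU parsing snippet earlier in this review.

```python
from typing import Any


def first_gpu_name(results: dict[str, dict[str, Any]]) -> str:
    """Return the first GPU name found in any library's metadata, else ""."""
    for data in results.values():
        thread_settings = data.get("metadata", {}).get("thread_settings", {})
        # Guard clause: skip libraries that recorded no PyTorch settings.
        if not (settings := thread_settings.get("pytorch")):
            continue
        # Named expression: assign and test the lookup in one step.
        if isinstance(settings, dict) and (name := settings.get("gpu_name")):
            return name
    return ""
```

Each early `continue` or `return` removes one level of nesting, which is what the quality score penalizes.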
-    hist_color = '#4878D0'  # Blue
-    top_color = '#60BD68'  # Green
-    bottom_color = '#EE6677'  # Red
+    hist_color = "#4878D0"  # Blue
issue (code-quality): We've found these issues:
- Move assignments closer to their usage (move-assign)
- Extract code out into function (extract-method)
- Extract duplicate code into function (extract-duplicate-method)
- Remove unnecessary calls to enumerate when the index is not used [×2] (remove-unused-enumerate)
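A small sketch of the last rule, combined with extracting the label-building code into its own function. `bar_labels` and its arguments are hypothetical and stand in for the plotting code, which is not shown in this review.

```python
def bar_labels(transforms: list[str], speedups: list[float]) -> list[str]:
    """Build one label per bar, e.g. 'Resize (1.18x)'."""
    # Before: for i, (name, value) in enumerate(zip(transforms, speedups)): ...
    # The index i was never used, so enumerate() can be dropped.
    return [f"{name} ({value:.2f}x)" for name, value in zip(transforms, speedups)]
```

A helper like this could be shared by both bar charts, which is the kind of duplication `extract-duplicate-method` points at.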
     # Save speedups to CSV for reference
-    csv_path = args.output_dir / f'{args.type}_speedups.csv'
+    csv_path = args.output_dir / f"{args.type}_speedups.csv"
issue (code-quality): We've found these issues:
- Extract duplicate code into function (extract-duplicate-method)
- Convert for loop into dictionary comprehension (dict-comprehension)
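The `dict-comprehension` rule can be illustrated with a minimal sketch. `median_throughput` and the shape of `samples` are hypothetical, since the loop Sourcery flagged is not shown here.

```python
from statistics import median


def median_throughput(samples: dict[str, list[float]]) -> dict[str, float]:
    """Median per library, skipping libraries with no samples."""
    # Loop form that the rule replaces:
    #     medians = {}
    #     for library, values in samples.items():
    #         if values:
    #             medians[library] = median(values)
    return {library: median(values) for library, values in samples.items() if values}
```

The comprehension keeps the filter and the mapping in one expression, so there is no intermediate dict to initialize and mutate.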
Summary by Sourcery
Refactors the benchmarking tools to improve code quality, readability, and maintainability. It also updates the documentation and adds logging for better debugging and monitoring.