Fix kornia table #19

Merged
merged 2 commits into from
Mar 12, 2025

Conversation

ternaus
Contributor

@ternaus ternaus commented Mar 12, 2025

Summary by Sourcery

Refactors the benchmarking tools to improve code quality, readability, and maintainability. It also updates the documentation and adds logging for better debugging and monitoring.

Enhancements:

  • Refactor benchmarking tools for improved code quality and maintainability, including adding logging and improving hardware information detection.
  • Update the documentation to reflect the changes in the benchmarking tools.
  • Improve hardware information detection to provide more accurate details in benchmark reports.
  • Update the speedup calculation to use 'x' instead of '×' in the generated reports and documentation for better compatibility and readability.
  • Add auto-generated comment to the top of the README files to indicate that they are automatically generated and should not be edited directly.
  • Update the Albumentations hardware info to always report '1 core', because the benchmark itself runs with a fixed CPU thread count.
  • Update the minimum Python version to 3.12.
  • Update ruff configuration to target Python 3.12 and exclude tools directory from linting.
  • Update mypy configuration to target Python 3.12.
  • Update pre-commit configuration to include tools directory in mypy checks.
  • Update the albumentations hardware info to always use 'CPU (1 core)'

Tests:

  • Add logging to the benchmarking tools for better debugging and monitoring.
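
A minimal sketch of the kind of logging setup this summary refers to, with a timestamp and level name on every record. The format string, the `configure_logging` helper, and the logger name `benchmark_tools` are assumptions for illustration; the configuration actually used in the tools directory is not shown on this page.

```python
import logging

# Assumed format: timestamp, level name, then the message.
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Configure the root logger and return a named logger (name is hypothetical)."""
    # force=True replaces any handlers installed earlier, so repeated calls are safe.
    logging.basicConfig(level=level, format=LOG_FORMAT, force=True)
    return logging.getLogger("benchmark_tools")


logger = configure_logging()
logger.info("Loaded %d result files", 3)
logger.warning("No GPU information found for %s", "kornia")
```

Passing arguments separately from the message, as above, defers string formatting until a record is actually emitted.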


sourcery-ai bot commented Mar 12, 2025

Reviewer's Guide by Sourcery

This pull request refactors the benchmark scripts to improve logging, hardware information gathering, and README updating. It also updates the minimum Python version to 3.12 and fixes some minor issues.

Updated class diagram for benchmark scripts

classDiagram
    class argparse.ArgumentParser {
        +add_argument(...)
        +parse_args()
    }

    class Path {
        +exists()
        +write_text(...)
        +read_text()
        +glob(...)
        +stem
        +open()
    }

    class logging {
     <<module>>
        +basicConfig(...)
        +getLogger(...)
        +info(...)
        +warning(...)
        +error(...)
    }

    class pd.DataFrame {
        +index
        +columns
        +loc[...]
        +to_csv(...)
        +nlargest(...)
        +nsmallest(...)
    }

    class plt {
     <<module>>
        +subplots(...)
        +text(...)
        +savefig(...)
        +close()
        +hist(...)
        +axvline(...)
        +barh(...)
        +yticks(...)
        +figtext(...)
        +suptitle(...)
        +tight_layout(...)
        +subplots_adjust(...)
    }

    class json {
     <<module>>
        +load(...)
        +dump(...)
    }

    class re {
     <<module>>
        +search(...)
    }

    note for logging "Added logging module for better output"
    note for Path "Used for file and directory operations"
    note for pd.DataFrame "Used for data manipulation and analysis"
    note for plt "Used for plotting and visualization"
    note for json "Used for reading and writing JSON files"
    note for re "Used for regular expressions"

File-Level Changes

Change: Refactor benchmark scripts to improve logging, hardware information gathering, and README updating.
Details:
  • Configured logging with timestamps and level names.
  • Improved hardware information gathering, especially for GPU-based libraries.
  • Added an auto-generated comment to the top of README files.
  • Updated the speedup calculation to use 'x' instead of '×'.
  • Fixed CPU core count for Albumentations to always use 1 core.
  • Added support for torchvision in video benchmarks.
  • Added more exception handling.
  • Added type hints.
  • Removed unused imports.
Files:
  • tools/generate_speedup_plots.py
  • tools/compare_video_results.py
  • tools/compare_results.py
  • docs/images/README.md
  • docs/images/results.md
  • docs/videos/README.md
  • docs/videos/results.md
  • pyproject.toml
  • .pre-commit-config.yaml

Change: Update minimum Python version to 3.12.
Details:
  • Changed requires-python in pyproject.toml to be '>=3.12'.
  • Removed Python 3.7, 3.8, 3.9, 3.10, and 3.11 from the classifiers in pyproject.toml.
Files:
  • pyproject.toml
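
One way the auto-generated marker on README files could be applied is sketched below. The marker text and the `with_autogen_comment` helper are hypothetical; the exact comment the tools write is not shown on this page.

```python
# Hypothetical marker text; the real comment written by the tools may differ.
AUTOGEN_COMMENT = "<!-- This file is auto-generated. Do not edit directly. -->"


def with_autogen_comment(markdown: str) -> str:
    """Prepend the marker unless the document already starts with it."""
    if markdown.startswith(AUTOGEN_COMMENT):
        return markdown
    return f"{AUTOGEN_COMMENT}\n\n{markdown}"


readme = with_autogen_comment("# Image Benchmarks\n")
```

Checking for the marker first keeps the function idempotent, so regenerating a README does not stack multiple comments.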

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@ternaus ternaus merged commit 8160c25 into main Mar 12, 2025
1 check passed

@sourcery-ai sourcery-ai bot left a comment


Hey @ternaus - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding a comment to the top of the generated markdown files indicating they are auto-generated.
  • The hardware information gathering logic is complex; consider simplifying it or adding more tests.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good


| Resize | **3532 ± 67** | 1083 ± 21 | 2995 ± 70 | 645 ± 13 | 260 ± 9 | 1.18× |
| RandomCrop128 | **111859 ± 1374** | 45395 ± 934 | 21408 ± 622 | 2946 ± 42 | 31450 ± 249 | 2.46× |
| RandomResizedCrop | **4347 ± 37** | - | - | 661 ± 16 | 837 ± 37 | 5.19× |
| Resize | **3532 ± 67** | 1083 ± 21 | 2995 ± 70 | 645 ± 13 | 260 ± 9 | 1.18x |

suggestion (typo): Replace '×' with 'x'.

The multiplication symbol should be a lowercase 'x', not '×'.

Suggested implementation:

| Resize               | **3532 ± 67**             | 1083 ± 21        | 2995 ± 70         | 645 ± 13          | 260 ± 9                 | 1.18x                            |

| RandomCrop128        | **111859 ± 1374**         | 45395 ± 934      | 21408 ± 622       | 2946 ± 42         | 31450 ± 249             | 2.46x                            |

| RandomResizedCrop    | **4347 ± 37**             | -                | -                 | 661 ± 16          | 837 ± 37                | 5.19x                            |
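
The speedup strings in these rows could be produced by a helper like the one below. This is a sketch, not the project's code: the `format_speedup` name and its parameters are hypothetical, and the assumption that the speedup is the reference throughput divided by the best competing throughput is inferred only from the three rows above, which it happens to reproduce.

```python
def format_speedup(reference_throughput: float, best_other_throughput: float) -> str:
    """Return the ratio as text such as '1.18x', using ASCII 'x' rather than '×'."""
    # Guard against a missing or zero competitor value.
    if best_other_throughput <= 0:
        return "N/A"
    return f"{reference_throughput / best_other_throughput:.2f}x"


resize = format_speedup(3532, 2995)
random_crop = format_speedup(111859, 45395)
random_resized_crop = format_speedup(4347, 837)
```

Using the ASCII letter avoids depending on how the multiplication sign renders in terminals, CSV exports, or fonts that lack it.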

]

# Extract GPU information from each library's results

issue (complexity): Consider refactoring the GPU parsing logic into a helper function to reduce nesting and code duplication, improving readability and maintainability while keeping behavior intact.

Consider refactoring the GPU parsing logic into a helper function. This will reduce nesting and duplicate regex code while keeping behavior intact. For example:

import re
from typing import Any


def extract_gpu_info(pytorch_settings: Any) -> str:
    """Return a GPU label from PyTorch settings given as a dict or its string repr."""
    # Handle string representation
    if isinstance(pytorch_settings, str):
        if "gpu_available': True" in pytorch_settings:
            gpu_name_match = re.search(r"gpu_name': '([^']+)'", pytorch_settings)
            if gpu_name_match:
                return gpu_name_match.group(1)
            gpu_device_match = re.search(r"gpu_device': ([^,}]+)", pytorch_settings)
            if gpu_device_match:
                return f"GPU {gpu_device_match.group(1)}"
            return "GPU (details unknown)"
    # Handle dict representation
    elif isinstance(pytorch_settings, dict) and pytorch_settings.get("gpu_available", False):
        gpu_name = pytorch_settings.get("gpu_name")
        return gpu_name or f"GPU {pytorch_settings.get('gpu_device', 'Unknown')}"
    # No GPU reported, or an unsupported settings type
    return ""

Then in your system summary function, replace the nested GPU logic with a cleaner call:

for file_path in result_files:
    with file_path.open() as f:
        lib_data = json.load(f)
    library = file_path.stem.replace("_results", "")
    lib_metadata = lib_data.get("metadata", {})
    thread_settings = lib_metadata.get("thread_settings", {})
    pytorch_settings = thread_settings.get("pytorch")
    if pytorch_settings:
        gpu = extract_gpu_info(pytorch_settings)
        if gpu:
            gpu_info[library] = gpu

This separates concerns, reduces deep nesting, and makes the code easier to understand and maintain.
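
As an alternative to the regex-based string handling, the string form could be parsed back into a dict first, so a single code path handles both representations. This is a different technique from the one suggested above, sketched under the assumption that the stored string is a plain Python dict repr containing only literals; `parse_pytorch_settings` is a hypothetical helper, and "Example GPU" is placeholder data.

```python
import ast
from typing import Any


def parse_pytorch_settings(pytorch_settings: Any) -> dict[str, Any]:
    """Return the settings as a dict, parsing a dict repr string if needed."""
    if isinstance(pytorch_settings, dict):
        return pytorch_settings
    if isinstance(pytorch_settings, str):
        try:
            parsed = ast.literal_eval(pytorch_settings)
        except (ValueError, SyntaxError):
            # Not a literal dict repr; treat it as "no settings".
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}


def extract_gpu_info(pytorch_settings: Any) -> str:
    """Return a GPU label, or an empty string when no GPU is reported."""
    settings = parse_pytorch_settings(pytorch_settings)
    if not settings.get("gpu_available", False):
        return ""
    gpu_name = settings.get("gpu_name")
    return gpu_name or f"GPU {settings.get('gpu_device', 'Unknown')}"


from_string = extract_gpu_info("{'gpu_available': True, 'gpu_name': 'Example GPU'}")
from_dict = extract_gpu_info({"gpu_available": True, "gpu_device": 0})
```

The trade-off is strictness: a string containing non-literal values fails to parse and yields an empty result, whereas the regex version can still pick out individual fields. The fallback label also differs ("GPU Unknown" here versus "GPU (details unknown)" in the string branch above).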

@@ -294,19 +349,24 @@ def main():
"max_speedup": speedups[args.reference_library].max() if not speedups.empty else 0,
"max_speedup_transform": speedups[args.reference_library].idxmax() if not speedups.empty else "N/A",
"min_speedup": speedups[args.reference_library].min() if not speedups.empty else 0,
"min_speedup_transform": speedups[args.reference_library].idxmin() if not speedups.empty else "N/A"
"min_speedup_transform": speedups[args.reference_library].idxmin() if not speedups.empty else "N/A",

suggestion (code-quality): Swap if/else branches of if expression to remove negation (swap-if-expression)

Suggested change
- "min_speedup_transform": speedups[args.reference_library].idxmin() if not speedups.empty else "N/A",
+ "min_speedup_transform": "N/A" if speedups.empty else speedups[args.reference_library].idxmin(),


Explanation

Negated conditions are more difficult to read than positive ones, so it is best
to avoid them where we can. By swapping the if and else conditions around we
can invert the condition and make it positive.
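
The equivalence of the two forms can be checked with a small sketch. It uses a plain dict as a stand-in for the pandas object in the PR, so `is_empty` and `min(..., key=...)` are assumptions replacing `speedups.empty` and `idxmin()`; the sample values are the three speedups quoted earlier on this page.

```python
def min_transform_negated(speedups: dict[str, float]) -> str:
    """Original shape: the negated condition comes first."""
    is_empty = len(speedups) == 0
    return min(speedups, key=speedups.get) if not is_empty else "N/A"


def min_transform_positive(speedups: dict[str, float]) -> str:
    """Swapped shape: positive condition, same result."""
    is_empty = len(speedups) == 0
    return "N/A" if is_empty else min(speedups, key=speedups.get)


sample = {"Resize": 1.18, "RandomCrop128": 2.46, "RandomResizedCrop": 5.19}
slowest = min_transform_positive(sample)
```

Both functions return the same value for every input; only the reading order of the branches changes.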

Comment on lines 34 to 37
if is_max:
formatted = f"**{formatted}**"
return f"**{formatted}**"

return formatted

suggestion (code-quality): We've found these issues:

Suggested change
if is_max:
formatted = f"**{formatted}**"
return f"**{formatted}**"
return formatted
return f"**{formatted}**" if is_max else formatted
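
Put together, a cell formatter using this single conditional return might look like the sketch below. The `format_cell` name, its parameters, and the zero-decimal formatting are assumptions chosen to match the look of the table rows quoted earlier, not the function from the PR.

```python
from typing import Optional


def format_cell(mean: float, std: Optional[float] = None, is_max: bool = False) -> str:
    """Format 'mean ± std' (or just the mean) and bold the row maximum."""
    formatted = f"{mean:.0f} ± {std:.0f}" if std is not None else f"{mean:.0f}"
    # One conditional expression replaces the if-block with two return statements.
    return f"**{formatted}**" if is_max else formatted


best = format_cell(3532, 67, is_max=True)
other = format_cell(1083, 21)
```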

"""Format time value with optional standard deviation."""
return f"{time_ms:.2f} ± {std:.2f}" if std is not None else f"{time_ms:.2f}"


def get_hardware_info(results: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
def get_hardware_info(results: dict[str, dict[str, Any]]) -> dict[str, str]:

issue (code-quality): We've found these issues:


Explanation

The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

  • Reduce the function length by extracting pieces of functionality out into
    their own functions. This is the most important thing you can do - ideally a
    function should be less than 10 lines.
  • Reduce nesting, perhaps by introducing guard clauses to return early.
  • Ensure that variables are tightly scoped, so that code using related concepts
    sits together within the function rather than being scattered.
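
The extraction and guard-clause advice could be applied roughly as follows. This is a sketch, not the body of the real get_hardware_info, which is not shown on this page: the `describe_hardware` helper and the plain "CPU" fallback label are hypothetical, while the metadata keys and the 'CPU (1 core)' label for Albumentations are taken from elsewhere in this conversation. "Example GPU" is placeholder data.

```python
from typing import Any


def describe_hardware(library: str, metadata: dict[str, Any]) -> str:
    """Return a short hardware label for one library's benchmark metadata."""
    # Guard clause: Albumentations is always reported as a single CPU core.
    if library == "albumentations":
        return "CPU (1 core)"

    pytorch_settings = metadata.get("thread_settings", {}).get("pytorch")
    # Guard clauses: without usable PyTorch settings there is no GPU to report.
    if not isinstance(pytorch_settings, dict):
        return "CPU"
    if not pytorch_settings.get("gpu_available", False):
        return "CPU"

    return pytorch_settings.get("gpu_name") or "GPU (details unknown)"


def get_hardware_info(results: dict[str, dict[str, Any]]) -> dict[str, str]:
    """Map each library name to its hardware label."""
    return {
        library: describe_hardware(library, data.get("metadata", {}))
        for library, data in results.items()
    }


results = {
    "albumentations": {"metadata": {}},
    "kornia": {
        "metadata": {
            "thread_settings": {
                "pytorch": {"gpu_available": True, "gpu_name": "Example GPU"}
            }
        }
    },
}
hardware = get_hardware_info(results)
```

Each early return removes one level of nesting, and the outer function shrinks to a single comprehension.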

hist_color = '#4878D0' # Blue
top_color = '#60BD68' # Green
bottom_color = '#EE6677' # Red
hist_color = "#4878D0" # Blue

issue (code-quality): We've found these issues:


# Save speedups to CSV for reference
csv_path = args.output_dir / f'{args.type}_speedups.csv'
csv_path = args.output_dir / f"{args.type}_speedups.csv"

issue (code-quality): We've found these issues:

@ternaus ternaus deleted the fix_kornia_table branch March 12, 2025 21:52