
add CI to detect performance regressions #53

Merged
2bndy5 merged 3 commits into main from add-perf-ci on Oct 4, 2024

Conversation

@2bndy5 (Collaborator) commented Oct 3, 2024

Compares two release builds of the cpp-linter binary along with the pure-python package:

  1. the previous commit (for push events) or the base branch of a PR
  2. the newest commit on the branch
  3. the latest v1.x release of the pure-python cpp-linter package

Caching is enabled to reduce CI runtime.

Results are output to the CI workflow's job summary. This CI does not (currently) fail when a regression is detected.
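
(For illustration only: a minimal sketch of how the benchmark and summary steps could be wired together, assuming hyperfine's --export-markdown/--export-json options and the standard GITHUB_STEP_SUMMARY file. The benchmarked commands are placeholders, not the actual cpp-linter invocations from perf-test.yml.)

- name: Benchmark previous vs. current build
  run: |
    # <args> is a placeholder for the real lint invocation used by the workflow
    hyperfine \
      --runs 2 \
      --export-markdown "${{ runner.temp }}/benchmark.md" \
      --export-json "${{ runner.temp }}/benchmark.json" \
      './previous/cpp-linter <args>' \
      './current/cpp-linter <args>'

- name: Publish results to the job summary
  run: cat "${{ runner.temp }}/benchmark.md" >> "$GITHUB_STEP_SUMMARY"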

Summary by CodeRabbit

  • New Features

    • Introduced a new GitHub Actions workflow for automated performance regression testing of the cpp-linter.
    • Added a script for analyzing performance benchmarks, providing insights on performance changes between builds.
  • Bug Fixes

    • Improved handling of performance regression detection with clear output messages for users.

@2bndy5 added the enhancement (New feature or request) label on Oct 3, 2024
@2bndy5 force-pushed the add-perf-ci branch 5 times, most recently from a58ee67 to a215876 on October 3, 2024 22:17
@2bndy5 (Collaborator, Author) commented Oct 4, 2024

This is bugging me 😡

Locally I invoke the same commands (using the exact same machine, dual-booted):

  • on Windows it takes around 12 seconds
  • on Linux it takes around 55 seconds

In the CI workflow (which uses ubuntu-latest), it takes around 165 seconds to run!

At least the runtime is consistent between pure-python and pure-rust. It does not seem to matter if I use clang v18 or v14.

@2bndy5 marked this pull request as ready for review on October 4, 2024 03:46
coderabbitai bot (Contributor) commented Oct 4, 2024

Walkthrough

A new performance regression testing workflow has been added to the cpp-linter project through the introduction of the perf-test.yml file in the GitHub Actions workflows. This workflow includes three jobs: building the project for current and previous commits, benchmarking performance differences, and reporting when no source changes occur. Additionally, a new script named perf_annotate.py has been created to analyze benchmark results from a JSON file, providing insights into performance changes and potential regressions.

Changes

  • .github/workflows/perf-test.yml: Introduced a new workflow for performance regression testing with jobs for building, benchmarking, and reporting.
  • .github/workflows/perf_annotate.py: Added a script to analyze performance benchmarks from a JSON file, calculating differences and outputting results (see the sketch below).
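
(The script itself is not shown in this summary. As a rough illustration of the analysis described, reading hyperfine's JSON export and reporting the relative change between two builds, a shell/jq equivalent could look like the snippet below; the file path, result ordering, and output wording are assumptions, not taken from perf_annotate.py.)

#!/bin/bash
# Assumption: benchmark.json is hyperfine's --export-json output, with the
# previous build listed first and the current build second in `results`.
prev=$(jq '.results[0].mean' "$RUNNER_TEMP/benchmark.json")
curr=$(jq '.results[1].mean' "$RUNNER_TEMP/benchmark.json")
pct=$(awk -v p="$prev" -v c="$curr" 'BEGIN { printf "%.2f", (c - p) / p * 100 }')
echo "Mean runtime changed by ${pct}% (previous: ${prev}s, current: ${curr}s)" >> "$GITHUB_STEP_SUMMARY"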

Possibly related PRs

  • feat: add optional colored log output #52: Modifications to logging functionality, including a new dependency for colored log output, which may relate to the logging aspects of the performance testing workflow.

Poem

🐰 In the meadow, where benchmarks play,
A new workflow hops in, brightening the day.
With builds and tests, it dances around,
Reporting changes, where performance is found.
So let’s cheer for the code, both swift and spry,
As we measure and analyze, oh me, oh my! 🌼


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 0bdda0a and 409a238.

📒 Files selected for processing (2)
  • .github/workflows/perf-test.yml (1 hunks)
  • .github/workflows/perf_annotate.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/workflows/perf-test.yml
  • .github/workflows/perf_annotate.py


@2bndy5 (Collaborator, Author) commented Oct 4, 2024

I give up for now. I may play with this later when I experiment with improving the async performance in rust.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)
.github/workflows/perf-test.yml (2)

22-63: LGTM: Build job is well-structured with efficient caching.

The build job is well-designed, using a matrix strategy to build both current and previous commits efficiently. The caching mechanism is implemented correctly, which should significantly improve performance across runs.

One minor suggestion for improvement:

Consider adding a step to validate the cached binaries, ensuring they are not corrupted or incomplete. This could be a simple check like verifying the file size or running a quick command with the binary.

Example:

- name: Validate cached binary
  if: steps.cache.outputs.cache-hit == 'true'
  run: |
    if [ ! -x target/release/cpp-linter ] || ! target/release/cpp-linter --version; then
      echo "Cached binary is invalid, rebuilding..."
      echo "cache-hit=false" >> $GITHUB_OUTPUT
    fi

50-50: Minor: Address shellcheck warnings for improved script robustness.

There are a few instances where shellcheck has identified potential issues with environment variable usage. While these are minor, addressing them can improve the robustness of the script.

Apply the following changes to address the shellcheck warnings:

- run: echo "is-cached=${{ steps.cache.outputs.cache-hit }}" >> $GITHUB_OUTPUT
+ run: echo "is-cached=${{ steps.cache.outputs.cache-hit }}" >> "$GITHUB_OUTPUT"

- run: echo "is-cached=${{ steps.cache.outputs.cache-hit }}" >> $GITHUB_OUTPUT
+ run: echo "is-cached=${{ steps.cache.outputs.cache-hit }}" >> "$GITHUB_OUTPUT"

- run: cat ${{ runner.temp }}/benchmark.md >> $GITHUB_STEP_SUMMARY
+ run: cat "${{ runner.temp }}/benchmark.md" >> "$GITHUB_STEP_SUMMARY"

These changes add double quotes around the environment variables to prevent potential issues with word splitting or globbing.

Also applies to: 54-54, 121-121

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 023c170 and d7484f6.

📒 Files selected for processing (2)
  • .github/workflows/perf-test.yml (1 hunks)
  • .github/workflows/perf_annotate.py (1 hunks)
🧰 Additional context used
🪛 actionlint
.github/workflows/perf-test.yml

50-50: shellcheck reported issue in this script: SC2086:info:1:58: Double quote to prevent globbing and word splitting

(shellcheck)


54-54: shellcheck reported issue in this script: SC2086:info:1:58: Double quote to prevent globbing and word splitting

(shellcheck)


121-121: shellcheck reported issue in this script: SC2086:info:1:40: Double quote to prevent globbing and word splitting

(shellcheck)

🔇 Additional comments (4)
.github/workflows/perf-test.yml (4)

1-21: LGTM: Workflow trigger configuration is well-defined.

The workflow trigger configuration is appropriate for performance regression testing. It runs on pushes and pull requests to the main branch, with specific path filters to ensure it only executes when relevant files are changed. This approach helps to optimize CI resources by avoiding unnecessary runs.
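
(For illustration, a trigger block of the shape described above might look like the following; the exact path filters are assumptions, not copied from perf-test.yml.)

on:
  push:
    branches: [main]
    paths: ['src/**', 'Cargo.*', '.github/workflows/perf-test.yml']
  pull_request:
    branches: [main]
    paths: ['src/**', 'Cargo.*', '.github/workflows/perf-test.yml']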


130-135: LGTM: report-no-src-changes job is simple and effective.

The report-no-src-changes job serves its purpose well. It runs only when both builds are cached, indicating no changes to the cpp-linter source code. The simple notice it logs provides clear feedback in the GitHub Actions UI, which is helpful for developers and maintainers.
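
(Illustrative only: such a job could be as small as a single notice step gated on the cached-build output mentioned elsewhere in this review. Job and output names below are assumptions, not taken from perf-test.yml.)

report-no-src-changes:
  needs: build
  # Simplified: the real job presumably checks that both builds were cached.
  if: needs.build.outputs.is-cached == 'true'
  runs-on: ubuntu-latest
  steps:
    - run: echo "::notice::No changes to cpp-linter source code; benchmark was skipped."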


1-135: Overall: Well-structured workflow with room for addressing environment-specific performance concerns.

The workflow is well-designed and aligns with the PR objectives of implementing CI for performance regression detection. It covers building, benchmarking, and reporting, with efficient use of caching to optimize CI resources.

However, there's an important concern raised in the PR comments that this workflow doesn't address:

The significant performance discrepancies observed between local and CI environments (12s on Windows, 55s on Linux locally, vs 165s in CI) could lead to unreliable performance regression detection. To address this:

  1. Consider adding environment information logging to the benchmark job. This could include details like CPU model, available memory, and any other relevant system information. For example:
- name: Log environment information
  run: |
    echo "CPU Info:"
    lscpu
    echo "Memory Info:"
    free -h
    echo "Disk Info:"
    df -h
  2. Investigate the possibility of using a self-hosted runner that more closely matches the development environment. This could provide more consistent and relevant performance metrics.

  3. Add a step to compare the benchmark results against known baselines for different environments. This could help in normalizing the results and making them more comparable across different setups.

By addressing these points, you can improve the reliability and usefulness of the performance regression detection in your CI process.
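
(Relatedly, the PR description notes that the workflow does not yet fail when a regression is detected. A hedged sketch of a threshold-based check, reusing hyperfine's JSON export, might look like this; the 10% threshold, file path, and result ordering are assumptions.)

- name: Fail on significant regression (illustrative)
  run: |
    prev=$(jq '.results[0].mean' "$RUNNER_TEMP/benchmark.json")
    curr=$(jq '.results[1].mean' "$RUNNER_TEMP/benchmark.json")
    # Fail the job if the current build is more than 10% slower than the previous one
    awk -v p="$prev" -v c="$curr" 'BEGIN { if ((c - p) / p > 0.10) exit 1 }'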


65-128: 🛠️ Refactor suggestion

LGTM: Benchmark job is comprehensive, with room for minor improvements.

The benchmark job is well-structured and provides a comprehensive performance comparison between different builds. The use of hyperfine for benchmarking and the comparison with the pure-python implementation are excellent choices.

Suggestions for improvement:

  1. Consider increasing the number of runs in the hyperfine command (currently set to 2) for more statistically significant results. A value between 5 and 10 might provide a better balance between accuracy and execution time.

  2. Add warm-up runs to the hyperfine command to reduce the impact of cold starts. For example:

hyperfine \
  --warmup 2 \
  --runs 5 \
  # ... (rest of the command)
  3. Consider adding a step to compare the benchmark results with a predefined threshold to automatically detect significant performance regressions. This could be done using a custom script or a GitHub Action specifically designed for benchmark comparisons.

To ensure the benchmarking process is consistent and reliable, let's verify the existence and contents of the perf_annotate.py script:

✅ Verification successful

LGTM: Benchmark job verification successful.

The benchmarking setup is correctly implemented with the presence of perf_annotate.py, ensuring performance results are appropriately annotated. The benchmark job configuration is comprehensive and reliable.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the existence and contents of perf_annotate.py
if [ -f .github/workflows/perf_annotate.py ]; then
  echo "perf_annotate.py exists. Contents:"
  cat .github/workflows/perf_annotate.py
else
  echo "perf_annotate.py does not exist in the expected location."
fi

Length of output: 1837

@2bndy5 merged commit 1ea0ea6 into main on Oct 4, 2024 (11 checks passed)
@2bndy5 deleted the add-perf-ci branch on October 4, 2024 05:06