test: improve performance on our slowest tests #4321

terriko · 2024-08-08T00:43:29Z

In #4319 I'm switching pytest to print our longest-running tests so we can look into improving the performance of our test suite. On a random local run, here's what I saw:

======================================== slowest 50 durations ========================================
291.38s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/Cargo.lock-products1]
203.69s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/Gemfile.lock-products2]
119.99s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/requirements.txt-products3]
99.68s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/renv.lock-products0]
49.71s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/go.mod-products6]
38.39s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/package-lock.json-products4]
23.17s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/cpanfile-products9]
11.63s call     test/test_requirements.py::test_requirements
11.18s call     test/test_cli.py::TestCLI::test_EPSS_percentile
11.14s call     test/test_cli.py::TestCLI::test_EPSS_probability
10.74s call     test/test_language_scanner.py::TestLanguageScanner::test_java_package[/home/terri/Code/cve-bin-tool/test/language_data/pom.xml-product_list0]
9.26s setup    test/test_available_fix.py::TestAvailableFixReport::test_debian_backport_fix_output
7.81s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/Package.resolved-products7]
7.31s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package_none_found[/home/terri/Code/cve-bin-tool/test/language_data/fail_pom.xml]
5.22s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/.package-lock.json-products5]
4.83s call     test/test_cli.py::TestCLI::test_sbom_detection
4.83s call     test/test_cli.py::TestCLI::test_CVSS_score
4.82s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/composer.lock-products8]
4.53s call     test/test_cli.py::TestCLI::test_SBOM
3.82s call     test/test_source_purl2cpe.py::TestSourceOSV::test_db_contents[1-False]
2.20s call     test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/pubspec.lock-products10]
2.18s call     test/test_cli.py::TestCLI::test_disabled_sources
2.12s setup    test/test_cli.py::TestCLI::test_extract_bad_zip_messages
2.01s setup    test/test_cli.py::TestCLI::test_sbom_detection
1.65s setup    test/test_cli.py::TestCLI::test_EPSS_probability
1.59s setup    test/test_cli.py::TestCLI::test_SBOM
1.49s call     test/test_output_engine.py::TestOutputEngine::test_output_file_wrapper
1.44s call     test/test_cli.py::TestCLI::test_severity
1.39s setup    test/test_cli.py::TestCLI::test_EPSS_percentile
1.31s call     test/test_cli.py::TestCLI::test_quiet_mode
1.13s call     test/test_merge.py::TestMergeReports::test_valid_merge[filepaths0-merged_data0]
1.00s setup    test/test_cli.py::TestCLI::test_extract_encrypted_zip_messages
0.99s setup    test/test_html.py::TestOutputHTML::test_interactive_mode_print_mode_switching[chromium]
0.95s call     test/test_language_scanner.py::TestLanguageScanner::test_javascript_package_none_found[/home/terri/Code/cve-bin-tool/test/language_data/fail-package-lock.json]
0.89s setup    test/test_cli.py::TestCLI::test_disabled_sources
0.83s setup    test/test_cli.py::TestCLI::test_config_generator[args0-config.yaml-expected_contents0]
0.80s call     test/test_sbom.py::TestSBOM::test_invalid_xml[/home/terri/Code/cve-bin-tool/test/sbom/spdx_test.spdx.xml-cyclonedx-True]
0.73s setup    test/test_cli.py::TestCLI::test_config_generator[args1-config.toml-expected_contents1]
0.70s call     test/test_cli.py::TestCLI::test_extract_bad_zip_messages
0.68s call     test/test_cli.py::TestCLI::test_runs
0.66s call     test/test_sbom.py::TestSBOM::test_invalid_xml[/home/terri/Code/cve-bin-tool/test/sbom/swid_test.xml-cyclondedx-True]
0.65s call     test/test_cli.py::TestCLI::test_skips
0.64s call     test/test_merge.py::TestMergeReports::test_valid_cve_scanner_instance[filepaths0]
0.60s call     test/test_sbom.py::TestSBOM::test_sbom_detection[/home/terri/Code/cve-bin-tool/test/sbom/cyclonedx_test.xml-cyclonedx]
0.55s setup    test/test_cli.py::TestCLI::test_invalid_parameter
0.55s setup    test/test_cli.py::TestCLI::test_version
0.55s setup    test/test_cli.py::TestCLI::test_severity
0.55s setup    test/test_cli.py::TestCLI::test_quiet_mode
0.54s setup    test/test_cli.py::TestCLI::test_skips
0.54s setup    test/test_cli.py::TestCLI::test_usage
====================================== short test summary info =======================================

It looks like our language scanner tests are noticeably slower on my machine. If I had to guess, the primary problem is likely due to the sheer number of products and vulnerabilities those tests look up, so I would start by reducing the test files to look up a minimal number of products and make sure that the products that they look up have a minimal number of vulnerabilities. Exactly how many products you should keep will depend on what's needed to test different parsing and to conform to however a full lock file with dependencies should look for the language, but if you can get enough test coverage with 1 product that has 1 vulnerability, go for it!
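To back up the "language scanner tests dominate" reading with numbers, the durations report above can be summed per test file. This is only a rough sketch: `total_seconds_per_file` and `DURATION_LINE` are hypothetical names made up for illustration, not part of cve-bin-tool or pytest, and the regex assumes the report lines look like the ones pasted above.

```python
import re
from collections import defaultdict

# Matches pytest --durations rows such as:
#   "291.38s call     test/test_language_scanner.py::TestLanguageScanner::..."
# (hypothetical helper pattern, written against the report format shown above)
DURATION_LINE = re.compile(
    r"^\s*(?P<secs>\d+(?:\.\d+)?)s\s+(?P<phase>setup|call|teardown)\s+(?P<nodeid>\S.*)$"
)


def total_seconds_per_file(report_text):
    """Sum the reported durations per test file, slowest file first."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = DURATION_LINE.match(line)
        if match is None:
            continue  # skip banner lines and anything that is not a duration row
        # The test file is everything before the first "::" in the node ID.
        test_file = match.group("nodeid").split("::", 1)[0]
        totals[test_file] += float(match.group("secs"))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Feeding it the saved pytest output (for example `total_seconds_per_file(open("durations.txt").read())`, where `durations.txt` is an assumed file name) gives a list of `(test_file, seconds)` pairs, which makes it easier to compare before and after trimming the test data.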

It's entirely possible that there are also performance gains to be had in the language scanner code if you want to do a deeper dive there too!

Gyan-max · 2025-02-08T18:01:36Z

Hi @terriko, this looks like an interesting issue! I’d love to help speed up the tests. I’ll start by checking the longest-running ones and see if we can reduce the number of product lookups while keeping the coverage solid. I’ll also take a look at the language scanner to see if there are any performance tweaks we can make. Let me know if there is anything specific I should keep in mind. Excited to contribute!

terriko · 2025-02-10T21:19:18Z

@Gyan-max thanks! I think reducing the product lookups is going to make the biggest difference even if we make other performance tweaks, so probably start there.

Shrishti1701 · 2025-03-05T09:55:48Z

@terriko To improve the performance of our test suite, I propose the following solutions:

Reduce Test File Sizes

1. Limit the number of products in each test file to the minimum required for effective parsing validation.
2. Ensure each product has only one vulnerability where possible.

Optimize Vulnerability Lookups

1. Investigate whether unnecessary lookups are being performed.
2. Consider caching or mocking responses to reduce execution time.

Enable Parallel Execution

1. Utilize pytest-xdist (pytest -n auto) to run tests in parallel.

Profile and Optimize Code

1. Use pytest --durations=10 --profile to identify performance bottlenecks (the --profile flag comes from the pytest-profiling plugin, not pytest itself).
2. Optimize the parsing logic and data structures in language_scanner.py for efficiency.

These steps should help reduce execution time while maintaining test coverage.
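As a rough illustration of the caching idea, here is a minimal sketch using `functools.lru_cache`. Everything in it is an assumption made for the example: `slow_lookup`, `cached_lookup`, and the `calls` list are hypothetical stand-ins, not functions from cve-bin-tool, and the real lookup would be a database query rather than a string format.

```python
from functools import lru_cache

calls = []  # records every time the slow path actually runs (for illustration)


def slow_lookup(vendor, product):
    """Hypothetical stand-in for an expensive vulnerability lookup."""
    calls.append((vendor, product))
    # Return an immutable tuple, since cached results are shared between callers.
    return (f"{vendor}:{product}",)


@lru_cache(maxsize=None)
def cached_lookup(vendor, product):
    """Run slow_lookup once per (vendor, product) pair and reuse the result."""
    return slow_lookup(vendor, product)
```

Calling `cached_lookup("vendor", "product")` repeatedly only hits `slow_lookup` the first time. Whether this helps in practice depends on how often the tests repeat the same lookup, so it is worth measuring before and after.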

SachinMugade8797 · 2025-04-02T17:35:34Z

Hi @terriko, I'd like to work on this issue as part of my GSoC preparation.
Could you assign it to me? I will analyze the test performance and suggest improvements.
Thanks!

terriko added the hackathon Issues for folk participating in the Open Ecosystems hackathon label Aug 8, 2024

This was referenced Aug 8, 2024

test: move language scanner tests to longtests #4322

Closed

feat: add support for yarn (fixes #4266) #4290

Merged

terriko added the hacktoberfest good issue for hacktoberfest participation label Oct 1, 2024

terriko added the good first issue Good for newcomers label Jan 10, 2025

Gyan-max linked a pull request Apr 7, 2025 that will close this issue

test: improve performance of language scanner tests #5011

Open
