adding benchmarking, currently only for http funcs #247

albertedwardson · 2025-07-27T17:52:54Z

running

pytest --codspeed --codspeed-warmup-time=1 --codspeed-max-rounds=10000 --codspeed-max-time=10

on microopts branch

                                                   Benchmark Results                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃                                                        Benchmark ┃ Time (best) ┃ Rel. StdDev ┃ Run time ┃     Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│          TestBenchmarkHttp::test_bench_dict_to_raw[long_headers] │         2ns │       15.3% │    9.54s │ 6,960,000 │
│   TestBenchmarkHttp::test_bench_dict_to_raw[many_unique_headers] │     2,765ns │        1.4% │   10.00s │   190,000 │
│ TestBenchmarkHttp::test_bench_dict_to_raw[many_repeated_headers] │       917ns │        3.6% │   10.00s │   330,000 │
│          TestBenchmarkHttp::test_bench_raw_to_dict[long_headers] │         2ns │        6.5% │    9.36s │ 6,360,000 │
│   TestBenchmarkHttp::test_bench_raw_to_dict[many_unique_headers] │     1,303ns │       10.4% │   10.00s │   280,000 │
│ TestBenchmarkHttp::test_bench_raw_to_dict[many_repeated_headers] │     1,120ns │        1.5% │   10.00s │   300,000 │
└──────────────────────────────────────────────────────────────────┴─────────────┴─────────────┴──────────┴───────────┘

on microopts branch commit 789e5de

                                                   Benchmark Results                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃                                                        Benchmark ┃ Time (best) ┃ Rel. StdDev ┃ Run time ┃     Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│          TestBenchmarkHttp::test_bench_dict_to_raw[long_headers] │         3ns │        4.4% │    9.36s │ 5,590,000 │
│   TestBenchmarkHttp::test_bench_dict_to_raw[many_unique_headers] │     1,743ns │        1.4% │   10.00s │   240,000 │
│ TestBenchmarkHttp::test_bench_dict_to_raw[many_repeated_headers] │       441ns │        8.1% │    9.91s │   470,000 │
│          TestBenchmarkHttp::test_bench_raw_to_dict[long_headers] │         2ns │        4.3% │    9.59s │ 6,530,000 │
│   TestBenchmarkHttp::test_bench_raw_to_dict[many_unique_headers] │     1,298ns │        2.8% │   10.00s │   280,000 │
│ TestBenchmarkHttp::test_bench_raw_to_dict[many_repeated_headers] │     1,177ns │        1.7% │   10.00s │   290,000 │
└──────────────────────────────────────────────────────────────────┴─────────────┴─────────────┴──────────┴───────────┘

on main

                                                   Benchmark Results                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓
┃                                                        Benchmark ┃ Time (best) ┃ Rel. StdDev ┃ Run time ┃     Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩
│          TestBenchmarkHttp::test_bench_dict_to_raw[long_headers] │         4ns │        6.5% │    9.50s │ 4,810,000 │
│   TestBenchmarkHttp::test_bench_dict_to_raw[many_unique_headers] │     3,580ns │        7.2% │   10.00s │   170,000 │
│ TestBenchmarkHttp::test_bench_dict_to_raw[many_repeated_headers] │       251ns │       11.9% │   10.00s │   630,000 │
│          TestBenchmarkHttp::test_bench_raw_to_dict[long_headers] │         8ns │        3.7% │    9.67s │ 3,420,000 │
│   TestBenchmarkHttp::test_bench_raw_to_dict[many_unique_headers] │     1,918ns │        7.0% │   10.00s │   230,000 │
│ TestBenchmarkHttp::test_bench_raw_to_dict[many_repeated_headers] │     1,595ns │        7.7% │   10.00s │   250,000 │
└──────────────────────────────────────────────────────────────────┴─────────────┴─────────────┴──────────┴───────────┘

headers_dict_to_raw regressed in the many_repeated_headers case (e.g. cookies): 251ns on main versus 441ns at commit 789e5de and 917ns on the microopts branch.

I believe this is because using bytes leads to frequent object re-creation and copying, which is more expensive than growing a bytearray, especially when building larger payloads.
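A minimal sketch of the difference described above, using only the standard library. The helper names `join_with_bytes` and `join_with_bytearray` and the sample header lines are hypothetical; they are not the functions in w3lib.http, and the timings will vary by machine and Python version, so treat this as an illustration rather than a measurement of the PR.

```python
import timeit


def join_with_bytes(chunks):
    """Build the payload with immutable bytes: each += allocates a new object."""
    out = b""
    for chunk in chunks:
        out += chunk + b"\r\n"
    return out


def join_with_bytearray(chunks):
    """Build the payload in a mutable bytearray that grows in place."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        buf += b"\r\n"
    return bytes(buf)


# Hypothetical input: many repeated header lines, e.g. cookies.
CHUNKS = [b"Set-Cookie: session=%d" % i for i in range(500)]

if __name__ == "__main__":
    for func in (join_with_bytes, join_with_bytearray):
        seconds = timeit.timeit(lambda: func(CHUNKS), number=50)
        print(f"{func.__name__}: {seconds:.4f}s for 50 runs")
```

Both helpers return the same bytes; only the way the intermediate buffer is built differs, which is the part that may matter more as the payload grows.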

headers_raw_to_dict, on the other hand, improves over main in all cases.

given that real-world headers are rarely tiny, I plan to stick with bytearray for now, as it provides more consistent performance under realistic conditions.
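To put the three tables side by side, here is a small sketch that computes the change in best time relative to main. The numbers are copied from the tables above; the dictionary layout and the `relative_change` helper are only illustrative. The single-digit nanosecond rows are likely too close to the timer granularity for their ratios to mean much.

```python
# Best times in nanoseconds, copied from the three benchmark tables.
BEST_NS = {
    "dict_to_raw[long_headers]": {"main": 4, "789e5de": 3, "microopts": 2},
    "dict_to_raw[many_unique_headers]": {"main": 3580, "789e5de": 1743, "microopts": 2765},
    "dict_to_raw[many_repeated_headers]": {"main": 251, "789e5de": 441, "microopts": 917},
    "raw_to_dict[long_headers]": {"main": 8, "789e5de": 2, "microopts": 2},
    "raw_to_dict[many_unique_headers]": {"main": 1918, "789e5de": 1298, "microopts": 1303},
    "raw_to_dict[many_repeated_headers]": {"main": 1595, "789e5de": 1177, "microopts": 1120},
}


def relative_change(new_ns, base_ns):
    """Percentage change of new_ns relative to base_ns (negative means faster)."""
    return (new_ns - base_ns) / base_ns * 100.0


for name, times in BEST_NS.items():
    base = times["main"]
    at_commit = relative_change(times["789e5de"], base)
    on_branch = relative_change(times["microopts"], base)
    print(f"{name:36s} 789e5de {at_commit:+7.1f}%   microopts {on_branch:+7.1f}%")
```

A positive percentage marks a slowdown against main, so the many_repeated_headers row of dict_to_raw is the only one that comes out positive on both runs.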

codecov · 2025-07-27T20:01:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.96%. Comparing base (f45e3ff) to head (d8f400b).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #247   +/-   ##
=======================================
  Coverage   97.96%   97.96%           
=======================================
  Files           9        9           
  Lines         491      491           
  Branches       83       83           
=======================================
  Hits          481      481           
  Misses          6        6           
  Partials        4        4

adding benchmarking, currently only for http funcs

6e44d36

albertedwardson added a commit to albertedwardson/w3lib that referenced this pull request Jul 27, 2025

stick with bytearray, see scrapy#247

a5d30c0

albertedwardson mentioned this pull request Jul 27, 2025

Small micro optimisations in w3lib.http #246

Merged

albertedwardson force-pushed the pytest-benchmark branch from b5fa1b2 to 6e44d36 Compare July 27, 2025 18:25

albertedwardson added 2 commits July 27, 2025 21:25

Merge branch 'scrapy:master' into pytest-benchmark

9671cf0

wip, will add new workflow

d8f400b
