Improve performance of derangement/subfactorial with iterative implementation #146

Merged: 3 commits into JuliaMath:master on May 2, 2025

Conversation

FedericoStra
Contributor

@FedericoStra FedericoStra commented Dec 10, 2023

This PR changes the implementation of derangement, and hence also of subfactorial,
to evaluate the recurrence !n = (n-1) * (!(n-1) + !(n-2)) iteratively. The formula is given at https://en.wikipedia.org/wiki/Derangement#Counting_derangements.

For inputs such as subfactorial(100), I measure a speed-up of roughly 10x.
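
As a rough illustration of the recurrence above, here is a minimal sketch of an iterative evaluation. The name `subfactorial_two_term` is hypothetical and this is not the code from the PR; it only assumes the usual base cases !0 = 1 and !1 = 0.

```julia
# Hypothetical helper, not the code from this PR: iterative evaluation of
#   !n = (n-1) * (!(n-1) + !(n-2)),  with !0 = 1 and !1 = 0.
function subfactorial_two_term(n::Integer)
    n < 0 && throw(DomainError(n, "n must be nonnegative"))
    n == 0 && return big(1)
    prev, curr = big(1), big(0)          # (!0, !1)
    for k in 2:n
        # Shift the window: (!(k-2), !(k-1)) becomes (!(k-1), !k).
        prev, curr = curr, (k - 1) * (prev + curr)
    end
    return curr
end

subfactorial_two_term(5)   # 44
```

Each iteration keeps only the two most recent values, so the loop avoids deep recursion, but every `+` and `*` on `BigInt` still allocates a new number.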

codecov bot commented Dec 10, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.93%. Comparing base (403dcb4) to head (e2b762b).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
+ Coverage   96.89%   96.93%   +0.03%     
==========================================
  Files           8        8              
  Lines         805      815      +10     
==========================================
+ Hits          780      790      +10     
  Misses         25       25              


@FedericoStra
Contributor Author

With this change, the benchmark goes from

julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  185.527 μs … 47.124 ms  ┊ GC (min … max):  0.00% … 60.54%
 Time  (median):     189.635 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   267.191 μs ±  1.711 ms  ┊ GC (mean ± σ):  14.89% ±  2.31%

  ▆█▇▆▆▅▄▃▂▃▃▃▃▂▁▁▁▂▁▁▁                                        ▂
  █████████████████████▇▇▇▅▆▇▅▄▆▆█▇█▇▇▇▇▇▆▆▅▅▃▅▄▄▃▄▄▃▁▄▁▁▄▃▃▁▃ █
  186 μs        Histogram: log(frequency) by time       273 μs <

 Memory estimate: 81.06 KiB, allocs estimate: 2931.

to

julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  12.113 μs …  49.734 ms  ┊ GC (min … max):  0.00% … 60.39%
 Time  (median):     15.437 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   29.177 μs ± 762.966 μs  ┊ GC (mean ± σ):  25.19% ±  0.96%

          ▂▅▆█▂  ▂▄     ▄▅▁
  ▁▁▁▁▁▁▂▄█████▆▅███▅▅▃▅███▆▅▅▄▃▃▂▂▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  12.1 μs         Histogram: frequency by time         24.2 μs <

 Memory estimate: 14.09 KiB, allocs estimate: 499.

@FedericoStra
Contributor Author

Changes

  • I replaced the two-term recurrence with the simpler one-term recurrence !n = n * !(n-1) + (-1)^n.
  • I replaced all operations on BigInts with in-place functions from Base.GMP.MPZ to reduce allocations.
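
A minimal sketch of how these two changes can fit together is shown below. The name `subfactorial_inplace` is hypothetical and this is not necessarily the code merged in the PR. It also assumes that `Base.GMP.MPZ`, an internal and undocumented module of Julia, provides two-argument in-place methods `mul_ui!`, `add_ui!` and `sub_ui!` that overwrite their first argument; internal names may change between Julia versions.

```julia
# Hypothetical helper, not the code from this PR: evaluates
#   !n = n * !(n-1) + (-1)^n,  with !0 = 1,
# while mutating a single BigInt accumulator.
# Assumption: Base.GMP.MPZ is internal API and may differ across Julia versions.
function subfactorial_inplace(n::Integer)
    n < 0 && throw(DomainError(n, "n must be nonnegative"))
    d = BigInt(1)                            # !0
    for k in 1:n
        Base.GMP.MPZ.mul_ui!(d, k)           # d = k * d, in place
        if iseven(k)
            Base.GMP.MPZ.add_ui!(d, 1)       # (-1)^k = +1
        else
            Base.GMP.MPZ.sub_ui!(d, 1)       # (-1)^k = -1
        end
    end
    return d
end

subfactorial_inplace(5)   # 44
```

Because the accumulator `d` is reused on every step, the loop does not create a fresh `BigInt` per operation, which is the kind of saving the reduced allocation count reflects.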

Consequences
Performance improved by another factor of ~10x:

julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.427 μs … 137.132 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.480 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.538 μs ±   1.549 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃█▆▆▄▂                                                      ▂
  ██████▆▃▃▃▅██▇▆▆▅▃▄▄▄▁▃▄▄▃▁▄▃▃▁▄▄▁▁▄▁▁▃▁▁▃▄▁▃▁▁▁▁▃▃▁▁▃▁▁▁▃▅ █
  1.43 μs      Histogram: log(frequency) by time      3.05 μs <

 Memory estimate: 112 bytes, allocs estimate: 11.

Member

@inkydragon inkydragon left a comment

This needs a rebase onto master.
If CI passes, I'll merge this PR.

@inkydragon inkydragon added performance bignums BigInt and BigFloat labels May 1, 2025
@inkydragon inkydragon added this to the v1.0.3 milestone May 1, 2025
@FedericoStra
Contributor Author

There seems to be a small regression in codecov/project stemming from "indirect changes" in src/combinations.jl. I'm not familiar with Codecov, so I don't know exactly what this means or how this PR affects code coverage in src/combinations.jl.

FedericoStra and others added 3 commits May 1, 2025 16:46
…plementation

Use the recursive formula
  !n = (n-1) * (!(n-1) + !(n-2))
presented here: https://en.wikipedia.org/wiki/Derangement#Counting_derangements
…rsive formula and inplace computations

Use the simpler formula

!n = n * !(n-1) + (-1)^n

and use inplace operations on `BigInt`s to avoid allocations.
@FedericoStra
Contributor Author

> There seems to be a small regression in codecov/project stemming from "indirect changes" in src/combinations.jl. I'm not familiar with Codecov, so I don't know exactly what this means or how this PR affects code coverage in src/combinations.jl.

Ok, rebasing resolved the issue.

@inkydragon inkydragon merged commit e3278db into JuliaMath:master May 2, 2025
12 checks passed
Labels
bignums BigInt and BigFloat performance
2 participants