Improve performance of `derangement`/`subfactorial` with iterative implementation #146

FedericoStra · 2023-12-10T19:05:47Z

This PR changes the implementation of `derangement`, and hence also `subfactorial`,
to evaluate iteratively the recurrence !n = (n-1) * (!(n-1) + !(n-2)) presented at https://en.wikipedia.org/wiki/Derangement#Counting_derangements.

For values such as `subfactorial(100)` I get a speed-up of roughly 10x.
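
For readers who want to see the shape of the loop, here is a minimal sketch of evaluating !n = (n-1) * (!(n-1) + !(n-2)) iteratively. The name `derangement_two_term` is hypothetical and this is not the code from the PR; it only assumes the base cases !0 = 1 and !1 = 0.

```julia
# Hypothetical helper, not the package's actual `derangement`.
function derangement_two_term(n::Integer)
    n < 0 && throw(DomainError(n, "n must be nonnegative"))
    n == 0 && return big(1)   # !0 = 1
    n == 1 && return big(0)   # !1 = 0
    prev, curr = big(1), big(0)   # holds (!(k-2), !(k-1)) at the start of each step
    for k in 2:n
        # !k = (k - 1) * (!(k-1) + !(k-2))
        prev, curr = curr, (k - 1) * (curr + prev)
    end
    return curr
end
```

Each iteration still allocates new `BigInt`s for the sum and the product, so this form reduces work compared with a non-iterative evaluation but does not remove allocations.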

codecov · 2023-12-10T19:07:29Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.93%. Comparing base (403dcb4) to head (e2b762b).
Report is 1 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
+ Coverage   96.89%   96.93%   +0.03%     
==========================================
  Files           8        8              
  Lines         805      815      +10     
==========================================
+ Hits          780      790      +10     
  Misses         25       25

FedericoStra · 2023-12-11T17:03:43Z

With this change, the benchmark goes from

julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  185.527 μs … 47.124 ms  ┊ GC (min … max):  0.00% … 60.54%
 Time  (median):     189.635 μs              ┊ GC (median):     0.00%
 Time  (mean ± σ):   267.191 μs ±  1.711 ms  ┊ GC (mean ± σ):  14.89% ±  2.31%

  ▆█▇▆▆▅▄▃▂▃▃▃▃▂▁▁▁▂▁▁▁                                        ▂
  █████████████████████▇▇▇▅▆▇▅▄▆▆█▇█▇▇▇▇▇▆▆▅▅▃▅▄▄▃▄▄▃▁▄▁▁▄▃▃▁▃ █
  186 μs        Histogram: log(frequency) by time       273 μs <

 Memory estimate: 81.06 KiB, allocs estimate: 2931.

to

julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  12.113 μs …  49.734 ms  ┊ GC (min … max):  0.00% … 60.39%
 Time  (median):     15.437 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   29.177 μs ± 762.966 μs  ┊ GC (mean ± σ):  25.19% ±  0.96%

          ▂▅▆█▂  ▂▄     ▄▅▁
  ▁▁▁▁▁▁▂▄█████▆▅███▅▅▃▅███▆▅▅▄▃▃▂▂▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  12.1 μs         Histogram: frequency by time         24.2 μs <

 Memory estimate: 14.09 KiB, allocs estimate: 499.

FedericoStra · 2023-12-13T17:50:59Z

Changes

I replaced the previous recursive formula with the simpler !n = n * !(n-1) + (-1)^n.
I replaced all operations on `BigInt`s with in-place functions from `Base.GMP.MPZ` to reduce allocations.
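
As an illustration of the two changes listed above, here is a rough sketch of the one-term recurrence with in-place updates. The name `derangement_inplace` is hypothetical and this is not necessarily the code in the PR. It assumes the internal, unexported functions `mul_ui!`, `add_ui!` and `sub_ui!` from `Base.GMP.MPZ`, which are not public API and may change between Julia versions.

```julia
# Hypothetical helper, not the package's actual `derangement`.
# Base.GMP.MPZ is internal to Julia; these calls wrap GMP's mpz_*_ui routines.
function derangement_inplace(n::Integer)
    n < 0 && throw(DomainError(n, "n must be nonnegative"))
    d = BigInt(1)   # !0 = 1; this single BigInt is mutated in the loop
    for k in 1:n
        # !k = k * !(k-1) + (-1)^k
        Base.GMP.MPZ.mul_ui!(d, d, k)       # d = d * k, stored back into d
        if iseven(k)
            Base.GMP.MPZ.add_ui!(d, d, 1)   # (-1)^k = +1
        else
            Base.GMP.MPZ.sub_ui!(d, d, 1)   # (-1)^k = -1
        end
    end
    return d
end
```

Because only one `BigInt` is created and then updated in place, the loop body itself should not allocate a fresh `BigInt` per iteration, which is the effect the second change is aiming for.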

Consequences
Performance improved by another factor of ~10x (median 15.437 μs down to 1.480 μs), and the allocation estimate dropped from 499 to 11:

julia> @benchmark subfactorial(100)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.427 μs … 137.132 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.480 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.538 μs ±   1.549 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃█▆▆▄▂                                                      ▂
  ██████▆▃▃▃▅██▇▆▆▅▃▄▄▄▁▃▄▄▃▁▄▃▃▁▄▄▁▁▄▁▁▃▁▁▃▄▁▃▁▁▁▁▃▃▁▁▃▁▁▁▃▅ █
  1.43 μs      Histogram: log(frequency) by time      3.05 μs <

 Memory estimate: 112 bytes, allocs estimate: 11.

test/factorials.jl

inkydragon

~~Need a rebase to master.~~
If CI looks good, I'll merge this PR.

FedericoStra · 2025-05-01T14:37:54Z

There seems to be a small regression in codecov/project stemming from "indirect changes" in src/combinations.jl. I'm not familiar with Codecov, so I don't know exactly what this means or how this PR affects code coverage in src/combinations.jl.

FedericoStra · 2025-05-01T14:54:18Z

> There seems to be a small regression in codecov/project stemming from "indirect changes" in src/combinations.jl. I'm not familiar with Codecov, so I don't know exactly what this means or how this PR affects code coverage in src/combinations.jl.

Ok, rebasing resolved the issue.

FedericoStra mentioned this pull request May 1, 2025

rel: new release v1.0.3 #177

Merged

inkydragon reviewed May 1, 2025

inkydragon approved these changes May 1, 2025

inkydragon added the performance and bignums (BigInt and BigFloat) labels May 1, 2025

inkydragon added this to the v1.0.3 milestone May 1, 2025

FedericoStra and others added 3 commits May 1, 2025 16:46

Improve performance of derangement/subfactorial with iterative implementation

69570b8

Use the recursive formula !n = (n-1) * (!(n-1) + !(n-2)) presented here: https://en.wikipedia.org/wiki/Derangement#Counting_derangements

Improve performance of derangement/subfactorial with simpler recursive formula and inplace computations

6df8379

Use the simpler formula !n = n * !(n-1) + (-1)^n and use inplace operations on `BigInt`s to avoid allocations.

Update test/factorials.jl

e2b762b

FedericoStra force-pushed the master branch from 5dd8190 to e2b762b Compare May 1, 2025 14:47

inkydragon merged commit e3278db into JuliaMath:master May 2, 2025
12 checks passed
