perf: optimize sample_floyd by unsafe APIs #1622
2bdea23 to 07d4e92
Thanks for the PR. My main concern here is simply: should we be adding more CC @RalfJung
Thanks for considering!
There are two unsafe operations here; I'd like to see the perf impact of each.
Not sure what exactly you want my input on here. :) Happy to consult on whether some use of unsafe is sound or not, but that doesn't seem to be the question here? As to whether you think the bit of unsafe is worth the perf gain -- that's a maintainer decision. There's absolutely cases where the perf gain is important enough to justify a bit of unsafe and there are other cases where it's not worth it. I don't have to maintain this code going forward so I can't make this decision for you. :)
Of course, testing != verification, so there could still be UB in edge cases not covered by the tests.
Thank you for your review! Here are the benchmark results of using the unsafe functions. Only use
Additionally use
From my perspective, the elimination of bounds checking in |
Sorry for the delay; I finally got around to running benches on my 5800X desktop. This is 07d4e92 vs d468501. Full results
On average, that's +1% (range -11% to +71%). Yes, there are caveats to this type of benchmarking: variance (I repeated one test a few times and had less than 1% change so probably okay), relevance (and weighting), but on the available evidence I don't see any significant benefit to this change.
CHANGELOG.md entry
Summary
This PR uses unsafe APIs to boost the performance of sample_floyd. The optimization is safe because the index is bounded by the length of the vec.
Motivation
Rust's bounds checks are sometimes unnecessary. Removing them with unsafe APIs can improve performance. This optimization makes the related functions faster while keeping them safe.
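To make the idea concrete, here is a minimal sketch of the kind of change being described. It is not the diff from this PR: it follows the commonly known form of Floyd's sampling algorithm and only shows how one bounds-checked write could be swapped for `get_unchecked_mut`. The names `XorShift64`, `below_or_eq`, `sample_floyd_sketch`, and the `unchecked` flag are all hypothetical and exist only so the example is self-contained.

```rust
/// Tiny xorshift64 generator (hypothetical stand-in for a real RNG, so the
/// sketch needs no external crate). The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Returns a value in 0..=upper. Modulo bias is ignored for brevity.
    fn below_or_eq(&mut self, upper: u32) -> u32 {
        (self.next_u64() % (upper as u64 + 1)) as u32
    }
}

/// Floyd's algorithm: pick `amount` distinct values from 0..length.
/// With `unchecked == true`, the in-place overwrite skips the bounds check.
fn sample_floyd_sketch(
    rng: &mut XorShift64,
    length: u32,
    amount: u32,
    unchecked: bool,
) -> Vec<u32> {
    assert!(amount <= length);
    let mut indices: Vec<u32> = Vec::with_capacity(amount as usize);
    for j in (length - amount)..length {
        let t = rng.below_or_eq(j);
        if let Some(pos) = indices.iter().position(|&x| x == t) {
            if unchecked {
                // SAFETY: `pos` was returned by `position` over `indices`,
                // so `pos < indices.len()` holds here.
                unsafe { *indices.get_unchecked_mut(pos) = j };
            } else {
                // Safe version: the compiler may or may not elide this check.
                indices[pos] = j;
            }
        }
        indices.push(t);
    }
    indices
}
```

Calling `sample_floyd_sketch(&mut XorShift64(1), 100, 10, true)` and the same call with `false` (and a fresh generator with the same seed) should return identical vectors; only the bounds check differs. Whether that difference is measurable is exactly what the benchmarks in this thread are trying to establish.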
Details
The benchmark results from my environment are listed below.