speed up `String::push` and `String::insert` #124810
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @scottmcm (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
I had a variety of thoughts; let me know what you think.
Also, is there anything here for which it would make sense to have a codegen test to confirm what's happening? Or some other test to help confirm it's better?
A codegen check for the absence of |
There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged. You can start a rebase with the following commands:
The following commits are merge commits: |
9511918 to 89fa55e
89fa55e to 2cb20b3
The proposed implementation uses |
9462d92 to a2effde
@rustbot review |
49bd467 to b7f3a92
b7f3a92 to 1297cd0
1297cd0 to 9cf92e5
@scottmcm are you able to review this? I could probably take a look if not.
I'll at least get this going. @bors try @rust-timer queue
…ng-insert, r=<try> speed up `String::push` and `String::insert` Addresses the concerns described in rust-lang#116235. The performance gain comes mainly from avoiding temporary buffers. Complex pattern matching in `encode_utf8` (introduced in rust-lang#67569) has been simplified to a comparison and an exhaustive `match` in the `encode_utf8_raw_unchecked` helper function. It takes a slice of `MaybeUninit<u8>` because otherwise we'd have to construct a normal slice to uninitialized data, which is not desirable, I guess. Several functions still have that [unneeded zeroing](https://rust.godbolt.org/z/5oKfMPo7j), but a single instruction is not that important, I guess. `@rustbot` label T-libs C-optimization A-str
☀️ Try build successful - checks-actions |
Finished benchmarking commit (26f0bba): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)
Results (primary 3.2%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (primary 2.4%, secondary -2.2%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
Results (primary -0.1%, secondary -0.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 774.477s -> 773.266s (-0.16%)
Sorry this has been sitting for so long, I have one last question then I think we can merge this. Mind posting the results of library/alloctests/benches/string.rs if you have run those?
#[doc(hidden)]
#[inline]
#[cfg_attr(bootstrap, rustc_allow_const_fn_unstable(const_mut_refs))]
pub const unsafe fn encode_utf8_raw_unchecked(code: u32, dst: *mut u8) {
It's been long enough that I'm forgetting context here, but why was this changed away from `MaybeUninit`? Specifically thinking of a signature like
pub const unsafe fn encode_utf8_raw_unchecked(
code: u32, dst: &mut [MaybeUninit<u8>]
) -> &mut [u8] {
// Write the characters then call MaybeUninit::assume_init_ref
}
Then lengths get checked and `push` becomes slightly simpler with `core::char::encode_utf8_raw_unchecked(ch as u32, self.vec.spare_capacity_mut())` (maybe needs an `assert_unchecked(self.buf.capacity() - self.len > len)` if LLVM doesn't pick up on that).
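The suggestion above can be sketched outside the standard library roughly as follows. This is an illustration under stated assumptions, not the reviewer's or the PR's code: `encode_utf8_uninit` and `push_uninit` are hypothetical names, the function takes a `char` and is safe (it bounds-checks the slice instead of being `unsafe`), and it works on a `Vec<u8>` because `String`'s internals are private.

```rust
use std::mem::MaybeUninit;

/// Hypothetical safe variant of the helper discussed in the review.
/// Encodes `ch` into the front of `dst` and returns the initialized prefix.
/// Panics if `dst` is shorter than `ch.len_utf8()`.
fn encode_utf8_uninit(ch: char, dst: &mut [MaybeUninit<u8>]) -> &mut [u8] {
    let code = ch as u32;
    let len = ch.len_utf8();
    // The slice index is the only length check.
    let out = &mut dst[..len];
    match &mut *out {
        [a] => {
            a.write(code as u8);
        }
        [a, b] => {
            a.write(((code >> 6) & 0x1F) as u8 | 0xC0);
            b.write((code & 0x3F) as u8 | 0x80);
        }
        [a, b, c] => {
            a.write(((code >> 12) & 0x0F) as u8 | 0xE0);
            b.write(((code >> 6) & 0x3F) as u8 | 0x80);
            c.write((code & 0x3F) as u8 | 0x80);
        }
        [a, b, c, d] => {
            a.write(((code >> 18) & 0x07) as u8 | 0xF0);
            b.write(((code >> 12) & 0x3F) as u8 | 0x80);
            c.write(((code >> 6) & 0x3F) as u8 | 0x80);
            d.write((code & 0x3F) as u8 | 0x80);
        }
        // `len_utf8` only returns 1..=4, so `out` has 1..=4 elements.
        _ => unreachable!(),
    }
    // SAFETY: every element of `out` was written in the match above, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    unsafe { &mut *(out as *mut [MaybeUninit<u8>] as *mut [u8]) }
}

/// Hypothetical `push` built on top of `Vec::spare_capacity_mut`.
fn push_uninit(buf: &mut Vec<u8>, ch: char) {
    buf.reserve(ch.len_utf8());
    let written = encode_utf8_uninit(ch, buf.spare_capacity_mut()).len();
    // SAFETY: `written` bytes directly after the old length were initialized.
    unsafe { buf.set_len(buf.len() + written) };
}
```

The trade-off in this form is that the slice carries its own length, so the unchecked write becomes a checked one; whether the optimizer removes that check after `reserve` is exactly the open question in the comment.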
Addresses the concerns described in #116235.
The performance gain comes mainly from avoiding temporary buffers.
Complex pattern matching in `encode_utf8` (introduced in #67569) has been simplified to a comparison and an exhaustive `match` in the `encode_utf8_raw_unchecked` helper function. It takes a slice of `MaybeUninit<u8>` because otherwise we'd have to construct a normal slice to uninitialized data, which is not desirable, I guess.
Several functions still have that [unneeded zeroing](https://rust.godbolt.org/z/5oKfMPo7j), but a single instruction is not that important, I guess.
@rustbot label T-libs C-optimization A-str
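A minimal sketch of the idea in this description, not the code from the PR: `push_via_buffer` shows roughly the temporary-buffer shape, while `push_direct` reserves space and encodes straight into the string's spare capacity. The names `len_utf8_raw`, `encode_raw_sketch`, `push_via_buffer` and `push_direct` are hypothetical, and the sketch goes through the public `String::as_mut_vec` because the private fields used inside the standard library are not available to outside code.

```rust
/// Hypothetical helper: UTF-8 length of a code point, found by comparison.
fn len_utf8_raw(code: u32) -> usize {
    if code < 0x80 {
        1
    } else if code < 0x800 {
        2
    } else if code < 0x1_0000 {
        3
    } else {
        4
    }
}

/// Hypothetical stand-in for `encode_utf8_raw_unchecked`.
///
/// # Safety
/// `code` must be a valid Unicode scalar value and `dst` must be valid for
/// writes of `len_utf8_raw(code)` bytes.
unsafe fn encode_raw_sketch(code: u32, dst: *mut u8) {
    // SAFETY: the caller guarantees `dst` has room for every byte written.
    unsafe {
        match len_utf8_raw(code) {
            1 => dst.write(code as u8),
            2 => {
                dst.write(((code >> 6) & 0x1F) as u8 | 0xC0);
                dst.add(1).write((code & 0x3F) as u8 | 0x80);
            }
            3 => {
                dst.write(((code >> 12) & 0x0F) as u8 | 0xE0);
                dst.add(1).write(((code >> 6) & 0x3F) as u8 | 0x80);
                dst.add(2).write((code & 0x3F) as u8 | 0x80);
            }
            _ => {
                dst.write(((code >> 18) & 0x07) as u8 | 0xF0);
                dst.add(1).write(((code >> 12) & 0x3F) as u8 | 0x80);
                dst.add(2).write(((code >> 6) & 0x3F) as u8 | 0x80);
                dst.add(3).write((code & 0x3F) as u8 | 0x80);
            }
        }
    }
}

/// Temporary-buffer approach: encode into a stack array, then copy.
fn push_via_buffer(s: &mut String, ch: char) {
    let mut tmp = [0u8; 4];
    s.push_str(ch.encode_utf8(&mut tmp));
}

/// Direct approach: reserve, then write into the spare capacity in place.
fn push_direct(s: &mut String, ch: char) {
    let len = ch.len_utf8();
    s.reserve(len);
    // SAFETY: `reserve` guarantees at least `len` spare bytes, the bytes
    // written are the valid UTF-8 encoding of `ch`, and the length is only
    // extended over bytes that were just initialized.
    unsafe {
        let vec = s.as_mut_vec();
        let dst = vec.as_mut_ptr().add(vec.len());
        encode_raw_sketch(ch as u32, dst);
        vec.set_len(vec.len() + len);
    }
}
```

Both functions should produce identical strings for any `char`; the difference is only in whether the bytes pass through an intermediate array first.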