Polymorphize `array::IntoIter`'s iterator impl #139430

base: master

Conversation
rustbot has assigned @workingjubilee.
// const-fold for certain widths. The `test_eight` case below shows that, yes,
// what we're emitting *can* be const-folded, except that the way LLVM does it
// for certain widths doesn't fire today. We should be able to put this back to
// the same check after <https://github.com/llvm/llvm-project/issues/134513>
Making the mention explicit: llvm/llvm-project#134513
Why match the undesired output though? Just to raise a signal if that changes?
Mostly just to raise the signal, yeah, which is why I'm not trying to match it at all specifically.
But also, if we do change the library code in a way that doesn't optimize at all, that'd still be good to catch. It's pretty unlikely, TBH, if we're also testing the -v1 version in the test.
I'd be fine to remove these checks too, though, if that's preferred. (Or just drop it to the `masked.load` as a way to confirm it's still the same issue, or similar.)
Ok, I cut it back to just the `masked.load` and brought over the v1 case from #139503 -- please check that it still matches the intent you had.
My `ignore-x86_64` is also intended to exclude the case of "actually, the default cpu is x86-64-v3 and it fails." So that gives explicit testing of the x86_64 versions, and the default for everything else.
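For readers unfamiliar with the harness, the setup being described might be spelled roughly like this in compiletest directives. The revision names here are illustrative assumptions, not necessarily the test's actual ones:

```rust
//@ revisions: v1 v3 other
//@[v1] only-x86_64
//@[v1] compile-flags: -Ctarget-cpu=x86-64
//@[v3] only-x86_64
//@[v3] compile-flags: -Ctarget-cpu=x86-64-v3
//@[other] ignore-x86_64
```

The x86_64 behavior is checked at explicitly pinned CPU levels, while the `other` revision covers the remaining targets at their default CPU, so a host whose default CPU is already x86-64-v3 can't skew the results.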
Jubilee's already on three other PRs of mine, so let's spread the wealth
I like the approach -- shall we do a perf run though?
Sure, let's see what happens
…, r=<try> Polymorphize `array::IntoIter`'s iterator impl

Today we emit all the iterator methods for every different array width. That's wasteful since the actual array length never even comes into it -- the indices used are from the separate `alive: IndexRange` field, not even the `N` const param.

This PR switches things so that an `array::IntoIter<T, N>` stores a `PolymorphicIter<[MaybeUninit<T>; N]>`, which we *unsize* to `PolymorphicIter<[MaybeUninit<T>]>` and call methods on that non-`Sized` type for all the iterator methods. That also necessarily makes the layout consistent between the different lengths of arrays, because of the unsizing. Compare that to today <https://rust.godbolt.org/z/Prb4xMPrb>, where different widths can't even be deduped because the offset to the indices is different for different array widths.
☀️ Try build successful - checks-actions
Finished benchmarking commit (23f2683): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next steps: if you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)

Results (primary 3.9%, secondary -5.6%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles

Results (secondary -0.0%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size

Results (primary 0.1%, secondary -0.0%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Bootstrap: 780.796s -> 777.474s (-0.43%)
Today we emit all the iterator methods for every different array width. That's wasteful since the actual array length never even comes into it -- the indices used are from the separate `alive: IndexRange` field, not even the `N` const param.

This PR switches things so that an `array::IntoIter<T, N>` stores a `PolymorphicIter<[MaybeUninit<T>; N]>`, which we unsize to `PolymorphicIter<[MaybeUninit<T>]>` and call methods on that non-`Sized` type for all the iterator methods.

That also necessarily makes the layout consistent between the different lengths of arrays, because of the unsizing. Compare that to today <https://rust.godbolt.org/z/Prb4xMPrb>, where different widths can't even be deduped because the offset to the indices is different for different array widths.