-
Notifications
You must be signed in to change notification settings - Fork 321
into_group_map: make code more idiomatic #1027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
into_group_map: make code more idiomatic #1027
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1027 +/- ##
==========================================
+ Coverage 94.38% 94.46% +0.07%
==========================================
Files 48 50 +2
Lines 6665 6812 +147
==========================================
+ Hits 6291 6435 +144
- Misses 374 377 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Hi there, thanks for this. I'd like to take the But I want to double check |
Is there anything you expect of me for this to be merged? In case you do not want to use fold after deciding on the mentioned issue, i can quickly roll the change to fold back so this can be merged. |
What i'm asking is, would you like me to benchmark this for some different cases? |
I'd say keep the It's quite hard to mircobenchmark these sorts of things, since they often depend on how much gas the optimizer has in particular contexts. I don't think it's worth the effort. I'd rather just use the thing that probably optimizes well and isn't really any harder to read. |
cad6e3d
to
a7b2f9c
Compare
Yes. So, I dove a bit deeper into for_each vs fold. I wrote an example in compiler explorer It seems like the for_each keeps more of the iteration state hot in registers while the fold spills much more into the stack and reloads half a dozen registers every trip through the loop, so it executes ~35% more instructions per loop, just considering the iteration skeleton (rough estimate, i didn't pull out a calculator, and this also doesn't factor in cpu internal instruction cycle times) and performs a lot more memory traffic. That's just about how much I could get from this. I'm not an assembly god. Though it seems like just running with the for_each but using default is the go-to until maybe compiler optimizations some day say otherwise. Of course this also doesn't include different use cases with different concrete types. This is just a naive, simple example. The inner part of the loop is largely the same. |
a7b2f9c
to
51e9f4a
Compare
This shouldn't change any runtime. I just noticed this while using the function, as i replaced exactly this (new) code, with the itertools version and wanted to contribute it back.