-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minimize allocations when unpacking TimeZone
s from cache (updated)
#451
Conversation
- Should not allocate for cached FixedTimeZone - Should allocate only once for cached VariableTimeZone
Also, I've checked this against julia 1.10-rc1, and still see 2 allocs for constructing a |
I tried an alternative refactor, where I had a separate cache for all the classes (taking us to a total of 3) -- see on this branch. Whilst this reduces the number of allocations (down to 1 again for the VariableTimeZone), overall the timings are significantly worse, presumably due to the additional hash-table lookup. Microbenchmarks on my machine with Julia 1.10-rc1 for @btime TimeZone("UTC") # FTZ case
@btime TimeZone("America/Winnipeg") # VTZ case this PR
separate class cache
current master
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I want to sit with this for a little bit
src/types/timezone.jl
Outdated
return get(_FTZ_CACHE, name) do | ||
get(_VTZ_CACHE, name) do | ||
return f_default() | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be a possible faster alternative we where lookup the name to get the Class
which then lets us lookup the TimeZones
from its type specific Dict
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds a little like what I mentioned here: #451 (comment)
tidies up the allocations a bit, but the extra hash table lookup slowed things down in my timings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(just amended that comment to link to branch with the class-cache.)
Co-authored-by: Curtis Vogt <[email protected]>
Thanks - let me know if there’s anything else from my end that would be helpful. |
PkgBenchmark``` > TZDATA_VERSION=2016j julia --project=benchmark/ -e 'using PkgBenchmark, TimeZones; export_markdown(stdout, judge(TimeZones, "origin/HEAD", verbose=false))' PkgBenchmark: Running benchmarks... PkgBenchmark: using benchmark tuning data in /home/tom/.julia/dev/TimeZones/benchmark/tune.json Benchmarking 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:45 PkgBenchmark: Running benchmarks... PkgBenchmark: using benchmark tuning data in /home/tom/.julia/dev/TimeZones/benchmark/tune.json Benchmarking 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:47 ```Benchmark Report for /home/tom/.julia/dev/TimeZonesJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
For good measure I did a second run, and stuff moved around quite a bit. So I'm not reading too much into any of these numbers:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @tpgillam! LGTM
I'd love us one day to get the VTZ case down to 1 or ideally 0 allocs (i was never sure what the last 1 alloc was tbh, but i think the alloc profiler now has info for all types whereas it didn't when i last tried). Anyway, this is already a big win for the most common case, and much appreciated!
(I'll leave it to @omus to decide when to merge and release, though i ofc hope it's soon!)
Thanks both for the discussions so far! @omus are you happy to move forward with this, or are there are other thoughts/concerns we should discuss? |
Bump! @omus please could you merge this and make a new release? |
Baseline timings for my machine. I'll need to resolve the conflict yet. Julia 1.10.0 Current julia> @btime TimeZone("America/Winnipeg");
41.246 ns (2 allocations: 64 bytes)
julia> @btime TimeZone("UTC");
40.868 ns (2 allocations: 64 bytes) This PR (2b0b76e) julia> @btime TimeZone("UTC");
17.994 ns (0 allocations: 0 bytes)
julia> @btime TimeZone("America/Winnipeg");
28.475 ns (2 allocations: 96 bytes) |
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #451 +/- ##
==========================================
- Coverage 92.79% 92.73% -0.07%
==========================================
Files 39 39
Lines 1818 1844 +26
==========================================
+ Hits 1687 1710 +23
- Misses 131 134 +3 ☔ View full report in Codecov by Sentry. |
This PR 511d2cc julia> @time_imports using TimeZones
0.4 ms Scratch
10.2 ms Preferences
0.4 ms PrecompileTools
┌ 0.0 ms Parsers.__init__()
45.3 ms Parsers 40.31% compilation time
4.2 ms InlineStrings
0.4 ms TZJData
0.5 ms Compat
0.2 ms Compat → CompatLinearAlgebraExt
0.4 ms ExprTools
0.6 ms Mocking
┌ 1.1 ms TimeZones.TZData.__init__()
├ 0.0 ms TimeZones.__init__()
22.6 ms TimeZones
julia> using BenchmarkTools
julia> @btime TimeZone("UTC");
18.203 ns (0 allocations: 0 bytes)
julia> @btime TimeZone("America/Winnipeg");
29.104 ns (2 allocations: 96 bytes)
julia> @btime istimezone("Europe/Warsaw");
76.418 ns (1 allocation: 48 bytes) I slight increase in allocated bytes with |
The cache code is definitely becoming unwieldy. I've created a This PR 67f2c3f julia> @time_imports using TimeZones
0.4 ms Scratch
9.1 ms Preferences
0.4 ms PrecompileTools
┌ 0.0 ms Parsers.__init__()
46.7 ms Parsers 38.23% compilation time
4.1 ms InlineStrings
0.3 ms TZJData
0.4 ms Compat
0.2 ms Compat → CompatLinearAlgebraExt
0.4 ms ExprTools
0.6 ms Mocking
┌ 1.0 ms TimeZones.TZData.__init__()
├ 0.0 ms TimeZones.__init__()
22.9 ms TimeZones
julia> using BenchmarkTools
julia> @btime TimeZone("UTC");
18.597 ns (0 allocations: 0 bytes)
julia> @btime TimeZone("America/Winnipeg");
28.894 ns (2 allocations: 96 bytes)
julia> @btime istimezone("Europe/Warsaw");
74.802 ns (1 allocation: 48 bytes) |
These changes are also a step in the right direction for reducing the initial cache loading time This PR 67f2c3f julia> @btime TimeZones._reload_tz_cache(TimeZones._COMPILED_DIR[]);
20.795 ms (313574 allocations: 12.52 MiB) Current julia> @btime TimeZones._reload_tz_cache(TimeZones._COMPILED_DIR[]);
22.808 ms (313621 allocations: 12.52 MiB) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tpgillam and @nickrobinson251 for this!
I'm planning on rolling some other changes in before making a release here. I expect TimeZones.jl 1.17 should be out by 2024-05-27 (Monday). |
Closes #422
This is a rebase/reimplementation of #423 onto
master
.(FYI @nickrobinson251, I'm doing this after seeing your comment!)
I started by trying to rebase your PR properly, but in the end it was easier to cherry-pick your commits that still applied (just the tests), then re-implement the rest. I'm sorry that this results in this new PR, but wasn't sure how else to proceed.
One note -- I find that it now takes two allocations to retrieve a
VariableTimeZone
from the cache, rather than 1. I spent some time with the alllocation tracker trying to understand this with no luck, so I've updated the test expectation accordingly. This isn't any worse than the current behaviour onmaster
, and in any case the main thing is that retrieving aFixedTimeZone
is now allocation free :)