perf: pre-initialized Blake2b cloning for hash hot paths #3089
Draft
wbbradley wants to merge 1 commit into wbbradley/further-investigation from
Description
Every `leaf_hash` and `inner_hash` call creates a fresh Blake2b256 hasher through `new_with_params`, which runs IV setup and parameter validation. With ~3M hash calls per 32 MiB encode (1M leaf + ~2M inner), this accounts for 1.1% of self-time in profiling.

This change bypasses `HashFunctionWrapper` in hot paths by working with `blake2::Blake2b<U32>` directly. Two static `LazyLock` hashers are pre-initialized with the leaf/inner prefix already fed in. Each hash call clones the pre-initialized state (~200 byte memcpy) instead of running full construction.

Adds `leaf_hash_blake2b256` and `inner_hash_blake2b256` as specialized fast-path functions, and `MerkleTree::build_from_leaf_hashes_fast` for Merkle tree construction. The generic `leaf_hash<T>` and `inner_hash<T>` remain unchanged for non-hot-path usage (proofs, generic tree builds).

Test plan
`profile_encoding`; results are within noise of the previous commit, consistent with the small (~1%) targeted overhead.

Release notes
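For readers unfamiliar with the pattern from the Description, here is a minimal sketch of cloning a pre-initialized hasher. It is not the code in this PR: to stay std-only it uses `DefaultHasher` as a stand-in for `blake2::Blake2b<U32>`, and the prefix values and the names `LEAF_PREFIX`, `INNER_PREFIX`, `fresh_hasher`, `leaf_hash_fast`, `leaf_hash_slow`, `inner_hash_fast`, and `inner_hash_slow` are hypothetical. It assumes Rust 1.80 or later for `std::sync::LazyLock`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::sync::LazyLock;

// Hypothetical domain-separation prefixes; the PR does not state the real values.
const LEAF_PREFIX: [u8; 1] = [0x00];
const INNER_PREFIX: [u8; 1] = [0x01];

// Stand-in for full hasher construction. DefaultHasher is used only because
// it lives in std and implements Clone; the PR itself targets Blake2b<U32>.
fn fresh_hasher(prefix: &[u8]) -> DefaultHasher {
    let mut h = DefaultHasher::new();
    h.write(prefix);
    h
}

// Built once on first use, with the prefix already fed in.
static LEAF_HASHER: LazyLock<DefaultHasher> = LazyLock::new(|| fresh_hasher(&LEAF_PREFIX));
static INNER_HASHER: LazyLock<DefaultHasher> = LazyLock::new(|| fresh_hasher(&INNER_PREFIX));

// Fast path: copy the pre-initialized state, then hash only the payload.
pub fn leaf_hash_fast(data: &[u8]) -> u64 {
    let mut h = (*LEAF_HASHER).clone();
    h.write(data);
    h.finish()
}

pub fn inner_hash_fast(left: u64, right: u64) -> u64 {
    let mut h = (*INNER_HASHER).clone();
    h.write(&left.to_le_bytes());
    h.write(&right.to_le_bytes());
    h.finish()
}

// Slow path for comparison: construct and prefix a new hasher on every call.
pub fn leaf_hash_slow(data: &[u8]) -> u64 {
    let mut h = fresh_hasher(&LEAF_PREFIX);
    h.write(data);
    h.finish()
}

pub fn inner_hash_slow(left: u64, right: u64) -> u64 {
    let mut h = fresh_hasher(&INNER_PREFIX);
    h.write(&left.to_le_bytes());
    h.write(&right.to_le_bytes());
    h.finish()
}
```

Because the clone starts from exactly the state a freshly constructed and prefixed hasher would have, both paths should return the same digest for the same input. The static itself is never mutated, so repeated calls stay independent of each other.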