From c3350a0659839384e9dbb7d11685e237fa352a59 Mon Sep 17 00:00:00 2001
From: Jameson Nash
Date: Tue, 30 Sep 2025 09:52:49 -0400
Subject: [PATCH 1/2] remove Base.memhash global

This became unsound to use even though it was preserved "to avoid
breakage" in v1.13, since continuing to use it would give incorrect hash
results, which could result in corrupt dictionaries. Since #59691, these
broken `hash` methods can now simply be deleted as they no longer
provide any value.

It is hard to say whether this is technically breaking or not as a
change. It causes packages to go from giving subtly wrong answers (the
worst kind of wrong) to crashing in v1.13, until the offending incorrect
methods are deleted.
---
 NEWS.md         | 3 ++-
 base/hashing.jl | 4 ----
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index e20f62d146e48..051aeb220f384 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -17,7 +17,8 @@ Language changes
 * `mod(x::AbstractFloat, -Inf)` now returns `x` (as long as `x` is finite), this aligns with C standard and
   is considered a bug fix ([#47102])
 
-  - The `hash` algorithm and its values have changed. Most `hash` specializations will remain correct and require no action. Types that reimplement the core hashing logic independently, such as some third-party string packages do, may require a migration to the new algorithm. ([#57509])
+  - The `hash` algorithm and its values have changed for certain types, most notably AbstractString. Any `hash` specializations for equal types to those that changed, such as some third-party string packages, may need to be deleted. ([#57509], [#59691])
+  - The `hash(::AbstractString)` function is now a zero-copy / zero-cost function, based upon providing a correct implementation of the `codeunit` and `iterate` functions. Third-party string packages should migrate to the new algorithm by deleting their existing overrides of the `hash` function. ([#59691])
 
 * Indexless `getindex` and `setindex!` (i.e. `A[]`) on `ReinterpretArray` now correctly throw a `BoundsError` when there is more than one element. ([#58814])
 
diff --git a/base/hashing.jl b/base/hashing.jl
index 56c8c5c3b9e0f..2556c9265dbb4 100644
--- a/base/hashing.jl
+++ b/base/hashing.jl
@@ -629,7 +629,3 @@ hash(data::AbstractString, h::UInt) =
     hash_bytes(utf8units(data), UInt64(h), HASH_SECRET) % UInt
 @assume_effects :total hash(data::String, h::UInt) =
     GC.@preserve data hash_bytes(pointer(data), sizeof(data), UInt64(h), HASH_SECRET) % UInt
-
-# no longer used in Base, but a lot of packages access these internals
-const memhash = UInt === UInt64 ? :memhash_seed : :memhash32_seed
-const memhash_seed = UInt === UInt64 ? 0x71e729fd56419c81 : 0x56419c81

From 7033f5645d1c609e7d9a335730d1f6408b7692d8 Mon Sep 17 00:00:00 2001
From: Jameson Nash
Date: Fri, 3 Oct 2025 11:30:19 -0400
Subject: [PATCH 2/2] Update NEWS.md

---
 NEWS.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 051aeb220f384..d6ecbf2762e14 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -17,8 +17,9 @@ Language changes
 * `mod(x::AbstractFloat, -Inf)` now returns `x` (as long as `x` is finite), this aligns with C standard and
   is considered a bug fix ([#47102])
 
-  - The `hash` algorithm and its values have changed for certain types, most notably AbstractString. Any `hash` specializations for equal types to those that changed, such as some third-party string packages, may need to be deleted. ([#57509], [#59691])
-  - The `hash(::AbstractString)` function is now a zero-copy / zero-cost function, based upon providing a correct implementation of the `codeunit` and `iterate` functions. Third-party string packages should migrate to the new algorithm by deleting their existing overrides of the `hash` function. ([#59691])
+* The `hash` algorithm and its values have changed for certain types, most notably AbstractString. Any `hash` specializations for equal types to those that changed, such as some third-party string packages, may need to be deleted. ([#57509], [#59691])
+
+* The `hash(::AbstractString)` function is now a zero-copy / zero-cost function, based upon providing a correct implementation of the `codeunit` and `iterate` functions. Third-party string packages should migrate to the new algorithm by deleting their existing overrides of the `hash` function. ([#59691])
 
 * Indexless `getindex` and `setindex!` (i.e. `A[]`) on `ReinterpretArray` now correctly throw a `BoundsError` when there is more than one element. ([#58814])
 
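
Reviewer note: the migration described in the NEWS entries might look roughly like the sketch below for a third-party string package. `WrappedString` is a hypothetical type invented for illustration and is not taken from any real package. The sketch assumes the package previously defined its own `Base.hash` method (for example, one built on the removed `Base.memhash` internals) and now simply deletes it, relying on the generic `hash(::AbstractString, ::UInt)` method together with correct `codeunit` and `iterate` definitions.

```julia
# Hypothetical wrapper type, used only for illustration.
struct WrappedString <: AbstractString
    data::String
end

# Minimal AbstractString interface, forwarded to the wrapped String.
Base.ncodeunits(s::WrappedString) = ncodeunits(s.data)
Base.codeunit(s::WrappedString) = codeunit(s.data)
Base.codeunit(s::WrappedString, i::Integer) = codeunit(s.data, i)
Base.isvalid(s::WrappedString, i::Integer) = isvalid(s.data, i)
Base.iterate(s::WrappedString, i::Integer=firstindex(s)) = iterate(s.data, i)

# Deliberately no `Base.hash(::WrappedString, ::UInt)` method here:
# the generic `hash(::AbstractString, ::UInt)` fallback is used instead.

w = WrappedString("héllo")

# Equal strings must hash equally, or Dict lookups silently fail.
@assert w == "héllo"
@assert hash(w) == hash("héllo")

# A Dict keyed by String can be queried with the wrapper.
d = Dict("héllo" => 1)
@assert d[w] == 1
```

The point of the sketch is the absence of a custom `hash` method: as long as `WrappedString` compares equal to the corresponding `String`, the generic method is expected to keep the two hashes in agreement, which is the invariant that `Dict` depends on.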