Skip to content

Conversation

@Jasper-Bekkers
Copy link
Contributor

@Jasper-Bekkers Jasper-Bekkers commented Jun 3, 2025

Fixes #103
Sits on top of #114

Notice that non-determinism was already partially addressed by #104 which focused primarily on generating deterministic build outputs & reproducible binaries.

This PR goes one step further and also eliminates sources of non-determinism that happen during runtime. Our application entirely disallows HashMap and HashSet because it uses the system time to seed the random number generator that it uses for hashing and for us this often leads to non-reproducibility problems at runtime and are a frequent source of issues.

We ran into this situation because the i18n! macro expands to use a HashMap.

This PR makes it so that we change every HashMap into a DeterministicHashMap which uses siphash as it's hashing function instead (I chose siphash so that I didn't have to introduce another dependency to this crate as it's already in use).

Notice that this PR doesn't undo the BTreeMap conversion from #104 since I had a hard time generating a deterministic build output on my Windows machine; but with the new DeterministicHashMap one should be able to convert it into that an potentially regain any (probably minor) performance regressions introduced with the BTreeMap switch.

@huacnlee huacnlee requested review from Copilot and sunli829 July 15, 2025 01:42
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR ensures runtime determinism by replacing all uses of HashMap/HashSet with a DeterministicHashMap (using siphash) and updating the Backend API to return Cow<'_, str> for translations.

  • Introduce and re-export DeterministicHashMap in the support crate.
  • Change the Backend trait and related implementations (BackendExt, SimpleBackend) to return Cow<'_, str>.
  • Update all map initializations in tests, examples, macros, benches, and CLI to use DeterministicHashMap.

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/multi_threading.rs Adjusted locale argument to borrow a &str explicitly
tests/integration_tests.rs Swapped HashMap for DeterministicHashMap and updated return types to Cow<'_, str>
src/lib.rs Re-exported DeterministicHashMap in the public API
examples/share-in-workspace/crates/i18n/src/lib.rs Updated Backend impl to use Cow<'_, str>
examples/app-egui/src/main.rs Changed .selectable_value() to clone the locale string
crates/support/src/lib.rs Added mod deterministic; and re-exported DeterministicHashMap
crates/support/src/deterministic.rs Defined DeterministicHashMap type alias using siphasher
crates/support/src/backend.rs Modified Backend trait and SimpleBackend to use Cow<'_, str>
crates/macro/src/lib.rs Adjusted macro code to use DeterministicHashMap and fully-qualified Cow
crates/extract/src/generator.rs Updated Translations alias and map initialization to deterministic maps
crates/extract/src/extractor.rs Swapped HashMap for DeterministicHashMap in extraction results
crates/cli/src/main.rs Replaced HashMap with DeterministicHashMap in CLI logic
clippy.toml Disallowed HashMap/HashSet to enforce deterministic maps
benches/bench.rs Defined a deterministic DICT and benchmarked t!
README.md Updated examples to use DeterministicHashMap and Cow<'_, str>
Comments suppressed due to low confidence (5)

crates/support/src/deterministic.rs:10

  • The doc comment reads “is the asset build pipeline” but should read “in the asset build pipeline” for clarity.
/// Helper type to replace [`std::collections::hash_map::RandomState`] is the asset

crates/support/src/backend.rs:17

  • Changing the Backend trait’s return type from Vec<&str> to Vec<Cow<'_, str>> is a breaking API change; please document this in the CHANGELOG and consider a major version bump.
    {

benches/bench.rs:1

  • [nitpick] The DeterministicHashMap import is not used in this file—consider removing it or using it in the benchmark to keep imports tidy.
use rust_i18n::{t, DeterministicHashMap};

benches/bench.rs:8

  • [nitpick] The static DICT is defined but never used in the benchmark—consider removing it or adding a test that exercises it.
pub static ref DICT: DeterministicHashMap<&'static str, &'static str> =

tests/multi_threading.rs:62

  • [nitpick] The extra borrow on locales[m] produces a &&str; you can drop the & and pass locales[m] directly for clearer code.
                            t!("hello", locale = &locales[m]);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

non-deterministic builds

1 participant