@DracoLi DracoLi commented Nov 10, 2025

Why this should be merged

Moves most EVM block data out of the general-purpose KV store into dedicated height-indexed block databases optimized for block storage, reducing KV usage and compaction impact.

  • Headers, bodies, and receipts are stored in heightindexdb; all other data remains in the underlying KV store.
  • Observed ~20% reduction in KV (LevelDB/PebbleDB) size and reduced total compaction time.

How this works

The blockdb.Database wraps ethdb.Database and routes block data by key prefix, while leaving non-block data on the KV store.

  • Header/body/receipt keys for blocks with height ≥ the configured minimum are written to height-indexed DBs; below that threshold they use the KV store.
  • A single block per height is stored. Writing at an occupied height overwrites the previous value. Deletes of block data are no-ops.
  • If a block isn’t yet in the height-indexed DBs and migration isn’t complete, reads fall back to the KV store.
  • Batching: Block keys bypass the KV batch and write directly to the height-indexed DBs; non-block keys are batched on the underlying KV store.
  • Deferred init: With allowDeferredInit, initialization can wait until the minimum block height is known (e.g., via state sync). That minimum is then persisted and cannot be changed without recreating the block databases.
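
The routing rules above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `router` type, its in-memory maps, and the `migrated` flag are hypothetical stand-ins, and the key layout (a one-byte `h`/`b`/`r` prefix, an 8-byte big-endian block number, then a 32-byte hash) is assumed from go-ethereum's rawdb schema. Hash checking on reads is omitted for brevity.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Assumed key layout: prefix (1 byte) + height (8 bytes, big endian) + hash (32 bytes).
const blockKeyLen = 1 + 8 + 32

// blockKey builds a header ('h'), body ('b'), or receipts ('r') key.
func blockKey(prefix byte, height uint64, hash [32]byte) []byte {
	key := make([]byte, blockKeyLen)
	key[0] = prefix
	binary.BigEndian.PutUint64(key[1:9], height)
	copy(key[9:], hash[:])
	return key
}

// parseBlockKey reports the prefix and height if key looks like block data.
func parseBlockKey(key []byte) (prefix byte, height uint64, ok bool) {
	if len(key) != blockKeyLen {
		return 0, 0, false
	}
	switch key[0] {
	case 'h', 'b', 'r':
		return key[0], binary.BigEndian.Uint64(key[1:9]), true
	}
	return 0, 0, false
}

// router is a hypothetical stand-in for the wrapping database.
type router struct {
	minHeight uint64
	migrated  bool                       // true once background migration is complete
	kv        map[string][]byte          // stand-in for the underlying KV store
	byHeight  map[byte]map[uint64][]byte // one height index per key prefix
}

func newRouter(minHeight uint64) *router {
	return &router{
		minHeight: minHeight,
		kv:        map[string][]byte{},
		byHeight:  map[byte]map[uint64][]byte{'h': {}, 'b': {}, 'r': {}},
	}
}

// Put sends block keys at or above minHeight to the height index
// (overwriting any previous value) and everything else to the KV store.
func (r *router) Put(key, value []byte) {
	if p, h, ok := parseBlockKey(key); ok && h >= r.minHeight {
		r.byHeight[p][h] = value
		return
	}
	r.kv[string(key)] = value
}

// Get reads routed block keys from the height index and falls back to the
// KV store only while migration is incomplete.
func (r *router) Get(key []byte) ([]byte, bool) {
	p, h, ok := parseBlockKey(key)
	if ok && h >= r.minHeight {
		if v, found := r.byHeight[p][h]; found {
			return v, true
		}
		if r.migrated {
			return nil, false
		}
	}
	v, found := r.kv[string(key)]
	return v, found
}

// Delete is a no-op for routed block keys.
func (r *router) Delete(key []byte) {
	if _, h, ok := parseBlockKey(key); ok && h >= r.minHeight {
		return
	}
	delete(r.kv, string(key))
}

func main() {
	r := newRouter(100)
	var hash [32]byte
	r.Put(blockKey('h', 150, hash), []byte("header-150")) // height index
	r.Put(blockKey('h', 50, hash), []byte("header-50"))   // below minimum: KV
	r.Put([]byte("config"), []byte("x"))                  // non-block key: KV
	fmt.Println(len(r.byHeight['h']), len(r.kv))          // 1 2
}
```

In this sketch, batching would only need to buffer the writes that land in `kv`; routed block keys are written through immediately, as the Batching bullet describes.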

Migrations

Canonical block data is migrated from the KV store to the height-indexed databases in the background.

  • Migrates headers, bodies, and receipts for canonical blocks; genesis is skipped.
  • After migration, corresponding KV entries are deleted in batches; header/body/receipt ranges are periodically compacted.
  • Periodic logs report status and ETA.
  • Migration can be paused and resumed safely; restarts resume migration automatically.
  • Migrating ~71M blocks on a mainnet node took ~5 hours.
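
A rough sketch of a resumable migration loop along these lines is shown below. The `migrator` type, its fields, and the `eta` helper are hypothetical and use in-memory maps in place of real databases. The sketch advances the progress marker only after a batch has been copied and its KV entries deleted, so a restart may re-copy some blocks but will not skip any; whether the PR orders these steps the same way is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// blockData is a hypothetical bundle of one block's header, body, and receipts.
type blockData struct{ header, body, receipts string }

type migrator struct {
	kv        map[uint64]blockData // canonical blocks still in the KV store
	heightDB  map[uint64]blockData // stand-in for the height-indexed databases
	next      uint64               // progress marker; a real node would persist this
	end       uint64               // exclusive upper bound of heights to migrate
	batchSize int                  // number of KV entries deleted per batch
}

// migrate processes up to maxBlocks heights starting at m.next and reports
// whether migration is complete. Calling it again resumes where it stopped.
func (m *migrator) migrate(maxBlocks int) bool {
	var pending []uint64
	cursor := m.next
	flush := func() {
		for _, h := range pending {
			delete(m.kv, h) // batched delete of migrated entries
		}
		pending = pending[:0]
		m.next = cursor // advance progress only after the batch is cleaned up
	}
	for n := 0; n < maxBlocks && cursor < m.end; n++ {
		// Genesis (height 0) is skipped and stays in the KV store.
		if blk, ok := m.kv[cursor]; ok && cursor != 0 {
			m.heightDB[cursor] = blk // copying is idempotent, so retries are safe
			pending = append(pending, cursor)
		}
		cursor++
		if len(pending) >= m.batchSize {
			flush()
		}
	}
	flush()
	return m.next >= m.end
}

// eta extrapolates the remaining time from the average rate so far.
func eta(elapsed time.Duration, done, total uint64) time.Duration {
	if done == 0 || done >= total {
		return 0
	}
	return time.Duration(float64(elapsed) * float64(total-done) / float64(done))
}

func main() {
	m := &migrator{
		kv:        map[uint64]blockData{},
		heightDB:  map[uint64]blockData{},
		end:       10,
		batchSize: 3,
	}
	for h := uint64(0); h < m.end; h++ {
		m.kv[h] = blockData{header: fmt.Sprintf("header-%d", h)}
	}
	fmt.Println(m.migrate(4), m.next)                // false 4 (paused part way)
	fmt.Println(m.migrate(100))                      // true (resumed and finished)
	fmt.Println(len(m.kv), len(m.heightDB))          // 1 9 (only genesis left in KV)
	fmt.Println(eta(time.Hour, 10_000_000, 30_000_000)) // 2h0m0s
}
```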

How this was tested

  • Unit tests for routing and migration behavior.
  • Multiple mainnet nodes with blockdb enabled running for weeks.
  • C-Chain migrations at 1M, 10M, and 20M blocks via re-execution, plus live-node migrations (including one with ~71M blocks).
  • A script comparing block data via EVM JSON-RPC between a blockdb node and a non-blockdb node.
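
For illustration, a comparison of this kind could look roughly like the sketch below. It is not the script used for this PR: `rpcFetcher`, `stubFetcher`, and `firstMismatch` are hypothetical names, and only blocks (via the standard `eth_getBlockByNumber` method) are compared. The `main` function uses stub data so the example runs without a node; against real nodes, `rpcFetcher` would be given each node's RPC URL.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"reflect"
)

// fetchFn returns the JSON-RPC result for the block at the given height.
type fetchFn func(height uint64) (json.RawMessage, error)

// rpcFetcher queries eth_getBlockByNumber (with full transactions) at url.
func rpcFetcher(url string) fetchFn {
	return func(height uint64) (json.RawMessage, error) {
		req, err := json.Marshal(map[string]any{
			"jsonrpc": "2.0",
			"id":      1,
			"method":  "eth_getBlockByNumber",
			"params":  []any{fmt.Sprintf("0x%x", height), true},
		})
		if err != nil {
			return nil, err
		}
		resp, err := http.Post(url, "application/json", bytes.NewReader(req))
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		var out struct {
			Result json.RawMessage `json:"result"`
			Error  *struct {
				Message string `json:"message"`
			} `json:"error"`
		}
		if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
			return nil, err
		}
		if out.Error != nil {
			return nil, fmt.Errorf("rpc error: %s", out.Error.Message)
		}
		return out.Result, nil
	}
}

// stubFetcher serves canned responses; missing heights yield JSON null.
func stubFetcher(blocks map[uint64]string) fetchFn {
	return func(height uint64) (json.RawMessage, error) {
		if s, ok := blocks[height]; ok {
			return json.RawMessage(s), nil
		}
		return json.RawMessage("null"), nil
	}
}

// firstMismatch compares blocks in [from, to) and returns the first height
// whose decoded JSON differs between the two sources.
func firstMismatch(a, b fetchFn, from, to uint64) (uint64, bool, error) {
	for h := from; h < to; h++ {
		ra, err := a(h)
		if err != nil {
			return h, false, err
		}
		rb, err := b(h)
		if err != nil {
			return h, false, err
		}
		var va, vb any
		if err := json.Unmarshal(ra, &va); err != nil {
			return h, false, err
		}
		if err := json.Unmarshal(rb, &vb); err != nil {
			return h, false, err
		}
		// Comparing decoded values ignores field order and whitespace.
		if !reflect.DeepEqual(va, vb) {
			return h, true, nil
		}
	}
	return 0, false, nil
}

func main() {
	withBlockDB := stubFetcher(map[uint64]string{1: `{"number":"0x1","hash":"0xaa"}`})
	withoutBlockDB := stubFetcher(map[uint64]string{1: `{"hash":"0xaa","number":"0x1"}`})
	_, found, err := firstMismatch(withBlockDB, withoutBlockDB, 1, 2)
	fmt.Println(found, err) // false <nil>
}
```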

Need to be documented in RELEASES.md?

No

@DracoLi DracoLi moved this to In Progress 🏗️ in avalanchego Nov 10, 2025
@DracoLi DracoLi self-assigned this Nov 10, 2025
@DracoLi DracoLi linked an issue Nov 12, 2025 that may be closed by this pull request

Projects

Status: In Progress 🏗️

Development

Successfully merging this pull request may close these issues.

Add evm database that supports separate storage for block data

2 participants