DB migration plan

# Local Statesync Plan

## Context

Live migration: a temporary migration DB syncs from the live execution DB via statesync, then cuts over. Both DBs share the same storage device. After cutover, the migration DB becomes primary and the old DB's seq chunks are recycled into the shared pool. Each DB needs its own metadata/root-offset backing and its own owned set of seq chunks, but seq ownership can be dynamic through the shared pool. The design should reuse the existing device/chunking model instead of repartitioning the whole pool from scratch for every migration.

## Phase 0: Decouple DB Metadata Backing From Fixed CNV Chunk Positions

The current code assumes the first conventional chunks have fixed meanings:
- `AsyncIO` hardcodes CNV chunk 0 as the metadata chunk
- `UpdateAux` hardcodes CNV chunk 0 for `db_metadata` and CNV chunk 1+ for root offsets
- the CLI restore/import path also hardcodes CNV chunk 0

This is too coupled to the old single-DB format. Before multi-DB support, we should make "where this DB's metadata/root offsets live" an explicit backing choice rather than "whichever CNV chunks happen to be first in the pool."

Phase 0 outcome:
- CNV chunks do not need to live at special positions in `storage_pool`
- `db_id=1` can still point at the legacy on-pool backing
- `db_id=2` can point at sidecar-backed metadata/root offsets
- all open paths resolve metadata/root-offset backing explicitly

### Phase 0 Non-Goals

Phase 0 is not the shared-free-list refactor. Do not touch seq allocation policy yet:
- do not change `fast` / `slow` / `free_list` behavior yet
- do not add the global pool freelist yet
- do not add dynamic seq ownership transfer yet
- do not repartition the pool footer or change chunk geometry

The goal is only to remove the positional assumptions around metadata/root-offset backing.

### Cleanest Implementation Boundary

The clean cut is:
- make `AsyncIO` effectively seq-only
- move metadata/root-offset backing selection into MPT open/setup code
- make `UpdateAux::set_io()` map metadata/root-offsets from an explicit backing descriptor rather than inferring them from `pool.chunk(cnv, 0)` and `pool.chunk(cnv, 1+)`

Important observation from the current code:
- `AsyncIO` only touches CNV in its constructor to register chunk-0 FDs with io_uring
- metadata/root-offset access is not performed through async reads/writes; `UpdateAux::set_io()` mmaps those backings directly

That means Phase 0 should not try to teach `storage_pool` about multiple logical DBs yet. The smallest clean refactor is to resolve DB backing once during open, then pass that resolved backing into `UpdateAux`.

### Recommended Backing Model

Add an explicit backing descriptor, resolved once per DB open:

```cpp
struct resolved_db_backing
{
    uint8_t db_id; // 1 = legacy primary, 2 = migration DB

    struct mapped_chunk_ref
    {
        int read_fd;
        int write_fd;
        uint64_t base_offset;
        uint64_t capacity;
        uint32_t logical_id;
    };

    mapped_chunk_ref metadata_chunk;                  // holds both metadata copies
    std::vector<mapped_chunk_ref> root_offset_chunks; // each holds both copies
};
```

Two backing modes are enough for v1:
- legacy pool-backed DB 1:
  - metadata chunk = pool CNV chunk 0
  - root-offset chunks = pool CNV chunks 1..N
- sidecar-backed DB 2:
  - metadata chunk = dedicated sidecar file
  - root-offset chunks = dedicated sidecar files

Do not add an arbitrary CNV-placement feature in Phase 0. Sidecar-backed DB 2 is enough to break the hardcoded single-DB assumption cleanly.

### Preserve The Existing On-Disk Copy Layout

Do not redesign the metadata format in Phase 0. Reuse the current "two copies in one CNV-sized backing object" layout:
- metadata backing stores copy A in the first half, copy B in the second half
- each root-offset backing does the same

For sidecar-backed DB 2, size each sidecar file exactly like one CNV chunk on the pool and keep the same half-and-half layout. This keeps the mmap logic almost identical to today.

### Do Not Treat `root_offsets.storage_.cnv_chunk_id` As A Physical Pool Chunk ID Anymore

This field should become a logical backing ID:
- for legacy DB 1, the logical IDs still happen to be `1..N` and resolve to pool CNV chunks
- for sidecar DB 2, the logical IDs also use `1..N` but resolve to sidecar root-offset files

That avoids a db-metadata format change in Phase 0. `UpdateAux::map_root_offsets()` should resolve through `resolved_db_backing.root_offset_chunks`, not by directly calling `pool.chunk(storage_pool::cnv, stored_id)`.

### Exact Phase 0 Code Changes

#### 0.1 Add an explicit backing spec to config

File:
- `category/mpt/ondisk_db_config.hpp`

Add:
- `uint8_t db_id{1};`
- a small optional backing override for non-legacy DBs, e.g.:

```cpp
struct DbBackingPaths
{
    std::filesystem::path metadata_path;
    std::vector<std::filesystem::path> root_offset_paths;
};
```

Then:
- `OnDiskDbConfig` gets `db_id` and `std::optional<DbBackingPaths> backing_paths`
- `ReadOnlyOnDiskDbConfig` gets the same

Rule:
- `db_id=1` with no `backing_paths` means legacy pool-backed DB
- `db_id=2` must provide `backing_paths`
- `root_offsets_chunk_count` only affects pool creation for the legacy pool-backed case; for sidecar-backed DB 2 it controls how many sidecar root-offset files to create, not how many CNV chunks the pool has

#### 0.2 Resolve backing once in `AsyncIOContext`

Files:
- `category/mpt/db.hpp`
- `category/mpt/db.cpp`
- new helper file is recommended, e.g. `category/mpt/db_backing.hpp/.cpp`

Add:
- `resolved_db_backing backing;` to `AsyncIOContext`
- helper `resolve_db_backing(storage_pool&, options)` returning `resolved_db_backing`

`resolve_db_backing(...)` should also own sidecar lifecycle for writable opens:
- create/truncate sidecar metadata/root-offset files when opening a fresh writable DB 2
- reopen existing sidecar files for append/open-existing mode
- size every sidecar file to exactly one CNV chunk capacity

Why here:
- every open path already funnels through `AsyncIOContext`
- once `AsyncIOContext` owns the resolved backing, both RO and RW worker-thread paths inherit it automatically
- this is also where the current `pool_options.num_cnv_chunks = root_offsets_chunk_count + 1` logic must become conditional on using the legacy pool-backed DB format

#### 0.3 Stop making `AsyncIO` responsible for metadata backing

Files:
- `category/async/io.hpp`
- `category/async/io.cpp`

Change:
- remove `cnv_chunk_` from `AsyncIO`
- stop registering CNV chunk 0 with io_uring
- keep `AsyncIO` focused on seq chunks only

This is the cleanest boundary because no async code currently uses CNV for metadata/root-offset access after construction.

#### 0.4 Change `UpdateAux::set_io()` to take the resolved backing

Files:
- `category/mpt/trie.hpp`
- `category/mpt/update_aux.cpp`

Signature change:

```cpp
void set_io(
    AsyncIO &,
    resolved_db_backing const &,
    std::optional<uint64_t> history_length = {});
```

Constructor overloads should follow the same pattern.

Implementation split in `update_aux.cpp`:
- `map_db_metadata_from_backing(...)`
- `map_root_offsets_from_backing(...)`
- `initialize_new_db_from_backing(...)`

Specific replacements inside current `set_io()`:
- replace `pool.chunk(cnv, 0)` with `backing.metadata_chunk`
- replace the current `for each stored cnv_chunk_id -> pool.chunk(cnv, id)` logic with lookup in `backing.root_offset_chunks`
- on fresh DB init, allocate/zero the configured root-offset backings instead of hardcoding CNV chunks `1..N`

#### 0.5 Keep fresh DB initialization identical except for backing selection

In `UpdateAux::set_io()`:
- keep the double-copy metadata initialization
- keep the history/ring sizing rules
- keep the existing free/fast/slow initialization
- only change where metadata/root-offset bytes are written

That means Claude should avoid touching:
- list semantics
- chunk insertion order
- `capacity_in_free_list`
- seq-chunk initialization

#### 0.6 Update every constructor path that builds `UpdateAux`

Files:
- `category/mpt/db.cpp`

Call sites that must pass the new backing:
- `Db::ROOnDiskBlocking`
- `OnDiskWithWorkerThreadImpl::DbAsyncWorker` (both RO and RW constructors)
- any direct `UpdateAux` construction in tests/tooling

This is the spot where hidden default-DB paths usually survive, so Phase 0 is not done until every one of these passes the same resolved backing.

#### 0.7 Update CLI restore/import to use the same backing helpers

File:
- `category/mpt/cli_tool_impl.cpp`

Minimum requirement for Phase 0:
- remove direct `pool->chunk(cnv, 0)` metadata mapping
- use the same metadata-backing resolution helper used by `AsyncIOContext`

It is okay if full archive/export support for sidecar-backed DB 2 is deferred, but restore/open code must stop assuming metadata always lives in pool CNV chunk 0.

### Phase 0 Validation

Add or update tests for these exact cases:
- legacy DB 1 still initializes and reopens with pool CNV chunk 0 metadata and pool CNV `1..N` root offsets
- sidecar-backed DB 2 initializes and reopens using sidecar metadata/root-offset files
- RO open path uses the same resolved backing as RW open path
- worker-thread opens (`RODb`, RW worker) use the same resolved backing as direct opens
- `UpdateAux::set_io()` no longer contains any hardcoded `pool.chunk(storage_pool::cnv, 0)` or `for (n = 2; ...) pool.chunk(storage_pool::cnv, n)` assumptions

## Phase 1: Shared Pool Free List

Extend the existing `storage_pool` chunk management (not a new system) with a global seq-chunk free list.

Important design correction: split responsibilities cleanly. Pool metadata owns the global free list of unowned chunks. Triedb `db_metadata` remains durable mmapped state for per-DB placement and reuse of chunks already held by that DB. `fast`/`slow` stay DB-local implementation details, and the current DB-local `free_list` is repurposed as a DB-local recycle/reserve list rather than a global free-space list. Ownership is implicit: if a seq chunk is on the pool free list, it is unowned; otherwise it belongs to some DB.

### Implicit Ownership

If a chunk is in the global free list, it is unowned. If it is not in the global free list, it is owned by some DB. Within that DB, the chunk may be active in `fast`/`slow` or sitting on the DB-local recycle list ready for reuse.

### Global Freelist: Pool-Level Lock + Index-Linked List

```cpp
struct pool_freelist {
    std::atomic<uint8_t> lock;         // small pool-level spinlock or mutex wrapper
    uint16_t head;                     // first free chunk index (0xFFFF = empty)
    uint16_t next_free[MAX_CHUNKS];   // singly-linked list via indices
};
```

This lock only protects pool free-list transitions. In the common single-DB case it is touched only when a chunk is acquired from or returned to the global free list, so steady-state overhead should be close to zero.

Practical note for v1:
- use a simple sidecar spinlock or mutex byte; do not try to invent a lock-free allocator here
- pool free-list operations are not on the hot path of every node write
- the hot path should stay DB-local through the recycle list

### DB Identity and Metadata Backing

Do not add a pool catalog. Existing pools derive chunk geometry from the current footer layout, and existing pools cannot grow `num_cnv_chunks` in place. For the same reason, do not extend the existing pool footer with the new global free-list structure on existing pools. Instead, v1 uses:
- the existing pool footer unchanged
- a small sidecar file for global free-list metadata
- a fixed DB convention:

```cpp
struct db_open_spec {
    uint8_t db_id;                     // 1 = existing primary, 2 = migration DB
    std::span<std::filesystem::path const> metadata_backing_paths;
    std::optional<uint32_t> chunk_limit;
};
```

V1 assumption:
- `db_id=1` uses the existing on-pool metadata/root-offset backing
- `db_id=2` uses explicitly provided sidecar-backed metadata/root-offset files
- callers must pass `db_id` through every open path
- cutover updates the process/config that chooses which DB backing is treated as primary

### Global Free-List Recovery

The sidecar free-list metadata is authoritative during normal runtime, but it must be rebuildable. On migration-tool startup, or after any detected dirty/corrupt sidecar state:
1. Open both DB metadata backings under exclusive migration lock
2. Walk each DB's `fast`/`slow`/local-recycle lists
3. Mark those chunks as in-use
4. Rebuild the global pool free list from the remaining seq chunks

This gives a simple crash-recovery rule without adding explicit owner arrays.

Recovery assumptions for Claude:
- the sidecar is authoritative only while it is clean
- if the sidecar is missing, dirty, or fails validation, rebuild it from DB metadata instead of trying to repair it incrementally
- rebuild requires exclusive migration/process ownership of both DB handles for the duration of the scan

### Operations

**Allocate chunk for DB X**:
1. Acquire the pool free-list lock
2. Pop `head`
3. Release the lock
4. Hand the chunk to DB X, which may place it directly into `fast`/`slow` or park it on the DB-local recycle list first

**Reuse chunk inside DB X**:
1. Prefer popping from DB X's local recycle list
2. If empty and `chunk_limit` allows, allocate from the global pool free list
3. If empty and `chunk_limit` would be exceeded, compact / reclaim within DB X and retry the local recycle list before touching the pool again

**Free chunk from DB X**:
1. Remove chunk from DB X's `fast`/`slow` metadata
2. Trim/destroy chunk contents
3. Return it to DB X's local recycle list by default
4. Only explicit shrink/destroy paths return chunks to the global pool free list

**Destroy DB X** (bulk reclaim):
1. DB X must already be quiesced and all DB handles closed
2. Acquire the pool free-list lock
3. Walk DB X's owned seq chunks, trim them, remove them from DB X metadata, and push them onto the global free list
4. Release the lock

**Bootstrap DB 2 on an existing pool**:
1. Quiesce DB 1 and close helper handles long enough to take a consistent snapshot
2. Reinterpret DB 1's existing `free_list` as DB 1's local recycle list
3. Carve out an initial budget for DB 2 by moving selected chunks from DB 1's recycle list into the global pool free list, then letting DB 2 allocate them
4. Create DB 2 sidecar metadata/root-offset backing
5. Open DB 1 with `db_id=1` and DB 2 with `db_id=2`
6. From this point on, each DB prefers its own recycle list; the pool is used when a DB needs new unowned chunks

Bootstrap assumptions:
- v1 does not need a perfect automatic rebalance policy; a fixed initial carve-out is enough
- carve-out should only move chunks that are already empty and on DB 1's recycle list
- do not steal active `fast`/`slow` chunks during bootstrap
- if the first deployment is allowed to hard-reset the pool, this entire bootstrap path can be skipped and both DBs can start from a freshly initialized layout

**Cutover** (migration complete):
1. Stop source execution writes and stop issuing new statesync work
2. Drain in-flight statesync/server reads, then close source RW/RO handles (`TrieDb`, source `Db`, `sctx.ro`, server/client contexts)
3. Reopen DB B as the live execution DB / make it the configured primary
4. Destroy DB A — all its seq chunks return to the global pool free list
5. DB B can now allocate from the expanded free list

### Per-DB Size Configuration

`chunk_limit` is a per-DB policy. It can live in config and/or be mirrored in DB metadata for observability. When a DB approaches its limit, the policy should prefer compaction and reuse from that DB's recycle list instead of pulling new chunks from the global pool. The temporary migration DB can be capped while the old primary is live. After cutover, the promoted DB just allocates from the same global free list without the cap.

CNV backing is separate and fixed at DB creation time. Existing pools cannot grow `num_cnv_chunks` in place, so v1 still must not depend on repartitioning the pool's CNV region. For existing pools, the current primary DB keeps its existing CNV assignment and the temporary migration DB uses dedicated sidecar-backed metadata/root-offset storage on the same device.

### On-Disk Layout

```
Device:
┌─────────────────────────────────────────────────┐
│ Existing pool layout remains in place             │
│   DB 1: existing cnv assignment                  │
│   DB 2: sidecar metadata/root offsets (v1)       │
├─────────────────────────────────────────────────┤
│ SEQ chunks                                        │
│   Interleaved across both DBs                    │
│   ownership implicit from pool free list vs DB use│
│   DB-local recycle lists stay in DB metadata      │
├─────────────────────────────────────────────────┤
│ chunk_bytes_used[N]   (existing, atomic<uint32>) │
│ metadata_t            (existing, 64 bytes)       │
├─────────────────────────────────────────────────┤
│ Pool freelist sidecar  (NEW, head + next_free)   │
└─────────────────────────────────────────────────┘
```

### Changes to Existing Code

| File | Change | Effort |
|------|--------|--------|
| `async/storage_pool.hpp` | Add pool-level global free-list helpers plus `allocate_chunk(db_id)` / `free_chunk_to_pool(chunk)` APIs. Do not add DB-placement semantics here. | Medium |
| `async/storage_pool.cpp` | Implement global free-list operations under the pool lock, bootstrap migration from the existing DB-local free list, and final reclaim back into the pool. Keep the existing pool footer unchanged. | Medium |
| `async/io.hpp` / `async/io.cpp` | Make AsyncIO use the explicit `db_id`/path-selected metadata backing instead of hardcoding CNV chunk 0. | Medium |
| `mpt/db.hpp` | `AsyncIOContext` accepts optional shared `storage_pool*`, and all open-path config structs carry `db_id`. | Small |
| `mpt/db.cpp` | `AsyncIOContext`, `Db`, and `RODb` constructors: use shared pool if provided and resolve metadata backing from explicit `db_id`/path convention. | Small |
| `mpt/db.cpp` (DbAsyncWorker) | Forward shared pool and `db_id` to worker threads so helper opens do not fall back to the default DB. | Small |
| `mpt/update_aux.cpp` | `set_io()` reads metadata backing from explicit `db_id`/path convention instead of hardcoding chunk 0. `map_root_offsets()` uses the configured backing. Keep a DB-local recycle list in `db_metadata` for owned reusable chunks; `fast`/`slow` remain in `db_metadata`, and new chunks come from the pool only when the recycle list is empty and policy allows. | Large |
| `mpt/ondisk_db_config.hpp` | Add `db_id` and optional `chunk_limit` to both `OnDiskDbConfig` and `ReadOnlyOnDiskDbConfig`. Valid DB IDs start at 1, and callers must pass it explicitly for every migration-related open. | Trivial |
| `mpt/cli_tool_impl.cpp` | Remove remaining hardcoded CNV chunk 0 assumptions in archive/import tooling so tooling can open non-primary DB slots correctly. | Small |
| `cmd/monad_local_statesync.cpp` | Hold the external migration lock, rebuild allocator state on open, drive statesync, promote migration DB, then reclaim old primary. Optionally host DB 1 + DB 2 on one migration-owned worker thread. | Medium |
| `cmd/monad_pool_freelist_sidecar.*` (new) | Read/write the global free-list sidecar and rebuild it from DB metadata when dirty or missing. | Medium |

### Phase 1 Assumptions To Keep Tight

- Do not redesign `db_metadata` in Phase 1. Keep the list structure and chunk-info entries, but reinterpret `free_list` as DB-local recycle.
- Do not make `storage_pool` aware of `fast`/`slow`.
- Do not try to support arbitrary live chunk migration between DBs yet; only pool allocation, local reuse, bootstrap carve-out, and bulk reclaim.
- Keep the sidecar free-list format boring: header + head + next array + dirty bit/version.

### Backwards Compatibility

Single-DB: use `db_id=1`. The global pool lock is only touched when a chunk is acquired or returned, so steady-state single-DB execution should see near-zero overhead. Behavior is otherwise identical to current code.

### Suggested Implementation Steps

#### Phase 0A: Introduce explicit metadata/root-offset backing selection

Goal: remove the hardcoded "CNV chunk 0 / CNV chunk 1+" assumptions before touching allocation.

Scope:
- `async/io.hpp`
- `async/io.cpp`
- `mpt/db.hpp`
- `mpt/db.cpp`
- `mpt/update_aux.cpp`
- `mpt/ondisk_db_config.hpp`
- `mpt/cli_tool_impl.cpp`

Behavior:
- add explicit `db_id`
- make metadata/root-offset backing explicit per DB open
- keep single-DB behavior unchanged for `db_id=1`
- no shared freelist yet

#### Phase 1A: Add global pool free-list sidecar for the existing single DB

Goal: introduce the pool-level free-list sidecar and recovery logic without changing open-path identity or statesync.

Scope:
- `async/storage_pool.hpp`
- `async/storage_pool.cpp`
- `cmd/monad_pool_freelist_sidecar.*`
- `async/test/storage_pool.cpp`

Behavior:
- keep `db_id=1` only
- bootstrap the sidecar by reinterpreting the existing DB-local `free_list` as DB-local recycle space and moving globally free chunks into the pool sidecar
- no second DB yet
- no statesync yet

Implementation notes:
- this phase should only touch `storage_pool` plus sidecar code
- it should not yet change `replace_node_writer()` or `UpdateAux::append/remove()`
- for a brand-new pool, initialize the sidecar directly from all free seq chunks
- for an existing pool, populate the sidecar from the DB-local `free_list` once, under exclusive ownership
- after Phase 1A, the rest of triedb may still behave as if `free_list` is the allocator source; that semantic flip happens in Phase 1B

#### Phase 1B: Switch triedb allocation from DB-local free list to local recycle + pool allocate

Goal: keep single-DB behavior green while changing the actual allocator boundary.

Scope:
- `mpt/detail/db_metadata.hpp`
- `mpt/update_aux.cpp`
- `mpt/trie.cpp`
- `mpt/trie.hpp`
- `mpt/cli_tool_impl.cpp`
- `mpt/test/*` touching free-list assumptions

Behavior:
- DB-local `free_list` becomes a local recycle list
- node-writer rollover stops pulling from `db_metadata()->free_list_end()` and allocates from `storage_pool` when local recycle is empty
- `capacity_in_free_list` and related free-space reporting are updated to reflect the new meaning or split into local-recycle vs global-pool metrics
- still single-DB only

Implementation notes:
- the first allocator hot spots are:
  - `UpdateAuxImpl::append()` / `remove()` in `category/mpt/update_aux.cpp`
  - `replace_node_writer_to_start_at_new_chunk()` in `category/mpt/trie.cpp`
  - `replace_node_writer()` in `category/mpt/trie.cpp`
- keep the existing list-manipulation helpers if possible; change their meaning before renaming them
- the safest sequence is:
  1. add pool allocation/free APIs
  2. add `UpdateAux` helpers such as `pop_recycle_chunk()` / `allocate_chunk_for_writer()`
  3. switch the two node-writer rollover sites to those helpers
  4. only then update accounting/reporting
- do not mix multi-DB logic into Phase 1B; keep all tests single-DB

#### Phase 2: Multi-DB open/backing plumbing

Goal: make it possible to open DB 1 and DB 2 on the same underlying device before statesync exists.

Scope:
- tests / tooling / `monad_mpt` setup for two DBs on one device
- optional shared migration-owned worker thread

Behavior:
- `monad_mpt` can initialize DB 1 + DB 2 on the same block device
- DB 1 and DB 2 have distinct metadata/root-offset backings
- DB 1 and DB 2 can both own seq chunks and allocate through the shared pool
- basic two-DB open/write/isolation works

Implementation notes:
- Phase 2 is where command/config plumbing grows up, not where allocator semantics change again
- the key callers to update are:
  - `cmd/monad/main.cpp`
  - `cmd/monad_cli.cpp`
  - any `monad_mpt` setup path that currently only knows one DB per pool
- initialize DB 2 by creating its sidecar metadata/root-offset files and then opening it through the normal `Db` path with `db_id=2`
- isolation test should prove:
  - DB 1 and DB 2 can both write
  - each DB sees only its own metadata/root history
  - both allocate from the shared pool without corrupting each other
- if shared-worker mode is used, keep it outside the generic `Db` interface: it should be a migration-tool-owned composition of two normal DB contexts

#### Phase 3: Local statesync and cutover

Goal: build the migration workflow on top of the allocator and multi-DB plumbing.

Scope:
- `cmd/monad_local_statesync.cpp`
- `cmd/CMakeLists.txt`
- statesync tests / migration integration tests

Behavior:
- source DB and migration DB run concurrently
- progressive statesync fills DB 2
- cutover quiesces DB 1, promotes DB 2, reclaims DB 1 back into the global pool

Critical process assumption:
- v1 should assume the migration tool/process owns both DB handles and the pool free-list sidecar during the migration workflow
- do not support an unrelated external writer mutating pool ownership state concurrently with the migration tool
- if that assumption changes later, the allocator protocol must become inter-process authoritative rather than rebuild-on-open

## Phase 3: Multi-DB Local Statesync

### In-Memory Bridge (~40 lines)

From `test_statesync.cpp:49-139`: `monad_statesync_client`/`monad_statesync_server_network` structs + four function pointers.

### Server Setup

1. Open shared `storage_pool` over the device
2. Open source DB (db_id=1, writable, shared pool) → `TrieDb` → `monad_statesync_server_context`
3. Clone pool read-only with `db_id=1` as well → set as `sctx.ro`

Specific mapping to current code:
- server deletion tracking lives in `category/statesync/statesync_server_context.cpp`
- finalized deletes are accumulated through `monad_statesync_server_context::commit()` plus `finalize()`
- the local migration command should reuse those existing hooks instead of inventing a second deletion log

### Commit Blocks Through Server Context

Implement a small replay loop in `monad_local_statesync.cpp` that loads finalized blocks sequentially and commits them through the server-side execution path so `monad_statesync_server_context` accumulates `FinalizedDeletions`. There is no existing `commit_sequential(...)` helper to reuse as-is.

### Client Setup + Progressive Statesync

1. Open dest DB (db_id=2, writable, shared pool, chunk_limit=N)
2. Create `monad_statesync_client_context`
3. Wire server ↔ client, `handle_new_peer` × 256
4. Progressive: `handle_target(Ti)` → drain → repeat
5. `finalize()`

Specific mapping to current code:
- use the same bridge pattern as `category/statesync/test/test_statesync.cpp`
- use the same target-driving pattern as `sync_from_some` style tests
- keep the first implementation strictly in-process; do not add sockets, RPC, or a separate transport

### Optional Shared Worker Thread

If hardware is tight, the migration process may optionally host DB 1 and DB 2 on a single shared triedb worker thread instead of one `DbAsyncWorker` per DB. In that mode, one thread owns two `AsyncIOContext` / `UpdateAux` pairs and dispatches work by `db_id`. This should stay a migration-tool concern rather than a required change to the generic `Db` API.

### Cutover

1. Statesync reaches target → dest DB has valid state
2. Stop source writes, drain statesync/server work, and close source-side handles
3. Reconfigure the live process so dest (`db_id=2`) becomes the primary DB backing
4. Destroy source DB → all its seq chunks return to the global pool free list
5. Reopen dest DB as the live execution DB

Cutover assumptions:
- reclaimed DB 1 seq chunks must not be returned to the pool until all DB 1 handles are closed
- this includes helper RO opens and any `sctx.ro` clone
- promotion should be a config/process-level switch, not an in-place mutation of DB IDs

### Files to Create/Modify

- **Create**: `cmd/monad_local_statesync.cpp`
- **Modify**: `cmd/CMakeLists.txt`

### Reference

- `test_statesync.cpp:49-196` — bridge pattern
- `test_statesync.cpp:311-491` — progressive targets (`sync_from_some`)
- `statesync_server_context.cpp:38-105` — deletion tracking
- `statesync_client.cpp:101-157` — client lifecycle

## Verification

1. `cmake --build build-claude -j$(nproc)`
2. Existing statesync tests: `ctest --test-dir build-claude -R statesync --timeout 30`
3. Single-DB backwards compat: existing `monad`/`monad-cli` unchanged
4. Two-DB pool test: create two DBs on same device, write to both, verify isolation
5. Bootstrap migration test: reinterpret DB 1's current free list as a local recycle list, carve an initial budget into the global pool, then open DB 1 and DB 2 concurrently
6. Local-reuse test: a DB compacts and reuses its own recycled chunks without touching the pool
7. Global free-list test: when a DB needs new unowned chunks, both DBs can allocate from the pool and explicit shrink/destroy returns chunks globally
8. Bulk reclaim test: destroy one DB, verify chunks returned, other DB can use them after cutover
9. Local statesync end-to-end with cutover
10. Quiesce test: verify that source handles are closed before reclaimed chunks are reused by dest
11. `db_id` plumbing test: open the same pool through writable DB, `RODb`, worker-thread DB, and `sctx.ro`, and verify each path resolves the intended DB slot
12. Shared-worker test: run DB 1 + DB 2 through one migration-owned worker thread and verify request routing / shutdown are correct

Recommended execution order for Claude:
1. land Phase 0 and keep all existing single-DB tests green
2. land Phase 1A and validate sidecar rebuild/init separately
3. land Phase 1B and validate allocator rollover separately
4. land Phase 2 with a focused two-DB open/isolation test
5. only then land Phase 3 statesync/cutover

## Design References

- **LMDB**: use dual meta pages and a single authoritative metadata plane. In the same spirit, pool ownership/free-list state should have one authority, and RO/RW opens should both resolve the same `db_id`.
- **ZFS**: treat migration like clone promotion plus checkpointed cutover. The temporary migration DB becomes primary, and reclamation of the old primary happens only after the destructive step boundary.
- **memcached**: keep the allocator state simple and central, and recycle pages/chunks through one free structure rather than many duplicated ownership views.
- **Linux kernel**: use a stable ID registry plus grace-period-style reclamation. `db_id` assignment should look like an IDA/XArray-style registry, and old-primary chunk reclaim should happen only after all readers are gone.


File	Change	Effort
`async/storage_pool.hpp`	Add pool-level global free-list helpers plus `allocate_chunk(db_id)` / `free_chunk_to_pool(chunk)` APIs. Do not add DB-placement semantics here.	Medium
`async/storage_pool.cpp`	Implement global free-list operations under the pool lock, bootstrap migration from the existing DB-local free list, and final reclaim back into the pool. Keep the existing pool footer unchanged.	Medium
`async/io.hpp` / `async/io.cpp`	Make AsyncIO use the explicit `db_id`/path-selected metadata backing instead of hardcoding CNV chunk 0.	Medium
`mpt/db.hpp`	`AsyncIOContext` accepts optional shared `storage_pool*`, and all open-path config structs carry `db_id`.	Small
`mpt/db.cpp`	`AsyncIOContext`, `Db`, and `RODb` constructors: use shared pool if provided and resolve metadata backing from explicit `db_id`/path convention.	Small
`mpt/db.cpp` (DbAsyncWorker)	Forward shared pool and `db_id` to worker threads so helper opens do not fall back to the default DB.	Small
`mpt/update_aux.cpp`	`set_io()` reads metadata backing from explicit `db_id`/path convention instead of hardcoding chunk 0. `map_root_offsets()` uses the configured backing. Keep a DB-local recycle list in `db_metadata` for owned reusable chunks; `fast`/`slow` remain in `db_metadata`, and new chunks come from the pool only when the recycle list is empty and policy allows.	Large
`mpt/ondisk_db_config.hpp`	Add `db_id` and optional `chunk_limit` to both `OnDiskDbConfig` and `ReadOnlyOnDiskDbConfig`. Valid DB IDs start at 1, and callers must pass it explicitly for every migration-related open.	Trivial
`mpt/cli_tool_impl.cpp`	Remove remaining hardcoded CNV chunk 0 assumptions in archive/import tooling so tooling can open non-primary DB slots correctly.	Small
`cmd/monad_local_statesync.cpp`	Hold the external migration lock, rebuild allocator state on open, drive statesync, promote migration DB, then reclaim old primary. Optionally host DB 1 + DB 2 on one migration-owned worker thread.	Medium
`cmd/monad_pool_freelist_sidecar.*` (new)	Read/write the global free-list sidecar and rebuild it from DB metadata when dirty or missing.	Medium

DB migration plan #2171

Description

Local Statesync Plan

Context

Phase 0: Decouple DB Metadata Backing From Fixed CNV Chunk Positions

Phase 0 Non-Goals

Cleanest Implementation Boundary

Recommended Backing Model

Preserve The Existing On-Disk Copy Layout

Do Not Treat root_offsets.storage_.cnv_chunk_id As A Physical Pool Chunk ID Anymore

Exact Phase 0 Code Changes

0.1 Add an explicit backing spec to config

0.2 Resolve backing once in AsyncIOContext

0.3 Stop making AsyncIO responsible for metadata backing

0.4 Change UpdateAux::set_io() to take the resolved backing

0.5 Keep fresh DB initialization identical except for backing selection

0.6 Update every constructor path that builds UpdateAux

0.7 Update CLI restore/import to use the same backing helpers

Phase 0 Validation

Phase 1: Shared Pool Free List

Implicit Ownership

Global Freelist: Pool-Level Lock + Index-Linked List

DB Identity and Metadata Backing

Global Free-List Recovery

Operations

Per-DB Size Configuration

On-Disk Layout

Changes to Existing Code

Phase 1 Assumptions To Keep Tight

Backwards Compatibility

Suggested Implementation Steps

Phase 0A: Introduce explicit metadata/root-offset backing selection

Phase 1A: Add global pool free-list sidecar for the existing single DB

Phase 1B: Switch triedb allocation from DB-local free list to local recycle + pool allocate

Phase 2: Multi-DB open/backing plumbing

Phase 3: Local statesync and cutover

Phase 3: Multi-DB Local Statesync

In-Memory Bridge (~40 lines)

Server Setup

Commit Blocks Through Server Context

Client Setup + Progressive Statesync

Optional Shared Worker Thread

Cutover

Files to Create/Modify

Reference

Verification

Design References

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Do Not Treat `root_offsets.storage_.cnv_chunk_id` As A Physical Pool Chunk ID Anymore

0.2 Resolve backing once in `AsyncIOContext`

0.3 Stop making `AsyncIO` responsible for metadata backing

0.4 Change `UpdateAux::set_io()` to take the resolved backing

0.6 Update every constructor path that builds `UpdateAux`