183 commits
313f15b
Add off-chain validator metadata types + consensus variants
omersadika May 17, 2026
c550c0e
P2P blob endpoint + perpetual mpc_artifact_blobs + startup hydration
omersadika May 17, 2026
928c27d
Producer helpers + record path for validator mpc_data announcements
omersadika May 17, 2026
466f378
Record EpochMpcDataReadySignal + freeze mpc_data on first quorum
omersadika May 17, 2026
9a8b99e
Add SubmitMpcDataAnnouncement RPC + late-binding relay handle
omersadika May 17, 2026
8ca21d2
Joiner mpc_data announcement verification path
omersadika May 17, 2026
d08fab7
Pure handoff attestation build/sign/aggregate helpers
omersadika May 17, 2026
eeed644
Record handoff signatures into the per-epoch store
omersadika May 17, 2026
fecc140
Persist CertifiedHandoffAttestation to perpetual storage
omersadika May 17, 2026
7dec61d
Emit local handoff signature on EndOfPublish
omersadika May 17, 2026
361eef6
Serve handoff certs over Anemo + joiner bootstrap verify
omersadika May 17, 2026
75e882f
Cache DKG/reconfig output digests at Finalize for handoff
omersadika May 17, 2026
63eb76d
NetworkKeyDKGReadySignal + per-key freeze trigger
omersadika May 17, 2026
2c4669d
Effective reconfig input set = frozen ∩ (V_e ∪ V_{e+1})
omersadika May 17, 2026
c74d438
Off-chain DWalletNetworkEncryptionKeyData fetch with fallback
omersadika May 17, 2026
1993cf6
Off-chain Committee class-groups assembly with completion gate
omersadika May 17, 2026
7f67db5
Gate network DKG / reconfig session kickoff on off-chain freeze
omersadika May 17, 2026
96bc1a0
Broadcast mpc_data announcement + ready signals at epoch start
omersadika May 17, 2026
08b9879
Install JoinerPubkeyProvider from next-epoch committee
omersadika May 17, 2026
dd23f8c
Install ConsensusPubkeyProvider from current committee
omersadika May 17, 2026
e7cb91a
Overlay network key data with off-chain blobs in sui_syncer
omersadika May 17, 2026
d1773cb
Off-chain class-groups assembly in sui_syncer::new_committee
omersadika May 17, 2026
b74042d
Install joiner announcement relay on the Anemo server
omersadika May 17, 2026
e53c9f4
Move per-epoch consensus tasks into a new epoch_tasks module
omersadika May 17, 2026
2ab9a68
Decouple handoff from validator metadata
omersadika May 17, 2026
250750e
Rename ika-network::validator_metadata to mpc_artifacts + split
omersadika May 17, 2026
72a169a
Gate off-chain validator metadata behind protocol config flag
omersadika May 17, 2026
03ffb2c
Merge origin/dev into feat/off-chain-metadata-v2
omersadika May 22, 2026
4eb17a3
Expose validator-management bootstrap helpers
omersadika May 23, 2026
f4523c7
Merge remote-tracking branch 'origin/dev' into feat/off-chain-metadat…
omersadika May 24, 2026
27afa64
Add IkaTestCluster joiner helper + test
omersadika May 24, 2026
9648f1c
Add IkaTestCluster remove_validator helper + test
omersadika May 24, 2026
561d3f3
Add user-DKG ceremony + test_sessions_complete_across_epoch_switch
omersadika May 24, 2026
7b79982
Fix cross-cluster contamination in SuiClient shared-arg cache
omersadika May 24, 2026
59a8efd
Merge remote-tracking branch 'origin/dev' into feat/off-chain-metadat…
omersadika May 24, 2026
b8fc060
Enable internal_presign_sessions at v4
omersadika May 24, 2026
a7d1d81
Add multi-epoch user-session stress test
omersadika May 24, 2026
815879c
Fix handoff cert persistence + hydrate digest cache from chain
omersadika May 25, 2026
ab70f8d
Add EndOfPublishV2 consensus message variant
omersadika May 25, 2026
e5fb1c7
Add bundled_handoff_in_end_of_publish protocol flag
omersadika May 25, 2026
a7c33b1
Wire EndOfPublishV2 producer + consumer
omersadika May 25, 2026
0e70f16
Bump per-cycle epoch-advance timeout to 600s in churn test
omersadika May 25, 2026
a9792d0
Gate V2 by off_chain_validator_metadata; fix sync stale-snapshot race
omersadika May 25, 2026
531aa6e
Surface per-item digest diffs in AttestationMismatch log
omersadika May 25, 2026
49decc6
Wire off_chain mode to skip chain blob reads; add v4 cluster test
omersadika May 25, 2026
eb92b86
Mark off_chain blob-read assertion test as #[ignore]
omersadika May 25, 2026
ae3aefe
Investigate off-chain announcement propagation gap; identify P2P fetc…
omersadika May 25, 2026
4f05189
Wire P2P fetch_blob into peer-blob propagation; close off-chain gap
omersadika May 25, 2026
acc80f9
Add multi-network-key DKG cluster test + fix v3-shape mpc_data blob
omersadika May 25, 2026
f965b5f
Persist per-key DKG/reconfig digest map across epochs
omersadika May 25, 2026
8b7dbc1
Cache DKG/reconfig output digests from consensus-voted data
omersadika May 25, 2026
9a8398a
Re-broadcast NetworkKeyData on content change; add multi-key tests
omersadika May 26, 2026
2be3d94
Address PR review punch-list: freeze race, EOPV2 hardening, blob safety
omersadika May 26, 2026
41bc8ba
Exclude-on-bad-mpc-data freeze gate; drop chain fallback under v4
omersadika May 26, 2026
6fed770
Receive-time canonicalize ready signal; decode-validate peer blobs
omersadika May 26, 2026
cec2fc6
Bound pending handoff buffer; re-emit ready signal on growth; doc sweep
omersadika May 26, 2026
94466dc
Surface byzantine padding via canonicalize diagnostics
omersadika May 26, 2026
6de2abb
Pin handoff-aggregator replay invariants for restart safety
omersadika May 26, 2026
4c0a2c5
Document NetworkKeyDKGReadySignal dead-consumer status
omersadika May 26, 2026
d60a501
Address third-pass review: warn placement, FQ paths, replay test
omersadika May 26, 2026
5cd1236
Doc sweep: fix lies, ambiguity, and stale plan-phase tags
omersadika May 27, 2026
aaf9e10
Fix EpochMpcDataReadySignal re-emit silently dropped by consensus dedup
omersadika May 27, 2026
39ecfc8
Reject empty off-chain assembly; use frozen set as post-freeze truth
omersadika May 27, 2026
936d2e8
Gate self-attestation on own-blob health; reject sentinel timestamp_ms=0
omersadika May 27, 2026
faa9bf1
Add cert dup-signer, quorum-boundary, sentinel-timestamp tests
omersadika May 27, 2026
751e431
Extract assembly + self-attest decisions into testable pure helpers
omersadika May 27, 2026
9c864bc
docs: in-progress review of off-chain-metadata-v2 (predates 14 commits)
ycscaly May 28, 2026
e1202e7
docs: annotate review with verdicts vs current branch tip
ycscaly May 28, 2026
be254d5
Add write-through/read-through BlobCache; serve perpetual-only blobs
omersadika May 28, 2026
3c47984
Split announcement into self/relayed kinds; drop BLS for Ed25519
omersadika May 28, 2026
73f4ab8
Add joiner announcement fan-out task with P2P retry
omersadika May 28, 2026
ee385e3
Make the producer's announcement self-heal via confirmation retries
omersadika May 28, 2026
5a490ef
Wire joiner announcement fan-out into node startup
omersadika May 28, 2026
2a0f655
Delay the freeze until next-epoch joiners can be attested (F4-1)
omersadika May 28, 2026
cd42e9c
Fix peer_blob_fetcher to read the bare announcement table value
omersadika May 28, 2026
d02019c
Fix doc inaccuracies introduced by the Ed25519/freeze-delay refactors
omersadika May 28, 2026
95a3f5c
Don't cache empty network-key blobs when off-chain overlay isn't ready
omersadika May 28, 2026
69995f5
Surface F4-1 deadline emits that exclude unvalidated next-epoch members
omersadika May 28, 2026
159c190
Drop dead NetworkKeyDKGReadySignal plumbing
omersadika May 28, 2026
2f7e653
Unify the two pubkey-provider updaters into one generic task
omersadika May 28, 2026
7ecfa69
Extract handoff-cert subsystem into its own module
omersadika May 28, 2026
155ed58
Bind verify_joiner_bootstrap_cert to an expected prior epoch
omersadika May 28, 2026
7a27837
Wire joiner cert-bootstrap consumer into node startup
omersadika May 29, 2026
c309e75
Add explicit F4-1 cluster test: joiner lands in next committee class-…
omersadika May 29, 2026
fd3e0fd
Break the joiner freeze deadlock: gate on chain (not assembled) commi…
omersadika May 29, 2026
cc455e2
Brisk joiner fan-out retry; ignore timing-bound F4-1 cluster test
omersadika May 29, 2026
5a24170
Make off-chain joiner integration work end-to-end (freeze captures jo…
omersadika May 29, 2026
51c35db
Remove the dead V1 HandoffSignature consensus path
omersadika May 29, 2026
4ca60b6
Remove unused off-chain helper/cleanup methods (dead-code audit)
omersadika May 29, 2026
fc9a778
Review fast-follows: bootstrap outcome split + cache_protocol_output doc
omersadika May 29, 2026
a480cf1
Make handoff attestation committee membership deterministic under churn
omersadika May 29, 2026
34f880b
Handoff committee intersection must never withhold the EndOfPublish vote
omersadika May 30, 2026
e857ed5
docs: refresh review verdicts against current tip (34f880b124)
ycscaly May 31, 2026
df27ac1
docs(review): walk Feature 5 — pubkey providers
ycscaly May 31, 2026
e4e87c1
docs(review): walk Feature 6 — sui_syncer off-chain overlay
ycscaly May 31, 2026
feab4e5
docs(review): walk Feature 7 — handoff attestation
ycscaly May 31, 2026
4ff8b5b
docs(review): reframe F5-F7 author notes vs user concerns
ycscaly May 31, 2026
b906e7d
docs(review): walk Feature 8 — EndOfPublishV2
ycscaly May 31, 2026
9efed5b
Make the churn test robust to load: retry validator-mgmt txs + 120s e…
omersadika May 31, 2026
5c049d4
Harden off-chain handoff/reconfig determinism
omersadika May 31, 2026
b7f3760
Joiner fetches its network-key outputs from the verified handoff cert
omersadika May 31, 2026
bc370e8
Instantiate network keys from the local overlay (additive, alongside …
omersadika May 31, 2026
eb3d324
Revert "Instantiate network keys from the local overlay (additive, al…
omersadika May 31, 2026
f29e54b
Epoch-pin the handoff reconfiguration digest to the local-MPC current…
omersadika May 31, 2026
91e4e61
Bound the in-memory MPC blob serve cache (FIFO byte cap)
omersadika May 31, 2026
7f171d1
Retry EndOfPublishV2 until sequenced instead of a one-shot flag
omersadika May 31, 2026
883c609
Anchor every validator on the prior-epoch handoff cert (step 1: cert …
omersadika May 31, 2026
67aa516
Instantiate network keys from cert-verified local outputs (step 2)
omersadika May 31, 2026
344d8a1
Remove the ConsensusNetworkKeyData vote + broadcast (step 3: unificat…
omersadika May 31, 2026
bc3935f
Handle the genesis / initial-DKG case in cert-verified instantiation …
omersadika May 31, 2026
2ff5909
Suppress retry of network-key instantiations that fail to decrypt
omersadika May 31, 2026
5e19242
Rename the off-chain assembly path: class_groups -> mpc_data
omersadika May 31, 2026
3c3d70d
Reword stale consensus-voted comments after vote removal
omersadika May 31, 2026
f1d19e1
Reject the EndOfPublishV2 EOP vote when its bundled handoff sig fails
omersadika May 31, 2026
74acdf4
Fail-closed: halt the node when the bootstrap trust anchor is Rejected
omersadika May 31, 2026
3c79938
Escalate a permanently-wedged off-chain assembly to error! (F6)
omersadika May 31, 2026
740e997
Resolve departed prior-committee signers when verifying handoff certs
omersadika May 31, 2026
fedc0db
Skip pubkey-provider refresh when the chain has advanced past the epoch
omersadika May 31, 2026
7e08a6a
Buffer relayed joiner announcements when the joiner provider lags
omersadika May 31, 2026
a86ffc3
Exit the pubkey updater when its epoch drops; drop base64 dedup
omersadika Jun 1, 2026
91b5892
Don't publish transient incomplete network-key entries on the channel
omersadika Jun 1, 2026
693f2c6
Give the chain-committee channel a crypto-free CommitteeMembership type
omersadika Jun 1, 2026
db792dd
Clarify why joiner bootstrap is one-hop (Sui anchors the prior commit…
omersadika Jun 1, 2026
e7cb4d0
Document the two handoff-cert review findings
omersadika Jun 1, 2026
666b9c2
fix scripts
ycscaly Jun 2, 2026
7c15ecb
Tally the mpc_data freeze from consensus signals, not the local table
omersadika Jun 2, 2026
baec9d6
Write the local-publish ephemeral pubfile into the contracts temp dir
omersadika Jun 2, 2026
6af21a8
Revert the network-key empty-blob channel filter (it wedged epoch adv…
omersadika Jun 2, 2026
02effbd
Key handoff reconfiguration-output digest by the reconfiguration sess…
omersadika Jun 2, 2026
14ba3ea
Give the shared-dWallet test Active-wait the zero-trust 5-min timeout
omersadika Jun 2, 2026
52d797c
Fix two clippy warnings in validator_metadata tests
omersadika Jun 2, 2026
593aef3
Track the notifier gas coin from tx effects to survive fullnode lag
omersadika Jun 2, 2026
c97ea98
Shrink presign pools for the local in-memory swarm
omersadika Jun 3, 2026
81829f2
Chain-read the prior committee for joiner bootstrap
omersadika Jun 3, 2026
34f70b9
Drop the cached notifier gas ref on submission failure
omersadika Jun 3, 2026
7cae3fe
Make the notifier robust to stale-gas rejections
omersadika Jun 3, 2026
4ebc1ff
Give the test-cluster notifier a dedicated funded Sui key
omersadika Jun 3, 2026
338df11
Stop the dWallet MPC service panicking on EpochEnded
omersadika Jun 3, 2026
e5ee86c
Deliver validator mpc_data blobs in-band over consensus
omersadika Jun 4, 2026
8f97183
Pre-derive the joiner's mpc_data blob off the critical path
omersadika Jun 4, 2026
cb29ac3
Right-size the churn test to production-realistic epochs
omersadika Jun 4, 2026
80e2be4
Correct the pinned Sui version in CLAUDE.md
omersadika Jun 4, 2026
f02a295
test(integration): raise dWallet poll timeouts to 600s for slow-netwo…
omersadika Jun 5, 2026
87ee419
refactor(dwallet-mpc): cut per-session/per-message log spam, add sess…
omersadika Jun 6, 2026
f3c2508
fix(reconfiguration): epoch-scale the uncompleted-events re-poll; dia…
omersadika Jun 6, 2026
9f8d3c2
fix(reconfiguration): deliver pre-v4 network-key outputs across the v…
omersadika Jun 8, 2026
caa2cde
fix(sui-executor): gate advance_epoch on session completion to preven…
omersadika Jun 8, 2026
a560181
fix(test): pass the blob arg to new_validator_mpc_data_announcement
omersadika Jun 8, 2026
5b2afbb
feat(reconfiguration): prepare-then-start — block epoch start until f…
omersadika Jun 8, 2026
1e23b4c
test(sdk): raise the default poll timeout to 10m so slow-network poll…
omersadika Jun 9, 2026
cdd1757
test(integration): drop the redundant per-call poll timeout overrides…
omersadika Jun 9, 2026
d828cad
Merge remote-tracking branch 'origin/dev' into feat/off-chain-metadat…
omersadika Jun 9, 2026
13da6be
docs: cross-binary upgrade testing plans + No Time Estimates rule
ycscaly Jun 4, 2026
187dfd3
feat(ika-upgrade-test): scaffold out-of-process cross-binary upgrade …
ycscaly Jun 4, 2026
9b3bac4
feat(ika-upgrade-test): green go/no-go — 4 out-of-process validators …
ycscaly Jun 4, 2026
9fa87fb
feat(ika-upgrade-test): workload DKG completion + cross-binary/worklo…
ycscaly Jun 4, 2026
7931dab
feat(ika-upgrade-test): scenario base_dir + epoch_timeout knobs
ycscaly Jun 4, 2026
97363ce
feat(ika-upgrade-test): GREEN cross-binary rolling upgrade reaches v4
ycscaly Jun 4, 2026
da96c70
feat(ika-upgrade-test): workload DKG submission proven; results doc
ycscaly Jun 4, 2026
52dd70a
feat(ika-upgrade-test): GREEN workload — full DKG -> Presign -> Sign …
ycscaly Jun 4, 2026
a0d347b
test(ika-upgrade-test): graceful SIGTERM swap, real epoch-boundary wa…
ycscaly Jun 7, 2026
d53a9f6
feat(off-chain-metadata): make validator-key boot check shape-toleran…
ycscaly Jun 9, 2026
f858d37
test(ika-upgrade-test): review-driven harness fixes
ycscaly Jun 9, 2026
d14164d
Update Cargo.lock
omersadika Jun 9, 2026
853925e
fix(dwallet-mpc): gate every post-v1.1.8 consensus-output stream (wri…
ycscaly Jun 9, 2026
f721304
Merge remote-tracking branch 'origin/dev' into feat/ika-upgrade-test
ycscaly Jun 9, 2026
9c85ce5
feat(reconfiguration): ground the prepare-then-start barrier in the v…
omersadika Jun 9, 2026
ec016c4
feat(reconfiguration): reliably converge the handoff cert and full mp…
omersadika Jun 9, 2026
7d87f1a
Merge remote-tracking branch 'origin/feat/off-chain-metadata-v2' into…
ycscaly Jun 9, 2026
c71c502
test(ika-upgrade-test): workload — short epoch + wait for epoch 2 bef…
ycscaly Jun 9, 2026
ce81f09
docs(ika-upgrade-test): session handoff for workload reshare-decrypt bug
ycscaly Jun 10, 2026
d1f3cea
fix(dwallet-mpc): don't adopt the next epoch's reconfiguration output…
ycscaly Jun 10, 2026
01600b8
fix(sui-executor): apply stale-gas recovery to RPC-level submission r…
ycscaly Jun 10, 2026
f4b86b6
test(ika-upgrade-test): drive the workload test through the v3->v4 up…
ycscaly Jun 10, 2026
59b0db3
docs(sui-executor): warn against the err.as_ref() clippy trap in stal…
ycscaly Jun 10, 2026
29e8a09
Fix nondeterministic internal-presign session identifiers across vali…
ycscaly Jun 10, 2026
cb32a00
Key the off-chain reconfiguration overlay by target epoch
ycscaly Jun 10, 2026
b5ec04e
Cross-binary test: per-epoch committee churn and MPC timing report
ycscaly Jun 10, 2026
5f5b9df
Genesis-parameterized GlobalPresignConfig: empty at v3 genesis, full …
ycscaly Jun 10, 2026
e57fc5e
Workload driver: split an IKA allowance instead of transferring a who…
ycscaly Jun 10, 2026
743322c
Serve global presigns as MPC sessions before internal_presign_session…
ycscaly Jun 10, 2026
67190ea
test(upgrade): measure the v4-math reshare and settled-v4 lifecycle w…
ycscaly Jun 11, 2026
39b4b46
Merge origin/dev (PRs #1732, #1733 squash-merged back)
ycscaly Jun 11, 2026
6 changes: 5 additions & 1 deletion CLAUDE.md
@@ -6,6 +6,10 @@ Decentralized MPC signing network built on Sui. dWallets provide zero-trust mult

Act as a critical intellectual sparring partner, not a yes-man. Evaluate every idea on its merits—the user is a collaborator who can be wrong, not an authority to defer to. Question assumptions, point out flaws, logical errors, unstated premises, and potential bugs immediately and directly. Be skeptical by default; each claim must prove itself. No opening praise or "you're right" unless genuinely warranted after scrutiny. Prioritize truth over harmony. Be ruthless with constructive criticism.

## No Time Estimates

NEVER estimate time, effort, or duration for any work — no "weeks", "days", "hours", "engineer-quarter", "this is quick", "this will take a while", no calendar/sprint-based sequencing. You have no reliable concept of wall-clock effort, and time estimates produce bad downstream planning decisions. Express plans as **ordering and dependencies only** ("first / then / after X / in parallel", "P0 before P1"), never as durations. Applies to plans, design docs, PR descriptions, and chat. If asked "how long will this take", say you don't estimate time and give the dependency-ordered steps instead.

## Build Commands

```bash
@@ -246,7 +250,7 @@ Other gotchas:

- **Release mode required**: Crypto operations are extremely slow in debug mode
- **Forked from Sui**: Much code structure mirrors Sui Network patterns
- **Sui dependency pinned**: Uses `mainnet-v1.51.5` tag for all Sui dependencies
- **Sui dependency pinned**: Uses `mainnet-v1.70.2` tag for all Sui dependencies
- **WASM excluded**: `sdk/ika-wasm` is excluded from workspace (separate build)
- **Mysticeti consensus**: Uses Sui's Mysticeti for MPC message routing
- **NOA checkpoints not live**: The NOA checkpoint system (`crates/ika-core/src/noa_checkpoints/`) is under active development and not yet deployed. No backward compatibility constraints on serialization formats or type names
37 changes: 37 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -25,6 +25,7 @@ members = [
"crates/ika-network",
"crates/ika-archival",
"crates/ika-test-cluster",
"crates/ika-upgrade-test",
]
[workspace.package]
# This version string will be inherited by ika-core, ika-node, ika-tools, ika-sdk, and ika crates.
@@ -343,6 +344,7 @@ ika-move-contracts = { path = "crates/ika-move-contracts" }
ika-network = { path = "crates/ika-network" }
ika-archival = { path = "crates/ika-archival" }
ika-test-cluster = { path = "crates/ika-test-cluster" }
ika-upgrade-test = { path = "crates/ika-upgrade-test" }

[patch.crates-io]
crypto-bigint = { git = "https://github.com/ycscaly/crypto-bigint.git", rev = "8aabcee553b25f9f619722391de33edbe2f7412c" }
44 changes: 44 additions & 0 deletions PR-1721-action-plan.md
@@ -0,0 +1,44 @@
# PR #1721 — Review Action Plan

Combined from: `docs/off-chain-metadata-v2-review.md` (feature walkthrough),
the GitHub PR #1721 review (`ycscaly` — naming), `PR-1721-review.md` (Cursor),
and `pr_1721_code_review.md`. Decisions agreed with the user.

Branch `feat/off-chain-metadata-v2`. Both Cursor reviews are already
merge-ready/Approve — everything below is polish/follow-up, not a blocker.

---

## ✅ Will do — in order

| # | Item | Why | Notes | Status |
|---|------|-----|-------|--------|
| 1 | **Naming: `class_groups` → `mpc_data` / `ValidatorMpcData`** on the assembly path | The bundle is class-groups **+ per-curve PVSS keys + proofs** since #1707; the name lies. `ValidatorMpcData` is already the convention elsewhere. | Source-only (BCS is positional → no wire-shape impact). Sites: `install_mpc_data_source` (`sui_connector/mod.rs:181`), `OffChainCommitteeClassGroupsSource` trait, assembly-path sites. Follow-up sites (out of diff): `MPCDataV1.class_groups_public_key_and_proof` field + `VersionedMPCData::class_groups_public_key_and_proof()` accessor. **Do NOT** rename `Committee.class_groups_public_keys_and_proofs` (genuinely class-groups, beside `*_pvss_*`). | ✅ `<naming>` |
| 2 | **Fix stale "consensus-voted" comments** in `mpc_manager` / `dwallet_mpc_service` | Comments describe the vote path that was removed in the unification. Misleads the next reader. | Trivial; my own debt. | ✅ `<comments>` |
| 3 | **EOP: reject the EOP vote when the bundled handoff sig *verifiably* fails** | Makes the `EndOfPublishV2` bundle atomic ("observed together" ⇒ "processed together"). Safe now that `AttestationMismatch` ≈ 0. | **Nuance:** only when the sig *verifies-and-fails* (`AttestationMismatch`). While the sig is *buffered* (expected attestation not installed yet, can't verify), still count the vote — else epoch advance stalls. | ✅ `<eop-atomic>` |
| 4 | **Fail-closed bootstrap on `Rejected`** | `Rejected` = every reachable peer served a wrong cert = possible eclipse / wrong prior-committee view. Halt loudly instead of limping. | The unification already half-does this (no cert ⇒ no key ⇒ can't really operate); this adds the explicit halt + actionable alert. | ✅ `<fail-closed>` |
| 5 | **F6: escalate when off-chain assembly never converges** | Exactly the "assembly incomplete" we kept hitting — today it spins forever at `warn!` with no `error!`/metric. | Surface `EverythingExcluded` / permanent-incompleteness as `error!` + metric; keep transient (waiting-for-P2P) as `warn!`. | ✅ `<assembly-escalate>` (metric = follow-up) |
| 6 | **F7: resolve departed prior-committee signers' pubkeys** | Under churn, a *valid* cert is `Rejected` on a joiner because it can't resolve the keys of signers who left after E-1. | Three layers: **(A, primary)** bootstrap chain-reads `validator_set.previous_committee` by object id (StakingPool persists after a validator leaves the active set) and merges it with the current active set into the verify provider — resolves every departed signer whose pool still exists; **(B, slack)** the handoff aggregator now collects *past* quorum (up to full committee), enriching the cert so a signer fully gone (StakingPool deleted) can be dropped while a quorum of the rest verifies; **(skip)** `verify_certified_handoff_attestation` skips an unresolvable signer instead of hard-failing. No P2P sig-sync needed — sigs are consensus-ordered and a joiner verifies any fetched cert independently. | ✅ `<f7-departed-signers>` |
| 7 | **F5: epoch-consistency check in `refresh()`** | 2-line belt-and-suspenders: stops a lagging prev-epoch pubkey updater from installing the *next* committee's keys onto the live store. | `if system_inner.epoch != self.epoch_id { return Ok(()); }`. | ✅ `<f5-epoch-guard>` |
| 8 | **F3-5: receiver-side relay buffer** | Closes the consensus-delivery race the joiner-retry can't: a validator whose `JoinerPubkeyProvider` lagged drops the relayed joiner announcement, and consensus dedup means it never re-sees it. Under load the window widens and a dropped joiner can diverge the next-committee assembly. | Buffer (bounded size + TTL) joiner announcements with a currently-absent/lagging provider; re-evaluate on provider install. Buffer on **no provider** or **`UnregisteredJoiner`**; drop genuinely-bad (`InvalidSignature`/`InconsistentEnvelope`). Can't bound by next-epoch membership (the provider that knows it is what's missing), so bounded by a hard cap + TTL + last-write-wins per joiner. Pure buffer/re-eval helpers unit-tested. | ✅ `<f3-5-relay-buffer>` |
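
Item 8 describes the receiver-side relay buffer only in prose. A minimal Rust sketch of that policy follows, under stated assumptions: `RelayBuffer`, `JoinerId`, `VerifyOutcome`, `offer`, and `drain_for_reevaluation` are hypothetical names chosen for illustration and are not taken from the PR. It buffers only when the provider is missing or reports `UnregisteredJoiner`, drops genuinely bad announcements, and bounds itself by a hard cap, a TTL, and last-write-wins per joiner.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a joiner identity (the real code may key differently).
pub type JoinerId = [u8; 32];

/// Illustrative verification outcomes; the variant names follow the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VerifyOutcome {
    Verified,
    NoProvider,
    UnregisteredJoiner,
    InvalidSignature,
    InconsistentEnvelope,
}

/// Bounded holding area for relayed joiner announcements that cannot be
/// verified yet because the joiner pubkey provider is absent or lagging.
pub struct RelayBuffer {
    cap: usize,
    ttl: Duration,
    entries: HashMap<JoinerId, (Instant, Vec<u8>)>,
}

impl RelayBuffer {
    pub fn new(cap: usize, ttl: Duration) -> Self {
        Self { cap, ttl, entries: HashMap::new() }
    }

    /// Returns `true` if the announcement was buffered for a later retry.
    pub fn offer(
        &mut self,
        joiner: JoinerId,
        announcement: Vec<u8>,
        outcome: VerifyOutcome,
        now: Instant,
    ) -> bool {
        // Only "cannot verify yet" outcomes are worth retrying.
        let retryable = matches!(
            outcome,
            VerifyOutcome::NoProvider | VerifyOutcome::UnregisteredJoiner
        );
        if !retryable {
            return false;
        }
        self.evict_expired(now);
        // Hard cap: refuse new joiners when full, but still allow overwrites.
        if !self.entries.contains_key(&joiner) && self.entries.len() >= self.cap {
            return false;
        }
        // Last-write-wins per joiner.
        self.entries.insert(joiner, (now, announcement));
        true
    }

    /// Called when a provider is installed: hands back every live entry
    /// so the caller can re-run verification on it.
    pub fn drain_for_reevaluation(&mut self, now: Instant) -> Vec<(JoinerId, Vec<u8>)> {
        self.evict_expired(now);
        self.entries.drain().map(|(joiner, (_, bytes))| (joiner, bytes)).collect()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    fn evict_expired(&mut self, now: Instant) {
        let ttl = self.ttl;
        self.entries.retain(|_, (stored_at, _)| now.duration_since(*stored_at) < ttl);
    }
}
```

Passing `now` explicitly keeps the helpers pure and easy to unit-test, in line with the plan's note that the buffer/re-eval helpers are pure. The actual signatures on the branch may differ.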

---
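
The nuance in item 3 of the table above reduces to a three-way decision on the bundled handoff signature. The sketch below is an illustration only: `BundledSigStatus`, `VoteDecision`, and `decide_eop_vote` are hypothetical names, and only the `AttestationMismatch` label comes from the plan.

```rust
/// Hypothetical classification of the handoff signature bundled in an
/// `EndOfPublishV2` message, as seen by the receiving validator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BundledSigStatus {
    /// Checked against the expected attestation and accepted.
    Verified,
    /// The expected attestation is not installed yet, so the signature
    /// cannot be checked and is held for later.
    Buffered,
    /// Checked against the expected attestation and failed.
    AttestationMismatch,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VoteDecision {
    Count,
    Reject,
}

/// Reject the EndOfPublish vote only when the signature verifiably fails.
/// A buffered signature still counts, otherwise epoch advance could stall.
pub fn decide_eop_vote(status: BundledSigStatus) -> VoteDecision {
    match status {
        BundledSigStatus::Verified | BundledSigStatus::Buffered => VoteDecision::Count,
        BundledSigStatus::AttestationMismatch => VoteDecision::Reject,
    }
}
```

Keeping the decision in a pure function makes the "buffered still counts" rule explicit and testable without a consensus harness.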

## ❌ Won't do

| Item | Why |
|------|-----|
| **BLS aggregate handoff cert** (docs F3-4) | Big rewrite of a working, well-tested Ed25519 path for a size/speed win that isn't hurting us. Risk > reward. |
| **F4-1 deadline excludes slow joiners** | By design — the liveness backstop so one dead joiner can't wedge the epoch. Already logged. Correct trade-off. |

---

## ⏭️ Follow-up (after this plan)

| Item | Why deferred |
|------|--------------|
| **F5/F6 nits** — refresh loop spins forever on dropped epoch store; `from_iter` silent overwrite on duplicate `AuthorityName`; base64 dedup cleanup; no RPC backoff; `CommitteeMembership` type for the chain channel; incomplete empty-blob entry publish | Each trivial + low-impact; batch later. |
| **Churn green on CI** | Behaviors verified by 5 targeted tests (incl. `test_user_sessions_across_multiple_epochs`, a multi-reconfig mini-churn under load); CI just captures the full 10-cycle stress run this box can't sustain. |
| **Restart-replay integration test** | Replay re-verify logic is already in + unit-tested; a dedicated integration test is nice-to-have. |
| ~~**F7 deep-history catch-up**~~ — **closed (no code).** Analysis: a multi-epoch cert-chain walker isn't a real path. The prior committee's trust root is Sui (chain `previous_committee` + `committee_store`), not an older handoff cert, so a joiner anchors one hop on the chain-provided recent committee. Documented the why in `verify_joiner_bootstrap_cert`'s one-hop note. The only residual gap (a prior signer whose StakingPool was deleted) is single-hop, handled by the slack + skip layers. |
| **Final review together — part by part** | On the *last* version of the PR, walk the whole thing with the user section by section as a final pass (replaces the F9–F13 solo walkthrough). **Last item.** |
19 changes: 19 additions & 0 deletions crates/ika-core/src/authority.rs
@@ -853,6 +853,14 @@ impl AuthorityState {
self.epoch_store.load()
}

/// Returns the shared `AuthorityPerpetualTables` handle. Used by
/// producer-side broadcasters (e.g. mpc_data announcement) to
/// persist content-addressed blobs so peers can fetch them by
/// digest over the existing `GetMpcDataBlob` RPC.
pub fn perpetual_tables(&self) -> Arc<AuthorityPerpetualTables> {
self.perpetual_tables.clone()
}

// Load the epoch store, should be used in tests only.
pub fn epoch_store_for_testing(&self) -> Guard<Arc<AuthorityPerEpochStore>> {
self.load_epoch_store_one_call_per_task()
@@ -1048,6 +1056,17 @@ impl AuthorityState {
epoch_start_configuration,
cur_epoch_store.get_chain_identifier(),
)?;
// The new epoch store starts with `perpetual_tables_for_handoff`
// empty. Install ours so the per-epoch handoff record path
// persists freshly certified attestations into perpetual
// storage from this epoch onward (mirrors what
// `IkaNode::new` does for the genesis epoch store). Without
// this, every reconfig after the first drops handoff certs
// silently — the cert insert site logs "perpetual tables
// not installed; handoff cert not persisted" and joiners
// never see the cert that authenticated their place in the
// committee.
new_epoch_store.install_perpetual_tables_for_handoff(self.perpetual_tables.clone());
self.epoch_store.store(new_epoch_store.clone());
Ok(new_epoch_store)
}
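
The comment in the hunk above explains that a fresh epoch store has no perpetual-tables handle until one is installed, and that certs are dropped in that state. A minimal, self-contained sketch of that late-binding pattern follows. Everything except the method name `install_perpetual_tables_for_handoff` is an assumption: `PerpetualTables`, `EpochStore`, `record_handoff_cert`, `persisted_epochs`, and the use of `OnceLock` are stand-ins, not the branch's actual types.

```rust
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for `AuthorityPerpetualTables`.
#[derive(Default)]
pub struct PerpetualTables {
    certs: Mutex<Vec<u64>>,
}

impl PerpetualTables {
    /// Epochs whose handoff cert has been persisted.
    pub fn persisted_epochs(&self) -> Vec<u64> {
        self.certs.lock().unwrap().clone()
    }
}

/// Hypothetical stand-in for the per-epoch store: the handle starts empty
/// and must be installed once per epoch store.
#[derive(Default)]
pub struct EpochStore {
    perpetual_tables_for_handoff: OnceLock<Arc<PerpetualTables>>,
}

impl EpochStore {
    pub fn install_perpetual_tables_for_handoff(&self, tables: Arc<PerpetualTables>) {
        // A second install is ignored; the first handle wins.
        let _ = self.perpetual_tables_for_handoff.set(tables);
    }

    /// Returns `false` when no handle is installed, i.e. the cert is dropped.
    pub fn record_handoff_cert(&self, epoch: u64) -> bool {
        match self.perpetual_tables_for_handoff.get() {
            Some(tables) => {
                tables.certs.lock().unwrap().push(epoch);
                true
            }
            None => false,
        }
    }
}

/// Mirrors the fix: every new epoch store gets the shared handle at creation.
pub fn reconfigure(tables: &Arc<PerpetualTables>) -> EpochStore {
    let store = EpochStore::default();
    store.install_perpetual_tables_for_handoff(Arc::clone(tables));
    store
}
```

A store built without the install step silently returns `false` from `record_handoff_cert`, which is the failure mode the hunk's comment describes for every reconfiguration after the first.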