Emergency commit #353

MartinOndejka · 2025-09-21T19:15:07Z

Until we have decentralized sequencing, we need some mechanism to perform an emergency commit in case the sequencer goes offline. This PR implements a simple emergency commit that becomes available when there have been no commits for some period of time.

The original commit rule stays the same.

The new emergency commit rule reuses the original rule, and checks that there has been no commit for max_sequencer_inactivity slots.

The main drawback of this approach is that the outer action state precondition can be set to the last 5 values, therefore the sequencer has to maintain commits in 5 distinct slots in the max_sequencer_inactivity window.

A more detailed explanation is in the spec and the rollup explanation doc.
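As a rough illustration of the rule described above, here is a minimal Python sketch. It is not the circuit code from this PR: the `Action` type, the function signatures, and the strict `>` comparison are assumptions made for the example; only the names `count_commits` and `max_sequencer_inactivity` come from the PR itself.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Action:
    """Hypothetical stand-in for an outer-account action."""
    is_commit: bool


def count_commits(actions: Sequence[Action]) -> int:
    """Count the commit actions in a sequence of actions."""
    return sum(1 for action in actions if action.is_commit)


def emergency_commit_allowed(
    last_commit_slot: int,
    current_slot: int,
    actions_since_last_commit: Sequence[Action],
    max_sequencer_inactivity: int,
) -> bool:
    """Toy version of the emergency rule: the inactivity window has
    elapsed (assumed strict comparison) and no commit was seen since."""
    window_elapsed = current_slot - last_commit_slot > max_sequencer_inactivity
    return window_elapsed and count_commits(actions_since_last_commit) == 0
```

In this toy model, two non-commit actions after a commit at slot 0 still allow an emergency commit at slot 101 with a window of 100, while a single commit anywhere in the sequence blocks it.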

The following is left to do:

tests
tooling to execute the emergency commit

MartinOndejka · 2025-10-02T11:04:10Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces an important emergency commit mechanism to handle sequencer downtime, which is a crucial feature for the network's liveness. The implementation is well-structured, especially the refactoring in rule_commit.ml to accommodate the new emergency logic while reusing existing components. The addition of the count_commits folder and the Verify_emergency_folders wrapper to handle multiple recursive proofs is a good use of the existing patterns. The documentation and spec have also been updated to reflect these changes.

I've found a few minor issues, mostly in the specification document regarding pseudocode correctness and clarity, and a suggestion to improve configuration readability by avoiding a magic number. Overall, this is a solid contribution.

src/app/zeko/circuits/design/rollup-centralized-spec.md

src/app/zeko/zeko_circuits_config/zeko_circuits_config.ml

mitschabaude · 2025-11-27T19:27:49Z

Expanding on action state limitation

Not really an issue, I just want to expand on this comment on the PR description:

The main drawback of this approach is that the outer action state precondition can be set to the last 5 values, therefore the sequencer has to maintain commits in 5 distinct slots in the max_sequencer_inactivity window.

And in the spec:

However, we can only certify “no commit happened” relative to one of
the last five outer action states. To make this check viable,
the sequencer must maintain rolling commits over at least five slots
within the max_sequencer_inactivity window. With a sufficiently
large window (e.g., on the order of a month), this obligation is
trivial for a healthy sequencer.

Here's my understanding of the issue: The emergency commit rule basically wants to say "this is the action state at the current timestamp", where the timestamp is asserted via the valid-while precondition, and the action state via the action state precondition. However, making that statement is not exactly possible, because the action state can refer to any of the last 5 times it was updated, with no guaranteed relationship to the current timestamp.

So in the worst case, the emergency commit can be exercised more easily than it should be. For example, if fewer than 5 blocks with actions were posted on-chain over about a month, that would allow us to claim "no commit for a month" even if that is not true, since we could use the same action state, referring to a commit from a month back, for both the start and the end of the count_commits chain:

The empty actions sequence would trivially contain 0 commits
The start commit would be further than max_sequencer_inactivity slots back
The end (= start) action state would be one of the last 5, so it would be accepted as precondition

What would be bad about this? It would let random people override the sequencer's view of the L2 state. So it's in the interest of a healthy sequencer to prevent that. Fortunately, it's easy for a healthy sequencer to prevent: doing more than 5 commits per month is sufficient. (Alternatively, they could also make sure enough actions other than commits are posted.)

If the sequencer is not healthy, and does not manage to post 5 commits in a month, we're already in the scenario that the emergency commit rule is designed for. So there's no real drawback. It's just that the condition for emergency commits to be possible is actually slightly weaker than "no commit happened in a month" (but strong enough to only apply to the intended scenario).
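The scenario above can be reproduced in a small toy model. This is only a sketch under simplifying assumptions: every action is a commit, action states are identified by their index in a list of commit slots, and the function name, parameters, and strict `>` comparison are hypothetical rather than taken from the circuit.

```python
from typing import Sequence


def emergency_accepted(
    commit_slots: Sequence[int],
    start: int,
    end: int,
    current_slot: int,
    max_inactivity: int,
    keep: int = 5,
) -> bool:
    """Toy check of an emergency commit claim.

    commit_slots[i] is the slot of the i-th commit; index i also names the
    action state right after that commit (assumption: all actions are commits).
    """
    n = len(commit_slots)
    if not 0 <= start <= end < n:
        return False
    # The action state precondition accepts any of the last `keep` states.
    end_is_recent = end >= n - keep
    # The count_commits chain from start to end must contain no commits.
    no_commits_between = end - start == 0
    # The start commit must lie further back than the inactivity window.
    start_old_enough = current_slot - commit_slots[start] > max_inactivity
    return end_is_recent and no_commits_between and start_old_enough


# Sparse sequencer: only 3 commits in the window, so the stale state 0 is
# still among the last 5 and the false "no commit" claim is accepted.
sparse = [0, 10, 20]
stale_claim_accepted = emergency_accepted(sparse, 0, 0, current_slot=31, max_inactivity=30)

# Healthy sequencer: 6 commits push state 0 out of the last 5.
healthy = [0, 5, 10, 15, 20, 25]
healthy_claims = [
    emergency_accepted(healthy, i, i, current_slot=31, max_inactivity=30)
    for i in range(len(healthy))
]
```

With the sparse history the stale claim goes through even though commits happened at slots 10 and 20; with the healthy history no choice of state satisfies both the recency and the age requirement.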

mitschabaude · 2025-11-27T19:28:30Z

src/app/zeko/circuits/design/rollup-centralized-explanation.md

+In effect, the emergency commit seals the gap with a bounded slot
+range, restores liveness, and lets subsequent sequencers resume
+committing under the usual rules.


Is this accurate? It's not clear to me how liveness would be restored. It seems to me that there is no mechanism to update the sequencer on the outer account, so how would we resume committing if that sequencer stays offline, given that a normal commit requires the sequencer's signature?

mitschabaude · 2025-11-27T19:29:14Z

In the absence of coordination, users might only be able to exit at a rate of 1 per 30 days.

Assuming that the only sequencer is gone, we must also assume that there no longer is a central place where attempted transactions are recorded.

So how would a user that wants to exit proceed? They could take the last known ledger before the sequencer stopped operating (I'm assuming they can get this information via the DA layer). They could apply their own exit transaction(s) to that ledger, e.g. deposit into the L2 bridge. They would then need confirmation (signature) from the DA layer for that updated ledger. And finally, they would run the emergency commit rule, which would allow them to get their money out.

If a single user does the above, without coordinating to include more exit transactions than their own, then that would lock the emergency commit rule for another 30 days (since it results in a new commit).

There's a cartoon version of this where we have (say) 1000 users and they take 1000 months (~80 years) to all exit because they do it one at a time :D

Obviously, that's not what would happen: We would likely see coordination, to come up with a new effective sequencer, that allows all users to post their exits and builds a ledger that includes all of them.

However, there is no mechanism that guarantees such a coordinated effort would be the first to exit. It would still race with any rogue or uncoordinated attempts to be the one that exits after 30 days. So there is a real danger that it could take multiple 30-day periods for the mechanism to be effective for most users.

I'm not sure how bad you think this is in practice. I think it could be addressed easily: An emergency commit could toggle the contract to be in "emergency mode". From that point on, you would no longer enforce the 30 day minimum timespan, so after the first emergency commit everyone could exit quickly.
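To make the difference concrete, here is a toy simulation of the uncoordinated worst case, with and without the suggested "emergency mode". Everything in it is an assumption for illustration: the `Rollup` class, the `sticky_emergency_mode` flag, one exit per emergency commit, and a `>=` comparison against a 30-day window. None of it is part of the PR.

```python
from dataclasses import dataclass


@dataclass
class Rollup:
    """Hypothetical model of the outer contract's emergency rule."""
    sticky_emergency_mode: bool  # does an emergency commit switch the mode on?
    window_days: int = 30
    last_commit_day: int = 0
    emergency_mode: bool = False

    def try_emergency_commit(self, today: int) -> bool:
        # Outside emergency mode, the inactivity window must have elapsed.
        if not self.emergency_mode and today - self.last_commit_day < self.window_days:
            return False
        self.last_commit_day = today
        if self.sticky_emergency_mode:
            self.emergency_mode = True
        return True


def days_until_all_exit(users: int, sticky_emergency_mode: bool) -> int:
    """Days until every user has exited, one exit per emergency commit."""
    rollup = Rollup(sticky_emergency_mode=sticky_emergency_mode)
    day, remaining = 0, users
    while remaining > 0:
        if rollup.try_emergency_commit(day):
            remaining -= 1
        else:
            day += 1
    return day
```

Under these assumptions, 1000 uncoordinated users need 1000 windows (30000 days) with the current rule, but only the first 30-day wait once an emergency commit toggles the mode.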

mitschabaude · 2025-11-27T19:29:38Z

This PR covers sequencer liveness, but what about DA layer liveness?

We didn't audit the DA layer last time, but it seems to us that it's a multisig where signatures from all participants are required to make any L1 commit.

It seems that liveness would be entirely broken if just one of those keys were lost or one of the nodes stopped operating. This looks like an even more problematic failure case than that of a dead sequencer, and it isn't addressed by the emergency commit mechanism.
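A back-of-the-envelope sketch of why an all-signers-required multisig is fragile, assuming (hypothetically) that each DA node is independently available with the same probability; the function name and the independence assumption are ours, and the all-of-n requirement is only what the DA layer appears to do, as noted above.

```python
def all_signers_live_probability(per_node_availability: float, signers: int) -> float:
    """Probability that every one of `signers` independent nodes is available,
    which is what an n-of-n multisig needs in order to produce a signature."""
    if not 0.0 <= per_node_availability <= 1.0:
        raise ValueError("availability must be a probability")
    if signers < 1:
        raise ValueError("need at least one signer")
    return per_node_availability ** signers
```

The result can never exceed the availability of a single node and shrinks with every signer added, so each extra required signer makes a stalled L1 commit more likely.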

MartinOndejka added 5 commits September 21, 2025 20:33

Implement circuit for emergency commit

5a380e7

Fix sequencer

d445456

Add emergency commit spec

dd4751a

fmt

fc1387c

Add emergency commit explanation

174c162

MartinOndejka self-assigned this Sep 21, 2025

MartinOndejka requested a review from L-as as a code owner September 21, 2025 19:15

MartinOndejka force-pushed the emergency-commit branch 3 times, most recently from d60163f to 86a0934 Compare September 21, 2025 19:25

Simplify emergency commit spec

548fb07

MartinOndejka force-pushed the emergency-commit branch from 86a0934 to 548fb07 Compare September 21, 2025 19:26

gemini-code-assist bot reviewed Oct 2, 2025

src/app/zeko/circuits/design/rollup-centralized-spec.md

src/app/zeko/circuits/design/rollup-centralized-spec.md

src/app/zeko/zeko_circuits_config/zeko_circuits_config.ml

mitschabaude reviewed Nov 27, 2025


3 participants
