@MartinOndejka
Collaborator

@MartinOndejka MartinOndejka commented Sep 21, 2025

Until we have decentralized sequencing, we need a mechanism for emergency commits in case the sequencer goes offline. This PR implements a simple emergency commit that becomes available when there have been no commits for some period of time.

The original commit rule stays the same.

The new emergency commit rule reuses the original rule, and checks that there has been no commit for max_sequencer_inactivity slots.

The main drawback of this approach is that the outer action state precondition can be set to the last 5 values, therefore the sequencer has to maintain commits in 5 distinct slots in the max_sequencer_inactivity window.

A more detailed explanation is in the spec and the rollup explanation doc.

The following is left to do:

  • tests
  • tooling to execute the emergency commit

@MartinOndejka MartinOndejka self-assigned this Sep 21, 2025
@MartinOndejka MartinOndejka requested a review from L-as as a code owner September 21, 2025 19:15
@MartinOndejka MartinOndejka force-pushed the emergency-commit branch 3 times, most recently from d60163f to 86a0934 Compare September 21, 2025 19:25
@MartinOndejka
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an important emergency commit mechanism to handle sequencer downtime, which is a crucial feature for the network's liveness. The implementation is well-structured, especially the refactoring in rule_commit.ml to accommodate the new emergency logic while reusing existing components. The addition of the count_commits folder and the Verify_emergency_folders wrapper to handle multiple recursive proofs is a good use of the existing patterns. The documentation and spec have also been updated to reflect these changes.

I've found a few minor issues, mostly in the specification document regarding pseudocode correctness and clarity, and a suggestion to improve configuration readability by avoiding a magic number. Overall, this is a solid contribution.

@mitschabaude
Contributor

mitschabaude commented Nov 27, 2025

Expanding on action state limitation

Not really an issue, I just want to expand on this comment on the PR description:

The main drawback of this approach is that the outer action state precondition can be set to the last 5 values, therefore the sequencer has to maintain commits in 5 distinct slots in the max_sequencer_inactivity window.

And in the spec:

However, we can only certify “no commit happened” relative to one of
the last five outer action states. To make this check viable,
the sequencer must maintain rolling commits over at least five slots
within the max_sequencer_inactivity window. With a sufficiently
large window (e.g., on the order of a month), this obligation is
trivial for a healthy sequencer.

Here's my understanding of the issue: The emergency commit rule basically wants to say "this is the action state at the current timestamp", where the timestamp is asserted via the valid-while precondition, and the action state via the action state precondition. However, making that statement is not exactly possible, because the action state can refer to any of the last 5 times it was updated, with no guaranteed relationship to the current timestamp.

So in the worst case, the emergency commit can be exercised more easily than it should be. For example, if fewer than 5 blocks with actions were posted on-chain over about a month, we could claim "no commit for a month" even if that were not true, because we could use the same action state, referring to a commit from a month back, for both the start and the end of the count_commits chain:

  • The empty actions sequence would trivially contain 0 commits
  • The start commit would be further than max_sequencer_inactivity slots back
  • The end (= start) action state would be one of the last 5, so it would be accepted as precondition
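
To make the worst case concrete, here is a toy Python model of the check. It is not the circuit in rule_commit.ml: the names `ActionStateUpdate` and `emergency_commit_allowed`, the 30-slot window, and the simplification that each update is a single slot with a yes/no commit flag are all assumptions made for illustration. Only max_sequencer_inactivity, count_commits, and the last-5 limit come from the PR.

```python
from dataclasses import dataclass

MAX_SEQUENCER_INACTIVITY = 30  # slots; illustrative value, not the real config
ACTION_STATE_HISTORY = 5       # a precondition may match any of the last 5 action states


@dataclass(frozen=True)
class ActionStateUpdate:
    """One change of the outer action state (hypothetical model type)."""
    slot: int          # slot in which the action state was updated
    has_commit: bool   # True if the actions posted in that slot include a commit


def emergency_commit_allowed(updates, start, end, current_slot):
    """Toy version of the emergency rule.

    updates: chronological list of ActionStateUpdate
    start:   index of the update holding the claimed last commit
    end:     index of the action state used as the precondition
    """
    oldest_accepted = max(0, len(updates) - ACTION_STATE_HISTORY)
    if not (oldest_accepted <= end < len(updates)):
        return False  # end must be one of the last 5 action states
    if not (0 <= start <= end) or not updates[start].has_commit:
        return False  # start must be an earlier-or-equal state containing a commit
    # Stand-in for count_commits: no commit strictly after start, up to and including end.
    if any(u.has_commit for u in updates[start + 1:end + 1]):
        return False
    # The claimed last commit must be older than the inactivity window.
    return current_slot - updates[start].slot > MAX_SEQUENCER_INACTIVITY


# Quiet chain: the sequencer did commit recently (slot 31), but only 4 updates
# exist in total, so the state from slot 0 is still an acceptable precondition.
quiet = [ActionStateUpdate(s, True) for s in (0, 10, 20, 31)]

# Healthy chain: 6 commits, so the slot-0 state has dropped out of the last 5.
healthy = [ActionStateUpdate(s, True) for s in (0, 6, 12, 18, 24, 30)]

quiet_exploitable = emergency_commit_allowed(quiet, 0, 0, current_slot=35)
healthy_exploitable = any(
    emergency_commit_allowed(healthy, start, end, current_slot=35)
    for end in range(len(healthy))
    for start in range(end + 1)
)
```

Under these assumptions `quiet_exploitable` is true even though a commit landed at slot 31, while `healthy_exploitable` is false for every choice of start and end, which matches the argument that frequent commits close the loophole.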

What would be bad about this? It would let random people override the sequencer's view of the L2 state. So it's in the interest of a healthy sequencer to prevent that. Fortunately, it's easy for a healthy sequencer to prevent: doing more than 5 commits per month is sufficient. (Alternatively, they could also make sure enough actions other than commits are posted.)

If the sequencer is not healthy, and does not manage to post 5 commits in a month, we're already in the scenario that the emergency commit rule is designed for. So there's no real drawback. It's just that the condition for emergency commits to be possible is actually slightly weaker than "no commit happened in a month" (but strong enough to only apply to the intended scenario).

Comment on lines +142 to +144
In effect, the emergency commit seals the gap with a bounded slot
range, restores liveness, and lets subsequent sequencers resume
committing under the usual rules.
Contributor

Is this accurate? It's not clear to me how liveness would be restored. As far as I can see, there is no mechanism to update the sequencer on the outer account, and a normal commit requires the sequencer's signature, so how would we resume committing if that sequencer stays offline?

@mitschabaude
Contributor

In the absence of coordination, users might only be able to exit at a rate of 1 per 30 days.

Assuming that the only sequencer is gone, we must also assume that there no longer is a central place where attempted transactions are recorded.

So how would a user that wants to exit proceed? They could take the last known ledger before the sequencer stopped operating (I'm assuming they can get this information via the DA layer). They could apply their own exit transaction(s) to that ledger, e.g. deposit into the L2 bridge. They would then need confirmation (signature) from the DA layer for that updated ledger. And finally, they would run the emergency commit rule, which would allow them to get their money out.

If a single user does the above without coordinating to include exit transactions other than their own, that would lock the emergency commit rule for another 30 days (since it results in a new commit).

There's a cartoon version of this where we have (say) 1000 users and they take 1000 months (~80 years) to all exit because they do it one at a time :D
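
The arithmetic behind the cartoon can be spelled out with a small sketch. The function name `worst_case_exit_days` and its parameters are made up for illustration, and it assumes every emergency commit has to wait out a full 30-day inactivity window.

```python
import math

DAYS_PER_YEAR = 365.25


def worst_case_exit_days(users, exits_per_commit=1, window_days=30):
    """Days until every user has exited, if each emergency commit carries
    `exits_per_commit` exits and needs `window_days` of inactivity first."""
    rounds = math.ceil(users / exits_per_commit)
    return rounds * window_days


uncoordinated_days = worst_case_exit_days(1000)                       # one exit per commit
coordinated_days = worst_case_exit_days(1000, exits_per_commit=1000)  # one shared commit
uncoordinated_years = uncoordinated_days / DAYS_PER_YEAR
```

This gives 30000 days (roughly 82 years) for the fully uncoordinated case, versus a single 30-day wait if all exits fit into one coordinated commit.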

Obviously, that's not what would happen: We would likely see coordination, to come up with a new effective sequencer, that allows all users to post their exits and builds a ledger that includes all of them.

However, there is no mechanism that guarantees such a coordinated effort would be the first to exit. It would still race against any rogue or uncoordinated attempts to be the one that exits after 30 days. So there is a real danger that it could take multiple 30-day periods for the mechanism to be effective for most users.

I'm not sure how bad you think this is in practice. I think it could be addressed easily: An emergency commit could toggle the contract to be in "emergency mode". From that point on, you would no longer enforce the 30 day minimum timespan, so after the first emergency commit everyone could exit quickly.
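
A minimal sketch of the "emergency mode" idea, assuming a single boolean flag stored next to the last commit slot. This is a model of the suggestion only, not code from the PR: `RollupState`, `try_emergency_commit`, and the slot values are hypothetical, and the sketch does not address how (or whether) emergency mode would ever be left.

```python
from dataclasses import dataclass


@dataclass
class RollupState:
    """Hypothetical on-chain state; field names are illustrative."""
    last_commit_slot: int
    emergency_mode: bool = False


def try_emergency_commit(state, current_slot, max_sequencer_inactivity):
    """Apply an emergency commit if allowed; return whether it was accepted."""
    inactive_for = current_slot - state.last_commit_slot
    if not state.emergency_mode and inactive_for <= max_sequencer_inactivity:
        return False  # sequencer is still considered live
    state.emergency_mode = True  # the first emergency commit flips the toggle
    state.last_commit_slot = current_slot
    return True


state = RollupState(last_commit_slot=0)
too_early = try_emergency_commit(state, current_slot=10, max_sequencer_inactivity=30)
first_exit = try_emergency_commit(state, current_slot=31, max_sequencer_inactivity=30)
second_exit = try_emergency_commit(state, current_slot=32, max_sequencer_inactivity=30)
```

Here `too_early` is rejected, `first_exit` is accepted once the window has passed, and `second_exit` is accepted one slot later because the toggle is already set. Without the flag, the second attempt would have to wait out another full window.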

@mitschabaude
Contributor

This PR covers sequencer liveness, but what about DA layer liveness?

We didn't audit the DA layer last time, but it seems to us that it's a multisig where signatures from all participants are required to make any L1 commit.

It seems that liveness would be entirely broken if just one of those keys were lost or one of the nodes stopped operating. This looks like an even more problematic failure case than that of a dead sequencer, and it isn't addressed by the emergency commit mechanism.
