fix(sequencer): prune in failed-sync fallback so the chain can recover #24253

Open
spalladino wants to merge 1 commit into merge-train/spartan-v5 from spl/a-1260-prune-when-cannot-build

@spalladino

Motivation

When a node fails checkSync during its slot as proposer, it cannot build a checkpoint, but it already casts governance/slashing votes so that voting keeps working even if the chain is damaged. It did not, however, call prune(). If bad data wedges the pending chain so that proposers cannot sync, and the proof submission window then expires, nobody prunes and the chain stays stuck. This change makes the proposer also prune when it cannot propose, so the network can recover from data that is blocking sync.

Approach

L1 prune() is permissionless and idempotent (reverts Rollup__NothingToPrune when not prunable, emits PrunedPending), so the proposer can call it even with a wedged local view.
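As a rough illustration of why a redundant call is harmless, here is a minimal in-memory sketch of the semantics described above. `RollupModel`, `pruneIgnoringNothingToPrune`, the tip and proof-window fields, and the exact prunability condition are assumptions made for illustration; they are not the real contract or client code. The model takes a timestamp argument only as a stand-in for block.timestamp.

```typescript
// Simplified in-memory model of the prune semantics (illustrative only).
class RollupModel {
  readonly events: string[] = [];
  pendingTip: number;
  provenTip: number;
  proofWindowEnd: number;

  constructor(pendingTip: number, provenTip: number, proofWindowEnd: number) {
    this.pendingTip = pendingTip;
    this.provenTip = provenTip;
    this.proofWindowEnd = proofWindowEnd;
  }

  // Assumed condition: there is unproven pending data and the proof window has expired.
  canPruneAtTime(timestamp: number): boolean {
    return this.pendingTip > this.provenTip && timestamp > this.proofWindowEnd;
  }

  // Anyone may call this; it reverts when there is nothing to prune.
  prune(timestamp: number): void {
    if (!this.canPruneAtTime(timestamp)) {
      throw new Error('Rollup__NothingToPrune');
    }
    this.pendingTip = this.provenTip; // wind the pending chain back to proven
    this.events.push('PrunedPending');
  }
}

// Hypothetical helper: treats the "nothing to prune" revert as a no-op.
// Returns true only when this call actually pruned.
function pruneIgnoringNothingToPrune(rollup: RollupModel, timestamp: number): boolean {
  try {
    rollup.prune(timestamp);
    return true;
  } catch (err) {
    if (err instanceof Error && err.message === 'Rollup__NothingToPrune') {
      return false;
    }
    throw err; // any other failure is unexpected and propagates
  }
}
```

In this model, a second caller racing the first simply hits the revert and changes nothing, which is the property the HA dedup relies on.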

  • Add a prune action to the sequencer publisher and an enqueuePruneIfPrunable(slot) method that checks canPruneAtTime — evaluated at the same L1 timestamp the bundle simulator overrides block.timestamp with — and enqueues a prune request keyed on the PrunedPending event. Fails closed on an RPC error.
  • In the failed-sync fallback (tryVoteWhenCannotBuild, renamed to tryVoteAndPruneWhenCannotBuild), enqueue the prune alongside the existing votes and submit them in the same multicall at the target slot. The early-return is relaxed so a send still fires when only a prune (and no votes) was enqueued.
  • The prune is proposer-gated (reuses the existing checkCanPropose gate), batched at the target slot, and relies on permissionless idempotency for HA dedup — no new duty type. The prune action is ordered before propose in the publisher action list.
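The control flow in the bullets above can be sketched roughly as follows. This is a simplified, synchronous model and not the actual sequencer-client code: `FallbackPublisher`, `FallbackDeps`, `timestampOfSlot`, `enqueueVotes`, and the `lastSlotForPrune` field are hypothetical stand-ins, the real calls are presumably asynchronous, and the proposer gate (checkCanPropose) is omitted.

```typescript
// Hypothetical, simplified types; they do not mirror the real signatures.
type Action = 'governance-vote' | 'slashing-vote' | 'prune';

interface FallbackDeps {
  // Stand-in for the L1 read; may throw to simulate an RPC error.
  canPruneAtTime(timestamp: number): boolean;
  // Stand-in for the existing vote logic; returns the votes to enqueue.
  enqueueVotes(slot: number): Action[];
  // L1 timestamp of the target slot, i.e. the value the simulator would
  // use as block.timestamp.
  timestampOfSlot(slot: number): number;
}

class FallbackPublisher {
  readonly queue: Action[] = [];
  readonly sent: Action[][] = [];
  private readonly deps: FallbackDeps;
  private lastSlotForPrune: number | undefined;

  constructor(deps: FallbackDeps) {
    this.deps = deps;
  }

  // Enqueues a prune only when the rollup is prunable at the target slot.
  enqueuePruneIfPrunable(slot: number): boolean {
    if (this.lastSlotForPrune === slot) {
      return false; // already enqueued for this slot
    }
    let prunable: boolean;
    try {
      prunable = this.deps.canPruneAtTime(this.deps.timestampOfSlot(slot));
    } catch {
      return false; // fail closed: on an RPC error, do not enqueue
    }
    if (!prunable) {
      return false;
    }
    this.lastSlotForPrune = slot;
    this.queue.push('prune');
    return true;
  }

  // Failed-sync fallback: votes and prune go out together in one batch.
  tryVoteAndPruneWhenCannotBuild(slot: number): boolean {
    const votes = this.deps.enqueueVotes(slot);
    this.queue.push(...votes);
    const pruneEnqueued = this.enqueuePruneIfPrunable(slot);
    // Relaxed early return: send when anything was enqueued, including a lone prune.
    if (votes.length === 0 && !pruneEnqueued) {
      return false;
    }
    this.sent.push(this.queue.splice(0)); // models a single multicall
    return true;
  }
}
```

The point of the sketch is the last condition: a batch with only a prune still gets sent, while an RPC failure on the prunability check leaves the votes unaffected.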

Changes

  • sequencer-client: new SequencerPublisher.enqueuePruneIfPrunable; 'prune' action added before 'propose'; failed-sync fallback renamed to tryVoteAndPruneWhenCannotBuild and now enqueues a prune; shared dedup field renamed lastSlotForFallbackAction.
  • sequencer-client (tests): publisher and sequencer unit tests for the prune fallback (prunable/not-prunable/duplicate/RPC-fail, prune-only send, prune alongside votes).
  • end-to-end (tests): new e2e_epochs/epochs_prune_when_cannot_build that pauses sync so the proposer cannot build, expires the proof window, and asserts the pending tip winds back to proven via the fallback path; minor import normalization in two sibling epochs tests.

Fixes A-1260

When a proposer fails checkSync it already casts governance/slashing
votes; now it also calls the permissionless prune() when the rollup is
prunable, so a pending chain wedged by bad data can be wound back to
the last proven checkpoint and the network can recover.

Fixes A-1260