test(validator-client): deflake integration test with frozen clock #24259
Open
spalladino wants to merge 1 commit into
The re-execution deadline in validator.integration.test.ts is an absolute
timestamp (the slot attestation deadline) compared against the date
provider's clock. TestDateProvider advances with real wall-clock, so the
~24s slot-1 re-execution budget started ticking at the beforeEach clock
anchor and was consumed by the two heavy validator-context setups before
block re-execution ran. On a loaded CI machine a block that should
re-execute successfully instead hit the deadline, processed 0 txs, and was
rejected as an empty non-first block ("Cannot add empty block that is not
the first block in the checkpoint") -- the same rejection the mana-limit
test expects only for the overflowing block.
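
The race described above can be sketched as follows. This is a minimal illustration, not the real code: `Clock`, `DriftingClock`, `processTxs`, and `runWithSetupCost` are hypothetical stand-ins, the wall clock is simulated so the example is deterministic, and the setup costs are illustrative values.

```typescript
// Hypothetical stand-ins for illustration; not the real Aztec classes.
interface Clock {
  now(): number; // milliseconds
}

// Anchored at a chosen timestamp, then advances with the "real" clock it is given.
class DriftingClock implements Clock {
  private readonly anchor: number;
  private readonly realNow: () => number;
  private readonly realAnchor: number;

  constructor(anchor: number, realNow: () => number) {
    this.anchor = anchor;
    this.realNow = realNow;
    this.realAnchor = realNow();
  }

  now(): number {
    return this.anchor + (this.realNow() - this.realAnchor);
  }
}

// Processes txs until the absolute deadline has passed; returns how many ran.
function processTxs(txCount: number, deadlineMs: number, clock: Clock): number {
  let processed = 0;
  for (let i = 0; i < txCount; i++) {
    if (clock.now() > deadlineMs) break; // "Stopping tx processing due to timeout"
    processed++;
  }
  return processed;
}

// Simulated wall clock, so the outcome does not depend on the machine.
let wallClockMs = 0;
const SLOT_START_MS = 1000000;
const BUDGET_MS = 24000; // the ~24s budget from the description
const deadlineMs = SLOT_START_MS + BUDGET_MS;

function runWithSetupCost(setupMs: number): number {
  const clock = new DriftingClock(SLOT_START_MS, () => wallClockMs);
  wallClockMs += setupMs; // fixture setup runs after the clock anchor is taken
  return processTxs(3, deadlineMs, clock);
}

const processedOnFastMachine = runWithSetupCost(5000); // setup fits in the budget
const processedOnLoadedMachine = runWithSetupCost(25000); // setup exceeds the budget
```

With the slower setup, the deadline has already passed when processing starts, so no tx is processed even though nothing is wrong with the block itself.
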
Switch the suite to the frozen ManualDateProvider so re-execution never
races real time. The two tests that rely on a retryUntil timing out now
advance the clock past the attestation deadline so the timeout fires
immediately instead of polling the full window.
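
A rough sketch of why advancing the frozen clock makes the timeout fire immediately. It assumes the polling timeout is derived as the time remaining until the deadline on the date provider's clock; that derivation is an assumption, not something stated here. `FrozenClock`, `remainingBudgetMs`, and `pollUntil` are hypothetical names, and the real sleep is replaced by a counter so the example runs instantly.

```typescript
// Hypothetical frozen clock: time only moves when the test says so.
class FrozenClock {
  private current: number;

  constructor(startMs: number) {
    this.current = startMs;
  }

  now(): number {
    return this.current;
  }

  advanceTime(ms: number): void {
    this.current += ms;
  }
}

// Assumption: the polling timeout is whatever is left until the deadline.
function remainingBudgetMs(clock: FrozenClock, deadlineMs: number): number {
  return Math.max(0, deadlineMs - clock.now());
}

// Polls `check` until it succeeds or the timeout elapses.
// `waitedMs` stands in for real sleeping between polls.
function pollUntil(
  check: () => boolean,
  timeoutMs: number,
  intervalMs: number,
): { ok: boolean; waitedMs: number } {
  let waitedMs = 0;
  while (true) {
    if (check()) return { ok: true, waitedMs };
    if (waitedMs >= timeoutMs) return { ok: false, waitedMs };
    waitedMs += intervalMs;
  }
}

const slotStartMs = 1000000;
const attestationDeadlineMs = slotStartMs + 24000;
const frozenClock = new FrozenClock(slotStartMs);
const neverReady = () => false; // the condition these tests expect to time out

// Frozen clock left at slot start: the poll waits out the whole window.
const withoutAdvance = pollUntil(neverReady, remainingBudgetMs(frozenClock, attestationDeadlineMs), 100);

// Clock moved past the deadline first: the same poll gives up at once.
frozenClock.advanceTime(24001);
const withAdvance = pollUntil(neverReady, remainingBudgetMs(frozenClock, attestationDeadlineMs), 100);
```

Both calls end in the same timeout, so the behavior under test is unchanged; only the time spent waiting differs.
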
Motivation
validator.integration.test.ts (suite ValidatorClient Integration) has been flaking on both next and merge-train/spartan-v5: it failed ~7 times, e.g. in "rejects block that would exceed checkpoint mana limit". It is a true flake: a docs-only PR ran the same command 3×, failing once and passing twice.

Root cause: the block re-execution deadline is an absolute timestamp, the slot attestation deadline (target_slot_start + S − 2E ≈ 24s for the test's constants), compared against the date provider's clock in the public processor (now() > deadline → "Stopping tx processing due to timeout"). The suite used TestDateProvider, whose now() advances with real wall-clock from the beforeEach anchor. So the ~24s slot-1 re-execution budget started ticking before the two heavy createValidatorContext setups. On a loaded CI machine, fixture setup (~22s observed) consumed the budget before block re-execution ran; a block that should re-execute successfully instead hit the deadline, processed 0 txs, and was rejected as an empty non-first block ("Cannot add empty block that is not the first block in the checkpoint"), the same rejection the mana-limit test expects only for the overflowing block.

Approach
- Switch the suite from TestDateProvider (drifts with real time) to the frozen ManualDateProvider, whose clock only moves on explicit setTime/advanceTime. Re-execution can no longer race wall-clock, so slow fixture setup never eats the per-slot budget.
- The two tests that rely on a retryUntil timing out (refuses to attest if not all block proposals were processed, refuses to attest with archive mismatch) would otherwise poll the full ~24s window in real time under a frozen clock. They now advance the clock past slot 1's attestation deadline before attestToCheckpointProposal, so the (correct) timeout fires immediately.

No assertion was relaxed; the behavioral checks are unchanged.

Changes
- Use ManualDateProvider in validator.integration.test.ts; advance the frozen clock past the attestation deadline in the two timeout-dependent tests.

Verification
- [...] validateBlockProposal returns false).
- [...] beforeEach (which would blow the old 24s budget) still passes with the frozen clock.
- yarn build, yarn format, yarn lint clean.

Fixes A-1265