refactor(watchdog): simplify to one-shot init/tick and harden for production #16

Merged
GCdePaula merged 2 commits into feature/watch-dog from feature/watch-dog-gabriel on Jun 22, 2026

@GCdePaula (Collaborator) commented Jun 20, 2026

Reshapes the watchdog to a single minimal compare path and closes the issues found across review.

Simplify:

  • One job, one shot. Removed the compare/advance mode split and the daemon loop. init records the canonical bootstrap state once; tick runs exactly one compare cycle and exits 0 (clean/idle) / 1 (transient) / 2 (divergence). Infra schedules re-runs and enforces non-overlap via flock; no in-process loop/lock.
  • Removed the verified-bit / advance-checkpoint provenance machinery: a persisted checkpoint is verified by construction (only a successful compare writes one), so the cheap-skip needs no extra state.
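
To illustrate the one-shot contract above, here is a minimal sketch of how a scheduler-side wrapper might take a non-blocking flock and interpret the three exit codes. The names `classify_exit`, `run_exclusive`, the outcome labels, and the lock path are hypothetical; the real invocation of `tick` and the infra's flock setup are not shown in this PR, so treat this as an assumption-laden example rather than the project's wrapper.

```python
import fcntl
import subprocess
import sys

# Exit-code contract as stated in the description: 0 / 1 / 2.
OUTCOMES = {0: "clean-or-idle", 1: "transient", 2: "divergence"}


def classify_exit(code: int) -> str:
    """Map a tick exit code to an outcome label; other codes are reported as unknown."""
    return OUTCOMES.get(code, "unknown")


def run_exclusive(lock_path: str, argv: list[str]) -> str:
    """Run argv once under an exclusive, non-blocking flock.

    If another run already holds the lock, skip instead of waiting, so two
    compare cycles never overlap. (Hypothetical wrapper, not the PR's code.)
    """
    with open(lock_path, "w") as lock_file:
        try:
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return "skipped-overlap"
        # The lock is released when lock_file is closed at the end of the block.
        return classify_exit(subprocess.run(argv).returncode)
```

A scheduler would typically retry later on `transient`, alert on `divergence`, and do nothing on `clean-or-idle` or `skipped-overlap`; that policy is an assumption here, not something the PR specifies.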

Harden / correctness:

  • Crash-safe keep-1 checkpointing: atomic head.json pointer flip + predecessor GC; only ever writes a fresh checkpoint dir (no destructive in-place rewrite).
  • init records the operator-provided bootstrap CM snapshot as the watchdog starting checkpoint; subsequent tick runs compare sequencer finalized state against CM replay when finalized inclusion advances.
  • Stream bisected eth_getLogs into the CM (no whole-range materialization), order-equivalent to the global sort.
  • L1 RPC URL supplied per tick, never persisted (no provider secret at rest).
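
The keep-1 checkpointing bullet above can be sketched roughly as follows: write a fresh directory, flip `head.json` with an atomic rename, and only then delete the predecessor. The function name `commit_checkpoint`, the `checkpoint` JSON key, the `state.bin` file, and the directory naming are all hypothetical; the actual on-disk layout is not described in this PR.

```python
import json
import os
import shutil
import uuid


def commit_checkpoint(state_dir: str, payload: bytes) -> str:
    """Write payload to a fresh dir, flip head.json atomically, then GC the predecessor."""
    head_path = os.path.join(state_dir, "head.json")
    previous = None
    if os.path.exists(head_path):
        with open(head_path) as f:
            previous = json.load(f)["checkpoint"]

    # 1. Always write into a brand-new directory; the current one is never modified.
    name = "checkpoint-" + uuid.uuid4().hex
    new_dir = os.path.join(state_dir, name)
    os.mkdir(new_dir)
    with open(os.path.join(new_dir, "state.bin"), "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())

    # 2. Flip the pointer: write a temp file, fsync it, then rename over head.json.
    tmp_path = head_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"checkpoint": name}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, head_path)  # atomic on POSIX within one filesystem

    # 3. Only after the flip is the predecessor garbage. A crash before this
    #    point leaves a stray directory, but head.json still names a complete one.
    if previous is not None and previous != name:
        shutil.rmtree(os.path.join(state_dir, previous), ignore_errors=True)
    return name
```

A production version would likely also fsync the containing directory after the rename so the flip itself survives power loss; that step is omitted here for brevity.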

Build / CI:

  • Vendored lua-cURLv3 in-tree for a hermetic build (no build-time download or pin to verify); Docker image builds and smoke-runs require('cartesi') in CI; graceful scheduler Inspect (no guest panic on an unknown query).
  • e2e drives the production binding incl. a real-component divergence test; e2e prerequisites hard-fail instead of skipping vacuously.

Docs updated for init/tick, head.json / config.json / WATCHDOG_STATE_DIR, flock, and the checkpoint-backup story.

…duction

Reshapes the watchdog to a single minimal compare path and closes the issues
found across review.

Simplify:
- One job, one shot. Removed the compare/advance mode split and the daemon loop.
  `init` records the canonical bootstrap state once; `tick` runs exactly one
  compare cycle and exits 0 (clean/idle) / 1 (transient) / 2 (divergence). Infra
  schedules re-runs and enforces non-overlap via flock; no in-process loop/lock.
- Removed the verified-bit / advance-checkpoint provenance machinery: a persisted
  checkpoint is verified by construction (only a successful compare writes one),
  so the cheap-skip needs no extra state.

Harden / correctness:
- Crash-safe keep-1 checkpointing: atomic head.json pointer flip + predecessor
  GC; only ever writes a fresh checkpoint dir (no destructive in-place rewrite).
- Bootstrap is verified at its own block before being trusted; exit 0 means
  verified-or-idle on every path (fails closed to 1 or 2, never a false OK).
- Stream bisected eth_getLogs into the CM (no whole-range materialization),
  order-equivalent to the global sort.
- L1 RPC URL supplied per tick, never persisted (no provider secret at rest).
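
As an illustration of the streamed, bisected eth_getLogs item above, the sketch below splits a block range whenever the provider rejects it and yields logs chunk by chunk. The `fetch` callable, the `RangeTooLarge` exception, and the assumption that logs arrive as dicts with integer `blockNumber` and `logIndex` are hypothetical stand-ins; the watchdog's real RPC client is not shown in this PR.

```python
from typing import Callable, Iterator


class RangeTooLarge(Exception):
    """Raised by the hypothetical fetch callable when a range is rejected."""


def stream_logs(
    fetch: Callable[[int, int], list], start: int, end: int
) -> Iterator[dict]:
    """Yield logs for blocks [start, end] in ascending order, bisecting rejected ranges."""
    if start > end:
        return
    try:
        logs = fetch(start, end)
    except RangeTooLarge:
        if start == end:
            raise  # a single block cannot be split further
        mid = (start + end) // 2
        # The lower half is fully streamed before the upper half, so the
        # concatenation matches a global sort over the whole range.
        yield from stream_logs(fetch, start, mid)
        yield from stream_logs(fetch, mid + 1, end)
        return
    # Only one accepted chunk is held in memory at a time.
    yield from sorted(logs, key=lambda log: (log["blockNumber"], log["logIndex"]))
```

Because the sub-ranges are disjoint and visited in ascending block order, sorting each chunk by (block number, log index) gives the same sequence as materializing the whole range and sorting it once.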

Build / CI:
- Vendored lua-cURLv3 in-tree for a hermetic build (no build-time download or pin
  to verify); Docker image builds and smoke-runs require('cartesi') in CI;
  graceful scheduler Inspect (no guest panic on an unknown query).
- e2e drives the production binding incl. a real-component divergence test;
  e2e prerequisites hard-fail instead of skipping vacuously.

Docs updated for init/tick, head.json / config.json / WATCHDOG_STATE_DIR, flock,
and the checkpoint-backup story.
@GCdePaula GCdePaula requested a review from stephenctw June 20, 2026 21:29
@GCdePaula GCdePaula requested a review from stephenctw June 21, 2026 14:38
@GCdePaula GCdePaula merged commit 4679b45 into feature/watch-dog Jun 22, 2026
8 checks passed
@GCdePaula GCdePaula deleted the feature/watch-dog-gabriel branch June 22, 2026 01:07

@stephenctw (Collaborator) left a comment:

🚀🚀🚀
