Skip to content

fix(cli-command): exit non-zero on any signal-aborted run (PER-8678)#2262

Open
pranavz28 wants to merge 2 commits into
masterfrom
fix/PER-8678-signal-exit-code
Open

fix(cli-command): exit non-zero on any signal-aborted run (PER-8678)#2262
pranavz28 wants to merge 2 commits into
masterfrom
fix/PER-8678-signal-exit-code

Conversation

@pranavz28
Copy link
Copy Markdown
Contributor

Fixes PER-8678 / closes #2255

Problem

percy exec -- <cmd> could exit 0 during startup before launching the wrapped command, producing a false-positive CI success. Reported symptom: GitHub Actions reports a passing E2E run while Cypress never starts — last log is Downloading Chromium 1300309..., then minutes of [percy:monitoring] metrics, then exit 0.

Root cause

packages/cli-command/src/command.js

The per-signal handler is registered for SIGUSR1, SIGUSR2, SIGTERM, SIGINT, SIGHUP and aborts with new AbortError(signal, { signal, exitCode: 0 }). But beginShutdown only records SIGINT/SIGTERM into shutdownState.signal. The exit-code gate in the runner's catch was keyed on shutdownState.signal:

if (shutdownState.signal && err.signal && definition.exitOnError) { ... 130/143 ... }

So a SIGHUP/SIGUSR1/SIGUSR2 abort during startup:

  • skips that gate (shutdownState.signal is null), and
  • skips the if (err.exitCode !== 0) gate (the AbortError carries exitCode: 0),

…so the runner returns normally and the process exits 0 — even though percy.yield.start() was interrupted and the wrapped command was never spawned. In CI a stalled startup is typically torn down by SIGHUP; the sibling [webpack-dev-server] Gracefully shutting down line in the report corroborates an OS signal hitting the process group. PR #2199 fixed SIGINT/SIGTERM but left every other signal on the exit-0 path.

Fix

  • Extend SIGNAL_EXIT_CODES with SIGHUP→129, SIGUSR1→138, SIGUSR2→140 (POSIX 128+signum).
  • Key the exit-code gate off err.signal (the signal that actually aborted the run) instead of shutdownState.signal, with a non-zero ?? 1 fallback for any unmapped signal.

Any signal-aborted run now surfaces a non-zero exit code in production, so a startup interrupted before the wrapped command launches can never report CI success. Preserved: SIGINT→130 / SIGTERM→143, PERCY_EXIT_WITH_ZERO_ON_ERROR=true override, and exitOnError:false clean-unwind for programmatic/test usage.

Testing

  • Added a regression spec in packages/cli-command/test/shutdown.test.js: emits SIGHUP mid-run, asserts process.exitCode === 129 (was undefined/0).
  • yarn workspace @percy/cli-command test65/65 specs pass.

🤖 Generated with Claude Code

@pranavz28 pranavz28 requested a review from a team as a code owner June 2, 2026 11:29
@pranavz28 pranavz28 requested review from Shivanshu-07 and prklm10 June 2, 2026 11:29
A run aborted during startup by a non-SIGINT/SIGTERM signal
(SIGHUP/SIGUSR1/SIGUSR2) exited 0, producing a false-positive CI
success when the wrapped command never launched.

beginShutdown only records SIGINT/SIGTERM into shutdownState, so the
exit-code gate keyed on shutdownState.signal never fired for the other
registered signals; their AbortError carries exitCode:0, so the
err.exitCode!==0 gate was also skipped and the process exited 0.

Key the gate off err.signal (the signal that actually aborted the run)
and map SIGHUP/SIGUSR1/SIGUSR2 to their POSIX 128+signum codes, with a
non-zero fallback for any other signal. SIGINT(130)/SIGTERM(143),
PERCY_EXIT_WITH_ZERO_ON_ERROR, and exitOnError:false clean-unwind
behavior are all preserved. Adds a SIGHUP regression spec.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pranavz28 pranavz28 force-pushed the fix/PER-8678-signal-exit-code branch from e3b6a20 to 19c0284 Compare June 2, 2026 11:40
The `?? 1` fallback in the signal exit-code gate is unreachable in
practice — every signal in the registered handler set is mapped in
SIGNAL_EXIT_CODES — but kept as a non-zero safety net. Mark it ignored
to satisfy the package's 100% branch-coverage threshold.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@pranavz28
Copy link
Copy Markdown
Contributor Author

Code Review — PER-8678

Reviewed the diff (command.js +34/−11, shutdown.test.js +17). Traced the abort→catch path and the percy exec child-process flow to check for regressions.

Summary

Fixes a false-positive CI success: percy exec could exit 0 when aborted by SIGHUP/SIGUSR1/SIGUSR2 during startup. Adds exit-code mappings for those signals and re-keys the exit-code gate off err.signal (the signal that actually aborted) instead of shutdownState.signal (which only ever records SIGINT/SIGTERM).

Review Table

Priority Category Check Status Notes
High Security Secrets / authz / input / IDOR / SQLi N/A No security surface touched
High Correctness Logic correct, edge cases ✅ Pass Bug analysis accurate; confirmed SIGHUP/USR1/USR2 previously hit the exitCode:0 AbortError path
High Correctness Explicit error handling ✅ Pass ?? 1 fallback keeps unmapped signals non-zero; PERCY_EXIT_WITH_ZERO_ON_ERROR override preserved
High Correctness No race/concurrency issues ✅ Pass AbortController.abort idempotent (first reason wins); err.signal reflects the aborting signal
Medium Testing New code has tests ✅ Pass SIGHUP→129 regression spec added
Medium Testing Error/edge paths tested ⚠️ Partial Only SIGHUP tested; USR1/USR2 + ?? 1 fallback untested (fallback is istanbul ignored). Acceptable — same code path
Medium Testing Existing tests pass ✅ Pass SIGINT→130 / SIGTERM→143 unaffected
Medium Quality Follows patterns / focused ✅ Pass Mirrors existing signal-test pattern + process.exitCode-then-return convention; tightly scoped
Low Quality Names / comments / deps ✅ Pass Comments explain why; no new deps

Notes (no blockers)

A — Linux-specific signal numbers (cosmetic). SIGUSR1: 138 / SIGUSR2: 140 use Linux signal numbers (10, 12); on macOS these are 30/31 → 158/159. Correct for Linux CI (the target environment), and SIGHUP→129 + SIGINT/SIGTERM are universal. A one-line "Linux-conventional" comment would help, but not blocking.

B — Broadened gate now intercepts any .signal-carrying error (verified-safe). Changing if (shutdownState.signal && err.signal ...) to if (err.signal && ...) routes any error with a .signal property through the signal-exit branch. Traced cli-exec/src/exec.js: a normally-failing child exits via exit(status, error, false) where error.signal is undefined, so it still falls through to the regular exit-code path. The dominant .signal carrier is the parent's own AbortError — exactly what we want to catch. Every plausible case still yields non-zero, matching the PR's goal. No regression; flagging so it doesn't have to be re-derived.

Verdict

Approve. Correct root-cause analysis, minimal well-targeted fix, comment-documented rationale, and a real regression test. The two notes are cosmetic/informational.

🤖 Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

percy exec exits 0 during startup before launching wrapped command, causing false-positive CI success`

1 participant