Runner resilience under agent-abuse command patterns

Found by the first headless-Haiku benchmark run (#1101): a small-model agent hammering the CLI in tight act/observe loops (38 turns in ~6.7 min, including malformed and rapid-fire commands) wedged the iOS runner hard enough that snapshot requests kept timing out, even across a subsequent `open --relaunch`. The daemon correctly survived (preserve-on-snapshot-timeout policy from #1075), but the runner never self-recovered; it needed a process kill.

This is a new workload class: adversarial-by-incompetence. Strong models pace themselves; small models retry fast, interleave commands mid-flight, and abandon sessions mid-request. Best-in-class agent reliability means the runner recovers from this without human intervention.

Investigation starters:
- Reproduce with a scripted rapid-fire pattern (the benchmark harness in #1101 can be adapted); capture the runner state when wedged (last diag phases, whether the AX-unavailable invalidation fired and got stuck re-acquiring).
- The `--relaunch` path should arguably detect a runner whose last N commands all timed out and recycle it proactively (kill + fresh start) instead of reusing the session. The keep-hot machinery has the health signals (readiness preflight, invalidate/restart-and-replay), but apparently a state exists where none of them trigger recycling.
- Consider a daemon-side circuit breaker: M consecutive runner command timeouts → forced runner restart on next command, with a diagnostic explaining why.
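
To make the circuit-breaker idea concrete, here is a minimal sketch of the counting logic. It is not taken from the daemon's code: `RunnerCircuitBreaker`, its method names, and the default threshold of 3 are all hypothetical, and the sketch is in Python purely for illustration, whatever language the daemon uses. It only shows the state machine: count consecutive timeouts, trip at M, carry a diagnostic, and reset on a success or a restart.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunnerCircuitBreaker:
    """Tracks consecutive runner command timeouts (hypothetical helper)."""

    threshold: int = 3  # "M" from the proposal; the value 3 is an assumption
    consecutive_timeouts: int = 0
    trip_reason: Optional[str] = None

    def record_success(self) -> None:
        # A successful command shows the runner is responsive, so clear state.
        self.consecutive_timeouts = 0
        self.trip_reason = None

    def record_timeout(self, command: str) -> None:
        # Count the timeout and trip once the threshold is reached.
        self.consecutive_timeouts += 1
        if self.consecutive_timeouts >= self.threshold:
            self.trip_reason = (
                f"{self.consecutive_timeouts} consecutive runner command "
                f"timeouts (last: {command}); forcing runner restart"
            )

    def should_restart(self) -> bool:
        # Checked before dispatching the next command to the runner.
        return self.trip_reason is not None

    def acknowledge_restart(self) -> str:
        # Called after the runner was recycled; returns the diagnostic
        # explaining why, then resets the breaker.
        reason = self.trip_reason or ""
        self.consecutive_timeouts = 0
        self.trip_reason = None
        return reason


if __name__ == "__main__":
    breaker = RunnerCircuitBreaker(threshold=3)
    for cmd in ["snapshot", "snapshot", "snapshot"]:
        breaker.record_timeout(cmd)
    if breaker.should_restart():
        # A real daemon would kill and restart the runner here.
        print(breaker.acknowledge_restart())
```

The same counter could back the `--relaunch` check in the previous bullet: if the breaker is tripped when a relaunch is requested, recycle the runner instead of reusing the session. Whether a success should fully reset the count, or only decay it, is an open design choice.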

Runner resilience under agent-abuse command patterns #1105

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Runner resilience under agent-abuse command patterns #1105

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions