docs: refocus AGENTS.md on principles and enforcement gates by thymikee · Pull Request #1097 · callstack/agent-device

thymikee · 2026-07-04T15:34:44Z

Follow-up to a week of agent-driven work (the ADR 0011 arc, the Bluesky dogfood, and ~20 worker PRs): refocus the agent-facing docs on what is expensive to rediscover, and delete what any model inspects in one rg.

AGENTS.md (net ~flat in lines, much denser in value)

Deleted: the Routing and Command Family Lookup prose maps — several entries had already drifted from the code (interaction handlers moved, response construction moved), and post-ADR-0008/0011 every fact in them lives in a parity-tested registry. Replaced with a pointer block naming the four registries as the things to read.

Added — the two sections a fresh model cannot cheaply reconstruct:

Principles: one line per incident-backed lesson from this week (path-boundary erosion, registry-claim ≠ semantic check, delegation ≠ success-path parity, don't measure before the path can fire, typed signals over message sniffing, snapshot-output token budget, warnings compose, unreleased-API deletion windows, gate-chained pushes).
Enforcement gates: the classify-don't-suppress index of all eight self-declaring gates. This repo's most distinctive property finally has a front door; the correct reflex when a gate fails (it located your incomplete change) is stated once.

Revised: module size strategy. The LOC caps stay as tripwires, but the unit is reframed as questions per file (rg → one bounded whole-file read). The rules gain:

1:1 test-topology mirroring, with the integration-aggregation exemption removed. Data: interaction.test.ts is 3,408 lines and the platform index tests are 3,291 and 2,735 lines precisely because tests were exempt, and they are where agents bleed the most tokens.
Sibling fixture modules over repeated inline literals (the contract suite is the model).
Claim collocation (coverage manifests, registry cells, decision-site comments).
Barrels only at package boundaries.
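To make the tripwire and mirroring rules concrete, here is a minimal sketch of how such a check could look. It is not the repo's actual gate: the `checkTopology` helper, the `FileInfo` shape, the cap of 500 lines, the file paths, and the assumption that a test mirrors its module as a same-named `.test.ts` sibling are all hypothetical. Only the `interaction.test.ts` line count comes from the description above.

```typescript
// Hypothetical sketch: names, the cap value, and the sibling naming
// convention are assumptions, not the repository's real enforcement gate.
interface FileInfo {
  path: string;
  lines: number;
}

interface TopologyReport {
  orphanTests: string[]; // test files with no same-named source module
  overCap: string[]; // files whose line count trips the LOC cap
}

function checkTopology(files: FileInfo[], locCap: number): TopologyReport {
  const isTest = (path: string): boolean => path.endsWith(".test.ts");

  // Every non-test file counts as a source module a test may mirror.
  const sources = new Set(
    files.filter((f) => !isTest(f.path)).map((f) => f.path),
  );

  // A test is an orphan when stripping ".test" does not yield a known source.
  const orphanTests = files
    .map((f) => f.path)
    .filter(isTest)
    .filter((path) => !sources.has(path.replace(/\.test\.ts$/, ".ts")));

  // The cap is a tripwire: it reports files, it does not decide the fix.
  const overCap = files.filter((f) => f.lines > locCap).map((f) => f.path);

  return { orphanTests, overCap };
}

// Usage with made-up paths; 3408 is the line count quoted in the description.
const report = checkTopology(
  [
    { path: "src/interaction.ts", lines: 300 },
    { path: "src/interaction.test.ts", lines: 3408 },
    { path: "src/aggregate.test.ts", lines: 120 },
  ],
  500,
);
console.log(report);
```

Under these assumptions the aggregate test is flagged as an orphan (it mirrors no module) and the large interaction test trips the cap, which is the pair of signals the revised rules describe.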

Updated: tsgo typecheck; the dev-loop staleness triple including the adopted-runner trap (shutdown hands off a runner that keeps serving the old Swift binary — a classic false negative when verifying runner changes); Gatekeeper first-node-exec stall; DEVICE_IN_USE signature; the contention-flake verification protocol; two new gate steps on the add-a-flag checklist.

CONTEXT.md

Vocabulary for the ADR 0011 domain (dispatch path, guarantee cell, owned waiver, parity table, coverage manifest, delegation-on-error, ref generation pin) and an architecture paragraph pairing ADR 0011 with ADR 0008.

docs/adr

0011 flipped to Accepted (implemented through Layer 3); new README index with "read this when…" routing and the rule that registries beat ADR prose when they disagree.

…ONTEXT.md vocabulary

AGENTS.md: replace the routing/command-family prose maps (already drifting from the code) with pointers to the self-describing, parity-tested registries; add the two sections agents actually cannot rediscover cheaply: Principles (one line per incident-backed lesson) and Enforcement gates (the classify-don't-suppress index); extend the module-size guidance from raw LOC caps to answer-one-question files, 1:1 test topology mirroring (removing the integration-aggregation exemption that produced 3,400-line test files), sibling fixture modules, claim collocation, and boundary-only barrels; record the dev-loop staleness triple (dist/daemon/adopted-runner), the tsgo typecheck, the Gatekeeper first-node-exec stall, the DEVICE_IN_USE signature, and the contention-flake protocol; append the two gate steps to the new-flag checklist.

CONTEXT.md: vocabulary for the ADR 0011 domain (dispatch path, guarantee cell, owned waiver, parity table, coverage manifest, delegation-on-error, ref generation pin) and an architecture paragraph positioning ADR 0011 as ADR 0008's interaction-semantics counterpart.

docs/adr: flip 0011 to Accepted (implemented through Layer 3) and add a read-this-when index that names the registries as the living source of truth over ADR prose.

github-actions · 2026-07-04T15:35:14Z

Size Report

Metric	Base	Current
JS raw	1.5 MB	1.5 MB
JS gzip	489.8 kB	489.8 kB
npm tarball	588.9 kB	588.9 kB
npm unpacked	2.1 MB	2.1 MB

Startup median (7 runs, lower is better):

Scenario	Base	Current	Diff
CLI --version	27.6 ms	26.3 ms	-1.3 ms
CLI --help	52.3 ms	50.4 ms	-1.9 ms

Top changed chunks: no changes in the largest emitted chunks.

thymikee · 2026-07-04T15:45:40Z

Review status: no actionable blockers found.

I checked the AGENTS.md/CONTEXT.md/ADR edits against the current registry-driven architecture: ADR 0011 is accepted, command/routing guidance now points at the living registries instead of stale prose maps, and the new vocabulary/gate descriptions match the current contract coverage and parity-table setup. Docs-only change; GitHub checks are green (20/20). Added ready-for-human.

thymikee · 2026-07-04T16:04:27Z

Second-pass sequencing note: this docs PR now names versioned refs / ref-generation pins in AGENTS.md and CONTEXT.md, but the implementation PR (#1096, refsGeneration / @ref~s / MCP auto-pinning) is still open and this branch is not stacked on it. If #1097 merges first, main will describe behavior that is not available yet. Please either merge #1096 first or defer those two versioned-ref lines from #1097 until #1096 lands.

Everything else I rechecked looks sound: the referenced ADR 0011 gates/registries exist on main, the tsdown/tsgo toolchain notes match package.json, the PR is clean, and GitHub checks are green.

thymikee · 2026-07-04T16:37:38Z

Sequencing note addressed (fe2289a): the two versioned-ref fragments are removed from this PR — the CONTEXT.md vocabulary entry moved onto #1096's branch so the term lands with the behavior it describes, and the AGENTS.md principle line dropped its parenthetical (the principle stands on its own; the example returns to force once #1096 merges). The two PRs are now order-independent. Good catch — a docs PR describing unmerged behavior is exactly the registries-over-prose failure mode this PR preaches against.

thymikee · 2026-07-04T16:47:13Z

Rechecked fe2289a after the sequencing fix: the versioned-ref / ref-generation-pin fragments are gone from #1097, and the CONTEXT.md term now belongs to #1096. That makes #1097 order-independent again. Checks are green, so I restored ready-for-human.

github-actions · 2026-07-04T17:10:04Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-07-04 17:09 UTC

* feat: versioned snapshot refs with MCP auto-pinning

Refs are positional indexes into the latest stored session tree; #1093's coarse snapshotRefsStale marker warns honestly but cannot say WHICH tree a ref came from. Give the session a monotonically increasing snapshotGeneration, advanced wherever the stored tree is replaced: the setSessionSnapshot choke point and the snapshot/diff command path that bypasses it.

Token economy (non-negotiable): the snapshot tree output is unchanged, with plain e12 refs on every node. Ref-issuing responses (snapshot command, find ref outputs) carry the generation ONCE as the additive refsGeneration field. Ref-consuming commands (press/click/fill/longpress/get/wait) accept both forms: plain @E12 keeps today's behavior including the coarse #1093 warning; pinned @E12~s3 is clean when the generation matches the stored tree, gets a precise warning naming both generations when it does not, and a malformed suffix is INVALID_ARGS with a grammar hint. Warn-only this release; tightening comes later per the compat ladder.

The MCP layer auto-pins at zero token cost: it sees snapshot/find responses before the model does, remembers the last refsGeneration per session name, and rewrites plain @ref tool arguments to the pinned form before forwarding. The model never sees or types suffixes; with no remembered generation, refs pass through unpinned (never guess).

Replay parsing and script writing strip and IGNORE pins: generations are meaningless outside the session that minted them.

Refs #1076

* docs: CONTEXT.md vocabulary for ref generation pins

Moved from #1097 per the review sequencing note: the term lands with the behavior it describes.

* docs: teach the ref pin syntax in CLI help

MCP agents get pins transparently (auto-pinning), but CLI-driving agents only ever met the coarse warning: refsGeneration arrived in snapshot responses with nothing explaining it, making pins an undiscoverable feature on the primary agent surface. One help line in the agent loop guidance closes that; warnings stay short (they fire repeatedly, teaching belongs in once-read surfaces).

* fix: per-ref MCP pin provenance and seeded generations

Review findings on the first cut:

1. The MCP layer kept ONE refsGeneration per session, so after snapshot(s12) -> find(s13) a plain @e37 from the pre-find snapshot got pinned ~s13 and read as current, recreating the find-blessing hole at the pinning layer. Replace it with per-ref provenance: Map<pinScope, Map<refBody, generation>>, scoped by state dir + session name (stateDir is a per-call MCP config field, so one server process can face multiple daemons). Merge-only updates: refs present in a ref-issuing response (snapshot nodes, digest refs, the find ref) move to its generation; absent refs KEEP their older pins, because an old pin on a replaced tree is what makes the daemon warn. Never-issued refs pass through unpinned; an issuing response without refsGeneration clears the scope; memory bounded to the ~1000 most recently issued pins.

2. Generations were per-lifetime counters from 1, so a reopened session's ~s1 collided silently with the previous lifetime's. Seed the first bump at a random 6-digit base (crypto randomInt): cross-lifetime collisions become ~1e-6, which is probabilistic (seeded), not identity-based, and documented on the field. Pin format unchanged; within-lifetime comparisons stay exact.

Tests: the MCP blessing scenario (pre-find ref stays pinned to ITS generation), the daemon half in the provider scenario (find must not bless a pre-find pin), reopen/reseed at unit + handler level, state-dir scope isolation, digest-ref merging; generation fixtures made seed-agnostic (relative bumps, echo the observed seed).

Refs #1076
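The pin grammar and the merge-only provenance rule from the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the code from #1096: `parseRef`, `PinBook`, the hint strings, the scope key format, and the choice to bound memory per scope rather than globally are all hypothetical. Only the `@e12` / `@e12~s3` shapes, the INVALID_ARGS code, the scope-by-state-dir-plus-session idea, and the merge/clear/pass-through rules are taken from the message.

```typescript
// Hypothetical sketch; helper names and hint wording are assumptions.
type ParsedRef =
  | { ok: true; body: string; generation?: number }
  | { ok: false; code: "INVALID_ARGS"; hint: string };

// Parse "@e12" (plain) or "@e12~s3" (pinned to generation 3).
function parseRef(input: string): ParsedRef {
  const match = /^@([A-Za-z]+\d+)(?:~(.*))?$/.exec(input);
  if (!match) {
    return { ok: false, code: "INVALID_ARGS", hint: "expected @<ref> or @<ref>~s<generation>" };
  }
  const body = match[1] as string;
  const suffix = match[2] as string | undefined;
  if (suffix === undefined) return { ok: true, body };
  const pin = /^s(\d+)$/.exec(suffix);
  if (!pin) {
    return { ok: false, code: "INVALID_ARGS", hint: "pin suffix must look like ~s<generation>" };
  }
  return { ok: true, body, generation: Number(pin[1]) };
}

class PinBook {
  // scope -> (ref body -> generation that issued it). Map preserves insertion order.
  private scopes = new Map<string, Map<string, number>>();
  private maxPins: number;

  constructor(maxPins = 1000) {
    this.maxPins = maxPins; // assumption: the bound applies per scope
  }

  // Merge-only update: issued refs move to the new generation, absent refs keep older pins.
  recordIssued(scope: string, refs: string[], refsGeneration: number | undefined): void {
    if (refsGeneration === undefined) {
      this.scopes.delete(scope); // an issuing response without a generation clears the scope
      return;
    }
    const pins = this.scopes.get(scope) ?? new Map<string, number>();
    for (const ref of refs) {
      pins.delete(ref); // re-insert so this ref counts as most recently issued
      pins.set(ref, refsGeneration);
    }
    while (pins.size > this.maxPins) {
      const oldest = pins.keys().next().value;
      if (oldest === undefined) break;
      pins.delete(oldest); // evict the least recently issued pin
    }
    this.scopes.set(scope, pins);
  }

  // Rewrite a plain ref to its pinned form; malformed, already-pinned,
  // and never-issued refs pass through untouched (never guess).
  pin(scope: string, arg: string): string {
    const parsed = parseRef(arg);
    if (!parsed.ok || parsed.generation !== undefined) return arg;
    const generation = this.scopes.get(scope)?.get(parsed.body);
    return generation === undefined ? arg : `@${parsed.body}~s${generation}`;
  }
}

// The blessing scenario: snapshot(s12) issues e37, then find(s13) issues only e5.
const book = new PinBook();
book.recordIssued("state-a/session-1", ["e37", "e40"], 12);
book.recordIssued("state-a/session-1", ["e5"], 13);
const preFindPin = book.pin("state-a/session-1", "@e37");
const findPin = book.pin("state-a/session-1", "@e5");
const neverIssued = book.pin("state-a/session-1", "@e99");
```

In this sketch the pre-find ref stays pinned to generation 12 rather than being blessed by the later find response, so a daemon comparing it against a stored generation of 13 would still have grounds to warn.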

thymikee added the ready-for-human label (Valid work that needs human implementation, judgment, or maintainer merge) Jul 4, 2026

thymikee removed the ready-for-human label (Valid work that needs human implementation, judgment, or maintainer merge) Jul 4, 2026

thymikee mentioned this pull request Jul 4, 2026

test: slow-test ratchet and speed rules from measured experiments #1099

Merged

docs: defer versioned-ref references to the implementing PR

fe2289a

Review sequencing note on #1097: these lines described #1096 behavior not yet on main. They move to #1096's branch so docs land with the implementation and the two PRs merge in any order.

thymikee added a commit that referenced this pull request Jul 4, 2026

docs: CONTEXT.md vocabulary for ref generation pins

e6b9375

Moved from #1097 per the review sequencing note: the term lands with the behavior it describes.

thymikee added the ready-for-human label (Valid work that needs human implementation, judgment, or maintainer merge) Jul 4, 2026

thymikee merged commit cccd34f into main Jul 4, 2026
20 checks passed

thymikee deleted the docs/agents-md-principles branch July 4, 2026 17:09

thymikee added a commit that referenced this pull request Jul 4, 2026

docs: CONTEXT.md vocabulary for ref generation pins

68e6cba

Moved from #1097 per the review sequencing note: the term lands with the behavior it describes.

thymikee merged 2 commits into main from docs/agents-md-principles
