docs: refocus AGENTS.md on principles and enforcement gates#1097

Merged
thymikee merged 2 commits into
mainfrom
docs/agents-md-principles
Jul 4, 2026
@thymikee thymikee commented Jul 4, 2026

Member

Follow-up to a week of agent-driven work (the ADR 0011 arc, the Bluesky dogfood, and ~20 worker PRs): refocus the agent-facing docs on what is expensive to rediscover, and delete what any model inspects in one rg.

AGENTS.md (net ~flat in lines, much denser in value)

Deleted: the Routing and Command Family Lookup prose maps — several entries had already drifted from the code (interaction handlers moved, response construction moved), and post-ADR-0008/0011 every fact in them lives in a parity-tested registry. Replaced with a pointer block naming the four registries as the things to read.

Added — the two sections a fresh model cannot cheaply reconstruct:

  • Principles: one line per incident-backed lesson from this week (path-boundary erosion, registry-claim ≠ semantic check, delegation ≠ success-path parity, don't measure before the path can fire, typed signals over message sniffing, snapshot-output token budget, warnings compose, unreleased-API deletion windows, gate-chained pushes).
  • Enforcement gates: the classify-don't-suppress index of all eight self-declaring gates. This repo's most distinctive property finally has a front door; the correct reflex when a gate fails (it located your incomplete change) is stated once.
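
The "typed signals over message sniffing" principle can be illustrated with a small sketch. The `CodedError` class and both helper functions are hypothetical and are not taken from this repository; only the `DEVICE_IN_USE` and `INVALID_ARGS` code names appear in this PR.

```typescript
// Hypothetical error shape for illustration; the project's real types may differ.
type ErrorCode = "DEVICE_IN_USE" | "INVALID_ARGS" | "UNKNOWN";

class CodedError extends Error {
  readonly code: ErrorCode;

  constructor(code: ErrorCode, message: string) {
    super(message);
    this.name = "CodedError";
    this.code = code;
  }
}

// Fragile: depends on the human-readable wording, which can change at any time.
function isDeviceBusyBySniffing(err: unknown): boolean {
  return err instanceof Error && /in use/i.test(err.message);
}

// Robust: depends only on the typed signal carried by the error.
function isDeviceBusy(err: unknown): boolean {
  return err instanceof CodedError && err.code === "DEVICE_IN_USE";
}

// The same condition, reported with reworded text.
const reworded = new CodedError(
  "DEVICE_IN_USE",
  "Simulator is busy with another session",
);

console.log(isDeviceBusyBySniffing(reworded)); // false: the wording no longer matches
console.log(isDeviceBusy(reworded)); // true: the code is unchanged
```

The sniffing check silently stops working as soon as the message is reworded, while the typed check keeps working because the code is part of the contract.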

Revised: module size strategy. The LOC caps stay as tripwires, but the unit is reframed as questions per file (rg → one bounded whole-file read), and the rules gain:

  • 1:1 test-topology mirroring, with the integration-aggregation exemption removed. Data: interaction.test.ts is 3,408 lines and the platform index tests are 3,291 and 2,735 lines precisely because tests were exempt, and they are where agents bleed the most tokens.
  • Sibling fixture modules over repeated inline literals (the contract suite is the model).
  • Claim collocation (coverage manifests, registry cells, decision-site comments).
  • Barrels only at package boundaries.

Updated: tsgo typecheck; the dev-loop staleness triple including the adopted-runner trap (shutdown hands off a runner that keeps serving the old Swift binary — a classic false negative when verifying runner changes); Gatekeeper first-node-exec stall; DEVICE_IN_USE signature; the contention-flake verification protocol; two new gate steps on the add-a-flag checklist.

CONTEXT.md

Vocabulary for the ADR 0011 domain (dispatch path, guarantee cell, owned waiver, parity table, coverage manifest, delegation-on-error, ref generation pin) and an architecture paragraph pairing ADR 0011 with ADR 0008.

docs/adr

0011 flipped to Accepted (implemented through Layer 3); new README index with "read this when…" routing and the rule that registries beat ADR prose when they disagree.

…ONTEXT.md vocabulary

AGENTS.md: replace the routing/command-family prose maps (already
drifting from the code) with pointers to the self-describing,
parity-tested registries; add the two sections agents actually cannot
rediscover cheaply — Principles (one line per incident-backed lesson)
and Enforcement gates (the classify-don't-suppress index); extend the
module-size guidance from raw LOC caps to answer-one-question files,
1:1 test topology mirroring (removing the integration-aggregation
exemption that produced 3,400-line test files), sibling fixture
modules, claim collocation, and boundary-only barrels; record the
dev-loop staleness triple (dist/daemon/adopted-runner), the tsgo
typecheck, the Gatekeeper first-node-exec stall, the DEVICE_IN_USE
signature, and the contention-flake protocol; append the two gate
steps to the new-flag checklist.

CONTEXT.md: vocabulary for the ADR 0011 domain (dispatch path,
guarantee cell, owned waiver, parity table, coverage manifest,
delegation-on-error, ref generation pin) and an architecture paragraph
positioning ADR 0011 as ADR 0008's interaction-semantics counterpart.

docs/adr: flip 0011 to Accepted (implemented through Layer 3) and add
a read-this-when index that names the registries as the living source
of truth over ADR prose.

github-actions Bot commented Jul 4, 2026


Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.5 MB | 1.5 MB | 0 B |
| JS gzip | 489.8 kB | 489.8 kB | 0 B |
| npm tarball | 588.9 kB | 588.9 kB | 0 B |
| npm unpacked | 2.1 MB | 2.1 MB | 0 B |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 27.6 ms | 26.3 ms | -1.3 ms |
| CLI --help | 52.3 ms | 50.4 ms | -1.9 ms |

Top changed chunks: no changes in the largest emitted chunks.

@thymikee thymikee added the ready-for-human Valid work that needs human implementation, judgment, or maintainer merge label Jul 4, 2026

thymikee commented Jul 4, 2026

Member Author

Review status: no actionable blockers found.

I checked the AGENTS.md/CONTEXT.md/ADR edits against the current registry-driven architecture: ADR 0011 is accepted, command/routing guidance now points at the living registries instead of stale prose maps, and the new vocabulary/gate descriptions match the current contract coverage and parity-table setup. Docs-only change; GitHub checks are green (20/20). Added ready-for-human.

thymikee commented Jul 4, 2026

Member Author

Second-pass sequencing note: this docs PR now names versioned refs / ref-generation pins in AGENTS.md and CONTEXT.md, but the implementation PR (#1096, refsGeneration / @ref~s / MCP auto-pinning) is still open and this branch is not stacked on it. If #1097 merges first, main will describe behavior that is not available yet. Please either merge #1096 first or defer those two versioned-ref lines from #1097 until #1096 lands.

Everything else I rechecked looks sound: the referenced ADR 0011 gates/registries exist on main, the tsdown/tsgo toolchain notes match package.json, the PR is clean, and GitHub checks are green.

@thymikee thymikee removed the ready-for-human Valid work that needs human implementation, judgment, or maintainer merge label Jul 4, 2026
Review sequencing note on #1097: these lines described #1096 behavior
not yet on main. They move to #1096's branch so docs land with the
implementation and the two PRs merge in any order.

thymikee commented Jul 4, 2026

Member Author

Sequencing note addressed (fe2289a): the two versioned-ref fragments are removed from this PR — the CONTEXT.md vocabulary entry moved onto #1096's branch so the term lands with the behavior it describes, and the AGENTS.md principle line dropped its parenthetical (the principle stands on its own; the example returns to force once #1096 merges). The two PRs are now order-independent. Good catch — a docs PR describing unmerged behavior is exactly the registries-over-prose failure mode this PR preaches against.

thymikee added a commit that referenced this pull request Jul 4, 2026
Moved from #1097 per the review sequencing note: the term lands with
the behavior it describes.

thymikee commented Jul 4, 2026

Member Author

Rechecked fe2289a after the sequencing fix: the versioned-ref / ref-generation-pin fragments are gone from #1097, and the CONTEXT.md term now belongs to #1096. That makes #1097 order-independent again. Checks are green, so I restored ready-for-human.

@thymikee thymikee added the ready-for-human Valid work that needs human implementation, judgment, or maintainer merge label Jul 4, 2026
@thymikee thymikee merged commit cccd34f into main Jul 4, 2026
20 checks passed
@thymikee thymikee deleted the docs/agents-md-principles branch July 4, 2026 17:09

github-actions Bot commented Jul 4, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-07-04 17:09 UTC

thymikee added a commit that referenced this pull request Jul 4, 2026
* feat: versioned snapshot refs with MCP auto-pinning

Refs are positional indexes into the latest stored session tree; #1093's
coarse snapshotRefsStale marker warns honestly but cannot say WHICH tree
a ref came from. Give the session a monotonically increasing
snapshotGeneration, advanced wherever the stored tree is replaced: the
setSessionSnapshot choke point and the snapshot/diff command path that
bypasses it.

Token economy (non-negotiable): the snapshot tree output is unchanged —
plain e12 refs on every node. Ref-issuing responses (snapshot command,
find ref outputs) carry the generation ONCE as the additive
refsGeneration field. Ref-consuming commands (press/click/fill/longpress/
get/wait) accept both forms: plain @E12 keeps today's behavior including
the coarse #1093 warning; pinned @E12~s3 is clean when the generation
matches the stored tree, gets a precise warning naming both generations
when it does not, and a malformed suffix is INVALID_ARGS with a grammar
hint. Warn-only this release — tightening comes later per the compat
ladder.

The MCP layer auto-pins at zero token cost: it sees snapshot/find
responses before the model does, remembers the last refsGeneration per
session name, and rewrites plain @ref tool arguments to the pinned form
before forwarding. The model never sees or types suffixes; with no
remembered generation, refs pass through unpinned (never guess).

Replay parsing and script writing strip and IGNORE pins — generations
are meaningless outside the session that minted them.

Refs #1076
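
A minimal sketch of the ref forms described in this commit message, assuming the grammar is `@e<N>` for a plain ref and `@e<N>~s<G>` for a pinned one (inferred from the `e12` and `~s3` examples above). `parseRef`, `pinWarning`, the hint text, and the warning wording are hypothetical names and strings, not the project's actual API.

```typescript
// Result of parsing one ref argument (hypothetical shape).
type ParsedRef =
  | { kind: "plain"; ref: string }
  | { kind: "pinned"; ref: string; generation: number }
  | { kind: "invalid"; code: "INVALID_ARGS"; hint: string };

const GRAMMAR_HINT = "expected @e<N> or @e<N>~s<G>";
const REF_PATTERN = /^@(e\d+)(?:~(.*))?$/;
const PIN_PATTERN = /^s(\d+)$/;

function parseRef(arg: string): ParsedRef {
  const match = REF_PATTERN.exec(arg);
  if (match === null) {
    return { kind: "invalid", code: "INVALID_ARGS", hint: GRAMMAR_HINT };
  }
  const [, ref, suffix] = match;
  // No "~" suffix at all: a plain ref, which keeps the unpinned behavior.
  if (suffix === undefined) {
    return { kind: "plain", ref };
  }
  // A suffix is present, so it must be a well-formed generation pin.
  const pin = PIN_PATTERN.exec(suffix);
  if (pin === null) {
    return { kind: "invalid", code: "INVALID_ARGS", hint: GRAMMAR_HINT };
  }
  return { kind: "pinned", ref, generation: Number(pin[1]) };
}

// Warn-only check: a pinned ref is clean when its generation matches the
// stored tree; otherwise the warning names both generations.
// The coarse warning for plain refs is out of scope for this sketch.
function pinWarning(parsed: ParsedRef, storedGeneration: number): string | undefined {
  if (parsed.kind !== "pinned" || parsed.generation === storedGeneration) {
    return undefined;
  }
  return `ref ${parsed.ref} was issued at s${parsed.generation}, but the stored tree is s${storedGeneration}`;
}

console.log(parseRef("@e12"));
console.log(parseRef("@e12~s3"));
console.log(pinWarning(parseRef("@e12~s3"), 4));
```

Keeping the parse result as a discriminated union means a malformed suffix can never be mistaken for a plain ref, which is the distinction the commit message draws between the warn-only path and `INVALID_ARGS`.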

* docs: CONTEXT.md vocabulary for ref generation pins

Moved from #1097 per the review sequencing note: the term lands with
the behavior it describes.

* docs: teach the ref pin syntax in CLI help

MCP agents get pins transparently (auto-pinning), but CLI-driving
agents only ever met the coarse warning — refsGeneration arrived in
snapshot responses with nothing explaining it, making pins an
undiscoverable feature on the primary agent surface. One help line in
the agent loop guidance closes that; warnings stay short (they fire
repeatedly, teaching belongs in once-read surfaces).

* fix: per-ref MCP pin provenance and seeded generations

Review findings on the first cut:

1. The MCP layer kept ONE refsGeneration per session, so after
   snapshot(s12) -> find(s13) a plain @e37 from the pre-find snapshot got
   pinned ~s13 and read as current — recreating the find-blessing hole at
   the pinning layer. Replace it with per-ref provenance:
   Map<pinScope, Map<refBody, generation>>, scoped by state dir + session
   name (stateDir is a per-call MCP config field, so one server process
   can face multiple daemons). Merge-only updates: refs present in a
   ref-issuing response (snapshot nodes, digest refs, the find ref) move
   to its generation; absent refs KEEP their older pins — an old pin on a
   replaced tree is what makes the daemon warn. Never-issued refs pass
   through unpinned; an issuing response without refsGeneration clears
   the scope; memory bounded to the ~1000 most recently issued pins.

2. Generations were per-lifetime counters from 1, so a reopened
   session's ~s1 collided silently with the previous lifetime's. Seed
   the first bump at a random 6-digit base (crypto randomInt):
   cross-lifetime collisions become ~1e-6 — probabilistic (seeded), not
   identity-based, documented on the field. Pin format unchanged;
   within-lifetime comparisons stay exact.
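
The per-ref provenance in finding 1 can be sketched as follows. This is an illustration under assumptions, not the actual MCP-layer code: the `PinProvenance` class, its method names, and the string form of the scope key are hypothetical, refs are stored without the leading `@`, and the bound is applied per scope for simplicity.

```typescript
// Scope key, e.g. state dir plus session name joined by the caller (assumed encoding).
type PinScope = string;

class PinProvenance {
  private readonly scopes = new Map<PinScope, Map<string, number>>();
  private readonly maxPinsPerScope: number;

  constructor(maxPinsPerScope = 1000) {
    this.maxPinsPerScope = maxPinsPerScope;
  }

  // Called for every ref-issuing response (snapshot, find, ...).
  recordIssued(scope: PinScope, refs: readonly string[], refsGeneration?: number): void {
    // An issuing response without a generation clears the scope.
    if (refsGeneration === undefined) {
      this.scopes.delete(scope);
      return;
    }
    let pins = this.scopes.get(scope);
    if (pins === undefined) {
      pins = new Map<string, number>();
      this.scopes.set(scope, pins);
    }
    // Merge-only: refs in this response move to its generation;
    // refs absent from it keep their older pins.
    for (const ref of refs) {
      pins.delete(ref); // re-insert so Map order tracks recency of issue
      pins.set(ref, refsGeneration);
    }
    // Bound memory by evicting the least recently issued pins.
    while (pins.size > this.maxPinsPerScope) {
      const oldest = pins.keys().next().value;
      if (oldest === undefined) break;
      pins.delete(oldest);
    }
  }

  // Rewrites a plain "@e37" argument to its pinned form when provenance is known.
  pin(scope: PinScope, arg: string): string {
    if (!/^@e\d+$/.test(arg)) return arg; // already pinned, or not a ref
    const generation = this.scopes.get(scope)?.get(arg.slice(1));
    return generation === undefined ? arg : `${arg}~s${generation}`;
  }
}

const provenance = new PinProvenance();
provenance.recordIssued("dirA/main", ["e37", "e40"], 12); // snapshot at s12
provenance.recordIssued("dirA/main", ["e5"], 13); // find at s13
console.log(provenance.pin("dirA/main", "@e37")); // "@e37~s12": keeps its own generation
console.log(provenance.pin("dirA/main", "@e5")); // "@e5~s13"
console.log(provenance.pin("dirA/main", "@e99")); // "@e99": never issued, passes through
```

Because `@e37` stays pinned to s12 after the find, a daemon whose stored tree has moved to s13 can still warn about it, which is the hole the single per-session generation left open.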

Tests: the MCP blessing scenario (pre-find ref stays pinned to ITS
generation), the daemon half in the provider scenario (find must not
bless a pre-find pin), reopen/reseed at unit + handler level, state-dir
scope isolation, digest-ref merging; generation fixtures made
seed-agnostic (relative bumps, echo the observed seed).

Refs #1076
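
The seeded generation counter from finding 2 might look roughly like this. The `GenerationCounter` class and its injectable `seed` parameter are hypothetical; the sketch assumes the first bump returns the random 6-digit base itself and later bumps increment by one. `randomInt` is the Node.js `node:crypto` function named in the commit message.

```typescript
import { randomInt } from "node:crypto";

// One counter per session lifetime (hypothetical class, for illustration only).
class GenerationCounter {
  private current: number | undefined;
  private readonly seed: () => number;

  // The seed is injectable so tests can stay seed-agnostic or fully deterministic.
  constructor(seed: () => number = () => randomInt(100_000, 1_000_000)) {
    this.seed = seed;
  }

  // First bump lands on a random 6-digit base; later bumps are exact increments,
  // so within-lifetime comparisons stay exact.
  bump(): number {
    this.current = this.current === undefined ? this.seed() : this.current + 1;
    return this.current;
  }
}

const counter = new GenerationCounter();
const first = counter.bump();
const second = counter.bump();
console.log(second - first); // 1
```

With 900,000 possible bases, two lifetimes starting on the same generation has a probability of roughly one in a million, which matches the ~1e-6 figure above. The guarantee is probabilistic, not identity-based.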