feat(overseer): inbox substrate + v0 prioritizer (Step 2.5) #57
heavygee wants to merge 57 commits into
Soup verify: scratchlist layer already uses @/lib/relative-time; keep canonical path and avoid TS2300 duplicate identifier at driver merge.
Parallel stress-test stops can race child exit cleanup; returning true when the session is already gone matches ensure-stopped semantics.
Adds opt-in FCM HTTP v1 notification delivery so a companion mobile/wearable
app can receive permission, ready, and task notifications end-to-end. The
channel is gated entirely on FCM_SERVICE_ACCOUNT_PATH + FCM_PROJECT_ID being
set; operators not running a companion see zero behavior change.
What lands:
- POST/DELETE /api/devices/register — JWT-authed FCM token registry,
upsert on (namespace, deviceId, platform), platforms `phone` | `wear`.
- SQLite v9 → v10 migration adds `fcm_devices` (idx on namespace + token).
- FcmService — minimal HTTP v1 client, RS256 service-account JWT via
jose (dep already in tree), 5-minute access-token cache, 401 retry.
- FcmNotificationChannel — implements NotificationChannel, sends data-only
FCM (so companion can route to phone+watch surfaces). Body composition
parses an optional trailing `AGENT_NOTIFY_SUMMARY {json}` line for richer
ready summaries; truncates plain assistant text to 280 chars otherwise.
Tags each payload with `severity` (info/warning/success/error) so clients
can color/categorise the notification.
- PushNotificationChannel gains a NativeFallbackProbe — when a namespace
has at least one registered FCM device, web-push and SSE in-page toast
are skipped so the operator does not double-notify on phone+browser.
Probe is no-op when no FCM device is registered; PWA-only setups
unchanged. Branch trace gated on HAPI_NOTIFY_DEBUG=1.
- shared/src/messages.ts — `extractAssistantPlainText` (codex + Claude SDK
shapes) and `extractNotifySummary` (strict end-anchored line parser).
- hub/src/notifications/toolArgs.ts — tool-arg formatters lifted out of
telegram/sessionView (kept duplicated there in this PR; refactor of
Telegram is a follow-up).
- docs/api/native-companion-contract.md — payload + endpoints + env vars,
versioned at contract v1.
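As an illustration of the "strict end-anchored line parser" and the 280-character fallback described above, here is a minimal sketch. It is not the code from shared/src/messages.ts: the function names, the `NotifySummary` shape, and the choice to read a `summary` field are assumptions made for the example.

```typescript
// Hypothetical shape; the real summary fields are defined by the contract doc.
type NotifySummary = Record<string, unknown>;

const MARKER = "AGENT_NOTIFY_SUMMARY ";
const PLAIN_TEXT_LIMIT = 280;

// Only the final non-empty line is inspected, so a marker that appears
// earlier in the assistant text is ignored (end-anchored).
function extractNotifySummarySketch(text: string): NotifySummary | null {
  const trimmed = text.trimEnd();
  const lastLine = trimmed.slice(trimmed.lastIndexOf("\n") + 1);
  if (!lastLine.startsWith(MARKER)) return null;
  try {
    const parsed: unknown = JSON.parse(lastLine.slice(MARKER.length));
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as NotifySummary;
    }
    return null;
  } catch {
    // Malformed JSON is treated as "no summary", never as an error.
    return null;
  }
}

// Prefer the structured summary; otherwise truncate the plain text.
function composeBodySketch(text: string): string {
  const summary = extractNotifySummarySketch(text);
  if (summary !== null && typeof summary.summary === "string") {
    return summary.summary;
  }
  return text.length > PLAIN_TEXT_LIMIT ? text.slice(0, PLAIN_TEXT_LIMIT) : text;
}
```

Anchoring on the last line keeps the parser from misfiring when an assistant message merely quotes the marker mid-text.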
Test coverage:
- 260 hub tests pass (incl. 23 new across FCM channel, push dedup,
v10 migration, devices route).
- 60 shared tests pass (messages parsers).
Notes for reviewers:
- Reference companion implementation lives in a separate Android repo
(Kotlin, phone APK + Wear OS APK) — this PR is hub-side only.
- No new runtime deps (`jose` and `zod` already declared in hub).
Co-authored-by: Cursor <cursoragent@cursor.com>
…ub-on-phone Adds a Scope section to the native-companion contract so anyone implementing it knows the audience: operators running the hub on a server who want phone/watch as a notification surface, not users expecting a Termux-bundled hub. Mirrors the framing now in heavygee/hapi-companion README. Co-authored-by: Cursor <cursoragent@cursor.com>
Removes the prior framing that referenced a non-existent 'Termux hub-on-phone' alternative. This contract describes a native client to the same hub the PWA talks to; it does not change where the hub runs. Co-authored-by: Cursor <cursoragent@cursor.com>
Companion section in Settings renders a QR code encoding the deeplink hapicompanion://bind?hub=<base>&code=<token>. Scanning it from the HAPI companion app (Android phone or Wear OS) auto-fills the bind form and authenticates against this hub - no manual URL/token paste. QR is gated behind a Show button so the access token doesn't sit visible on screen by default; a Copy link affordance and the textual deeplink are also exposed for manual onboarding. Adds qrcode + @types/qrcode to web/ (already a hub dep, no new resolved package - just a workspace declaration). Co-authored-by: Cursor <cursoragent@cursor.com>
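A minimal sketch of how such a deeplink could be assembled. The function name is hypothetical, and percent-encoding both parameters with `encodeURIComponent` is an assumption; the commit only states the `hapicompanion://bind?hub=<base>&code=<token>` shape.

```typescript
// Hypothetical helper: builds the pairing deeplink from the hub base URL
// and the access token. Encoding the values is an assumption of this sketch.
function buildCompanionDeeplinkSketch(hubBaseUrl: string, accessToken: string): string {
  const hub = encodeURIComponent(hubBaseUrl);
  const code = encodeURIComponent(accessToken);
  return `hapicompanion://bind?hub=${hub}&code=${code}`;
}
```

Encoding the hub URL keeps its own `:` and `/` characters from being confused with the deeplink's structure when the companion app parses the query string.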
After the existing PWA access QR is rendered on tunnel start, also print the hapicompanion://bind?hub=...&code=... deeplink and a matching QR. Same tunnel + token, different scheme: phones with the companion app installed pick up the deeplink via the manifest intent filter; phones without it ignore it and fall back to the PWA QR above. QR rendering failure is non-fatal in both cases - the textual deeplink above the QR is sufficient for manual paste. Co-authored-by: Cursor <cursoragent@cursor.com>
Two bugs surfaced by the upstream review bot:
1) Web Push silently dropped when FCM is not actually configured.
The native-fallback probe only checked the device registry; it did
not check whether resolveFcmConfig() actually succeeded. So an
operator who previously enabled FCM, registered a phone, then later
started the hub WITHOUT FCM_SERVICE_ACCOUNT_PATH would see the probe
return true (devices still in DB) -> Web Push suppressed -> no FCM
channel registered -> notifications go to /dev/null.
Fix: extracted the probe construction into buildNativeFallbackProbe()
which short-circuits to () => false when fcmConfig is missing. Probe
never even consults the device store in the no-config branch, so
stale rows can never matter.
2) Transient FCM failures permanently unregistered devices.
sendToToken() returned a single boolean and sendToNamespace() removed
any device whose send returned false. A 429 (rate limit), 503
(server error), 401 (auth glitch), or even an ECONNREFUSED would
delete the device row, after which the user would need to re-pair to
get notifications again. The bot caught it; the fix is the obvious
one.
Fix: sendToToken() now returns 'sent' | 'invalid' | 'failed'.
- 'invalid' is reserved for the responses that genuinely indicate a
dead token: HTTP 404 with UNREGISTERED/NOT_FOUND, and HTTP 400
with INVALID_ARGUMENT explicitly referencing the token field.
- Everything else (429, 5xx, 401, 403, network errors) is 'failed'
and counts toward the failed tally without removing the device.
sendToNamespace() only calls removeDeviceByToken() on 'invalid'.
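The three-way outcome can be sketched roughly as follows. This is an illustration rather than the FcmService code: `applyOutcomesSketch`, the `DeviceStoreSketch` interface, and the tally shape are hypothetical, while the `'sent' | 'invalid' | 'failed'` union and `removeDeviceByToken()` come from the commit message.

```typescript
type SendOutcome = "sent" | "invalid" | "failed";

// Hypothetical minimal store interface for the sketch.
interface DeviceStoreSketch {
  removeDeviceByToken(token: string): void;
}

interface SendTally {
  sent: number;
  failed: number;
  removed: number;
}

// Only 'invalid' removes a device; transient failures are counted and kept.
function applyOutcomesSketch(
  results: ReadonlyArray<{ token: string; outcome: SendOutcome }>,
  store: DeviceStoreSketch,
): SendTally {
  const tally: SendTally = { sent: 0, failed: 0, removed: 0 };
  for (const { token, outcome } of results) {
    if (outcome === "sent") {
      tally.sent += 1;
    } else if (outcome === "invalid") {
      store.removeDeviceByToken(token);
      tally.removed += 1;
    } else {
      tally.failed += 1;
    }
  }
  return tally;
}
```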
Tests: 11 new tests across two new files. fcmService.test.ts covers
all six branches (200, 404 unregistered, 429, 503, 401, network error)
plus a mixed-batch case that proves invalid tokens get removed in the
same call where transient-failure tokens survive. nativeFallbackProbe
.test.ts covers both no-config and configured branches plus the
explicit "no-config never touches the store" guarantee.
Hub test count: 273 -> 284 (all passing).
Co-authored-by: Cursor <cursoragent@cursor.com>
…ent type
HAPI Bot review on PR tiann#803 caught two contract-doc accuracy gaps:
1) Visibility rule was wrong. Doc said "FCM fires when Web Push would fire AND client not visible via SSE", but FcmNotificationChannel ALWAYS fires regardless of PWA visibility (deliberately - native companion is the canonical wrist-first surface, and there is a passing test asserting this). Companion app implementers reading the contract would have built foreground-suppression logic and then dropped notifications when the PWA tab was open.
2) Documented `session-completed` event doesn't exist. NotificationHub never calls into a 'session-completed' channel method on FcmNotificationChannel; the type would never reach a native client. Removed from the documented enum, leaving only the three actual events: ready, permission-request, task-notification.
Co-authored-by: Cursor <cursoragent@cursor.com>
…h break Co-authored-by: Cursor <cursoragent@cursor.com>
…works The Settings -> Companion pairing QR reads the original CLI access token from localStorage (hapi_access_token::<baseUrl>) so it can be encoded into the hapicompanion://bind deeplink. For browser/CLI logins useAuthSource already persists the token via setAccessToken, but the Telegram Mini App bind path went through useAuth.bind() which exchanged the typed CLI token for a JWT and never persisted it. Telegram users therefore always saw the "signed in via Telegram..." fallback and got no usable QR. After a successful client.bind() we now mirror useAuthSource's behavior and write the same accessToken to the same localStorage key, restoring parity between the two auth paths. No change for browser/CLI users. Co-authored-by: Cursor <cursoragent@cursor.com>
The native-fallback probe previously returned true whenever FCM was
configured AND devices were registered, which suppressed web-push for
the namespace. The HAPI Bot correctly pointed out the gap: if the FCM
pipeline silently breaks (expired service-account key, sustained 5xx,
OAuth token-fetch failure, network blackhole) the operator gets nothing
on either channel until they manually intervene.
Approach (deliberate, not the bot's exact suggested fix):
- FcmService now keeps a small rolling window (last 8 outcomes) of send
attempts and exposes `isHealthy()`. The threshold is 5+/8 failures =
unhealthy; the buffer starts empty so a freshly-booted hub is
optimistic ("innocent until proven guilty") and does not double-fire
on event #1.
- Token-fetch failure (`getFcmAccessToken` throws) now records exactly
one health-failure (not one per device), short-circuits the send
loop, and returns a result so `sendToNamespace` no longer leaks the
exception.
- `invalid` token responses are explicitly excluded from the health
buffer because they are per-device facts (rotated/uninstalled token),
not pipeline failures - FCM was reachable, it just rejected one
stale token.
- `buildNativeFallbackProbe` now optionally accepts the FcmService and
short-circuits to "let web-push fire" when health is bad, before it
even queries the device registry. The single-arg call shape is still
supported for back-compat.
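The rolling window described in this commit could look roughly like the sketch below. The class name and method layout are hypothetical; the window size of 8, the threshold of 5 failures, the optimistic empty buffer, and the exclusion of `invalid` outcomes are taken from the text above.

```typescript
type SendOutcome = "sent" | "invalid" | "failed";

// Hypothetical stand-alone version of the health tracking described above.
class RollingHealthSketch {
  private readonly outcomes: boolean[] = []; // true = success, false = failure
  private readonly windowSize: number;
  private readonly failureThreshold: number;

  constructor(windowSize = 8, failureThreshold = 5) {
    this.windowSize = windowSize;
    this.failureThreshold = failureThreshold;
  }

  record(outcome: SendOutcome): void {
    // 'invalid' is a per-device fact, not a pipeline failure, so it is skipped.
    if (outcome === "invalid") return;
    this.outcomes.push(outcome === "sent");
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
  }

  // An empty buffer has zero failures and therefore reports healthy.
  isHealthy(): boolean {
    const failures = this.outcomes.filter((ok) => !ok).length;
    return failures < this.failureThreshold;
  }
}
```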
Why not the bot's exact suggestion ("invert: call FCM first, fall back
on result.sent === 0"):
- Couples PushNotificationChannel to FcmService and FcmSendPayload,
reversing the clean parallel-channel architecture established earlier
in this PR.
- Treats every transient single-event failure as fallback-worthy, which
re-opens the duplicate-notification race that the suppression logic
was added to close (FCM HTTP timeout that delivers later + the web
push we sent in the meantime = two pings).
- A rolling health window only flips on sustained breakage, which is
the actual operational scenario the bot is worried about.
The wrist-first design intent ("FCM fires unconditionally, web-push is
suppressed for the same namespace") documented in
docs/api/native-companion-contract.md is preserved on the happy path.
The probe only re-enables web-push when there is concrete evidence the
native pipeline is not delivering.
Tests:
- New FcmService.isHealthy suite covers empty-buffer, threshold flip,
recovery as failures age out of the window, invalid-token exclusion,
and network-error path.
- nativeFallbackProbe gains coverage for the unhealthy-but-registered,
healthy-and-registered, and absent-fcmService (back-compat) cases.
- All 292 hub tests still pass; typecheck clean.
Co-authored-by: Cursor <cursoragent@cursor.com>
…dule
The Telegram session view had its own copy of formatToolArgumentsDetailed identical to the one in hub/src/notifications/toolArgs.ts (already used by the FCM channel). Replace the local copy with an import. Removes ~70 lines of duplication, plus the now-unused MAX_TOOL_ARGS_LENGTH constant and `truncate` import. The shared signature accepts an optional opts arg whose default maxArgLength is 150 - matching the prior constant - so the call site is unchanged.
Two benign upgrades come along for the ride from the shared module: ?? instead of || on field fallbacks (no real-world difference; permission arguments never carry empty-string fields), and String(...) wrapping plus a typeof object guard that makes non-string values render gracefully instead of throwing into the catch block.
Hub tests: 311 pass / 0 fail. Telegram subset: 5 pass / 0 fail. typecheck green.
Cold-reviewed by an out-of-context Claude Opus peer before push.
Co-authored-by: Cursor <cursoragent@cursor.com>
…ng web-push
Addresses HAPI Bot Major review on PR tiann#803.
The previous health gate treated an empty outcome buffer as healthy ("innocent until proven guilty"). That created a silent-blackhole window on cold start with broken FCM credentials: the push channel suppressed SSE/Web Push for the first ~5 events while the FCM channel attempted each delivery and recorded failures, until enough stacked to flip the threshold. Every notification in that gap was silently lost.
New invariant: isHealthy() requires at least one successful FCM send in the recent window (HEALTH_WINDOW=8) AND failures below threshold (HEALTH_FAILURE_THRESHOLD=5). Both conditions are necessary; either alone is insufficient evidence to safely suppress web-push fallback.
Trade-off: one duplicated notification per hub restart per namespace. On the first event after restart, web-push fires alongside FCM (because the gate has no positive evidence yet). Once FCM records that first success, the gate engages and subsequent events are FCM-only. Worth it for guaranteed delivery during cold-start outages.
Tests reworked to match new semantics:
- "starts UNHEALTHY with empty buffer" (was: healthy)
- "flips to healthy after first successful send" (new)
- "stays unhealthy across failures-only run" (new, exercises the exact blackhole scenario the bot flagged)
- "flips back to unhealthy after threshold breach with prior successes" (renamed, establishes successes first)
- "invalid tokens don't count against health" (reworked: send a mixed batch first to establish health, then verify invalids don't flip it)
- "network errors count as failures" (reworked: establish health first)
Hub tests: 313 pass / 0 fail. typecheck green.
Co-authored-by: Cursor <cursoragent@cursor.com>
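The revised invariant can be expressed as a small pure function. This is a sketch only: the function name and the boolean-array representation of the window are assumptions, while the two constants and the "at least one success AND failures below threshold" rule come from the commit message.

```typescript
const HEALTH_WINDOW = 8;
const HEALTH_FAILURE_THRESHOLD = 5;

// Hypothetical representation: true = successful send, false = failed send.
// 'invalid' token outcomes are assumed to have been excluded before this point.
function isHealthySketch(outcomes: readonly boolean[]): boolean {
  const recent = outcomes.slice(-HEALTH_WINDOW);
  const successes = recent.filter((ok) => ok).length;
  const failures = recent.length - successes;
  // Both conditions are required: positive evidence and few failures.
  return successes >= 1 && failures < HEALTH_FAILURE_THRESHOLD;
}
```

With this rule an empty buffer and a failures-only run both report unhealthy, which is what keeps web-push firing during a cold-start outage.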
…9→V10 Upstream/main landed sessions.service_tier at schema v10. The companion FCM device registry now migrates at v11 so both changes compose cleanly after the courtesy rebase onto current upstream/main. Co-authored-by: Cursor <cursoragent@cursor.com>
FCM runs before web-push; PushNotificationChannel skips web/SSE only when the same notify() dispatch already delivered via FCM. Removes the isHealthy()+device-row probe that could suppress web-push after warm FCM outages. Co-authored-by: Cursor <cursoragent@cursor.com>
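A rough sketch of the sequential ordering this commit describes, under the assumption that the two channels share some per-dispatch record of whether FCM delivered. The `DispatchContextSketch` and `FcmResultSketch` types and both function names are hypothetical; the real wiring between NotificationHub and the channels may differ.

```typescript
// Hypothetical per-dispatch state shared for a single notify() call.
interface DispatchContextSketch {
  fcmDelivered: boolean;
}

// Hypothetical result shape: number of devices FCM accepted the message for.
interface FcmResultSketch {
  sent: number;
}

// Web push / SSE is skipped only when this same dispatch reached FCM.
function shouldSendWebPushSketch(ctx: DispatchContextSketch): boolean {
  return !ctx.fcmDelivered;
}

async function dispatchSketch(
  sendFcm: () => Promise<FcmResultSketch>,
  sendWebPush: () => Promise<void>,
): Promise<DispatchContextSketch> {
  const ctx: DispatchContextSketch = { fcmDelivered: false };
  try {
    const result = await sendFcm(); // FCM runs first
    ctx.fcmDelivered = result.sent > 0;
  } catch {
    // An FCM error never suppresses the fallback.
  }
  if (shouldSendWebPushSketch(ctx)) {
    await sendWebPush();
  }
  return ctx;
}
```

Because the decision is made per dispatch, no health history is carried between events, so a warm FCM outage falls back to web-push on the very next notification.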
FcmNotificationChannel now implements the optional sendModelError hook so
NotificationHub can reach the native companion when cursor-agent hits a
model-side failure. Uses shared modelErrorCopy strings, severity=error,
distinct model-error-${sessionId}-${atTs} tags, and always calls deliver()
(wrist-first, no SSE shortcut).
Depends on feat/companion-fcm-push-api and feat/cursor-detect-inline-model-errors
both being lower in the driver soup stack.
Co-authored-by: Cursor <cursoragent@cursor.com>
Rebase follow-up: truncate AGENT_NOTIFY_SUMMARY summary/action before FCM data payload (bot Major). Fix usePwaUpdate.test.ts setTimeout mock cast so bun typecheck passes on current main. Co-authored-by: Cursor <cursoragent@cursor.com>
Whitelist and truncate AGENT_NOTIFY_SUMMARY auxiliary fields before JSON serialization; cap task-notification summaries to glance limit. Co-authored-by: Cursor <cursoragent@cursor.com>
10s AbortSignal.timeout on OAuth + FCM send so sequential web-push fallback is not blocked on hung Google endpoints; truncate Grep/Glob pattern in permission detail formatter. Co-authored-by: Cursor <cursoragent@cursor.com>
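For illustration, a 10-second timeout can be attached to a `fetch` call with the standard `AbortSignal.timeout()` static method. The helper name and constant below are hypothetical, and this is a sketch of the pattern rather than the hub's actual OAuth or send code.

```typescript
const FCM_TIMEOUT_MS = 10_000;

// Hypothetical wrapper: the request is aborted if no response arrives
// within timeoutMs, so a hung endpoint cannot block the caller indefinitely.
async function postWithTimeoutSketch(
  url: string,
  body: string,
  timeoutMs: number = FCM_TIMEOUT_MS,
): Promise<Response> {
  return fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body,
    signal: AbortSignal.timeout(timeoutMs),
  });
}
```

When the timeout elapses, the returned promise rejects, and the caller can treat that rejection as a transient failure and continue to the web-push fallback.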
Delete stale fcm_devices rows sharing the same token when a native install registers under a different namespace. Co-authored-by: Cursor <cursoragent@cursor.com>
Add en/zh-CN keys for the Companion section title and CompanionPairing strings; matches locale-driven Settings pattern (bot Minor on tiann#803). Co-authored-by: Cursor <cursoragent@cursor.com>
Parse FCM error JSON: only UNREGISTERED or token-field INVALID_ARGUMENT unregister devices; generic NOT_FOUND stays transient. Guard limit<=3 in truncateReadyText so tiny action budgets cannot blow the glance cap. Co-authored-by: Cursor <cursoragent@cursor.com>
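The `limit<=3` guard matters because an ellipsis suffix is itself three characters long. A minimal sketch of the idea follows; the function name is hypothetical and the real truncateReadyText may use a different suffix or behavior.

```typescript
// Hypothetical truncation helper. With a budget of 3 or fewer characters
// there is no room for "..." plus content, so the text is hard-cut instead.
function truncateWithEllipsisSketch(text: string, limit: number): string {
  if (limit <= 0) return "";
  if (text.length <= limit) return text;
  if (limit <= 3) return text.slice(0, limit);
  return text.slice(0, limit - 3) + "...";
}
```

Without the guard, `limit - 3` would be zero or negative, and the result could come out longer than the budget it was meant to enforce.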
FCM v1 often returns HTTP 404 with root NOT_FOUND plus details[].errorCode UNREGISTERED; prune those tokens while keeping generic project/resource NOT_FOUND transient. Co-authored-by: Cursor <cursoragent@cursor.com>
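The classification rule from this commit and the previous one can be sketched as below. The function and type names are hypothetical, and detecting the "token field" through a `fieldViolations[].field` entry containing `token` is an assumption about the error body layout; the rule that only UNREGISTERED or token-field INVALID_ARGUMENT prunes a device comes from the commit messages.

```typescript
type SendOutcome = "sent" | "invalid" | "failed";

// Assumed subset of the FCM v1 error body relevant to this sketch.
interface FcmErrorDetailSketch {
  errorCode?: string;
  fieldViolations?: Array<{ field?: string }>;
}
interface FcmErrorBodySketch {
  error?: { status?: string; details?: FcmErrorDetailSketch[] };
}

function classifyFcmResponseSketch(httpStatus: number, bodyText: string): SendOutcome {
  if (httpStatus >= 200 && httpStatus < 300) return "sent";
  let parsed: unknown;
  try {
    parsed = JSON.parse(bodyText);
  } catch {
    return "failed"; // non-JSON error bodies are treated as transient
  }
  if (parsed === null || typeof parsed !== "object") return "failed";
  const err = (parsed as FcmErrorBodySketch).error;
  if (!err) return "failed";
  const details = err.details ?? [];
  // Root NOT_FOUND alone stays transient; UNREGISTERED in details prunes.
  const unregistered =
    err.status === "UNREGISTERED" || details.some((d) => d.errorCode === "UNREGISTERED");
  if (unregistered) return "invalid";
  const tokenFieldViolation = details.some((d) =>
    (d.fieldViolations ?? []).some((v) => (v.field ?? "").includes("token")),
  );
  if (err.status === "INVALID_ARGUMENT" && tokenFieldViolation) return "invalid";
  return "failed";
}
```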
…tion # Conflicts: # hub/src/fcm/fcmNotificationChannel.test.ts
…ation # Conflicts: # hub/src/socket/handlers/cli/sessionHandlers.ts # shared/src/schemas.ts
# Conflicts: # hub/src/store/index.ts # hub/src/store/types.ts
Soup-local fixup. Files are unchanged in upstream/main and typecheck
passes on upstream/main, but typecheck on the layered driver/integration
soup produces:
hub/src/tunnel/tlsGate.ts(70,37): TS2345 string | string[] -> string
hub/src/web/routes/guards.ts(46,46): TS2345 string | undefined -> string
Likely cause: a layered merge resolves @types/node (or a peer) to a
different transitive version that exposes the wider PeerCertificate.CN
type and the Hono c.req.param() string|undefined return. Both call sites
were already not-quite-safe at runtime - this layer adds the missing
narrowing without changing semantics:
tlsGate: typeof guard rejects multi-CN certs (was already the
practical behaviour - dnsNameMatchesHost would have stringified
the array and never matched any real DNS suffix).
guards: returns 400 with explicit message when the route param is
missing, instead of passing undefined through to requireSession
where it would have produced a confusing 404 cascade.
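The two narrowings can be illustrated in isolation. Both helpers below are hypothetical stand-ins, not the code in tlsGate.ts or guards.ts; they only show the `typeof` guard and the explicit 400 for a missing parameter.

```typescript
// Accept a single common name; reject multi-CN arrays and missing values.
function singleCommonNameSketch(cn: string | string[] | undefined): string | null {
  return typeof cn === "string" ? cn : null;
}

type ParamResultSketch =
  | { ok: true; value: string }
  | { ok: false; status: 400; message: string };

// Turn a possibly-undefined route param into an explicit 400 result
// instead of passing undefined further down the handler chain.
function requireRouteParamSketch(value: string | undefined, name: string): ParamResultSketch {
  if (value === undefined) {
    return { ok: false, status: 400, message: `Missing route parameter: ${name}` };
  }
  return { ok: true, value };
}
```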
Not appropriate upstream as-is: upstream/main typecheck is clean
without these changes, and the fixes are paving over a soup-only
type drift, not a real bug in upstream code. Belongs in soup until
the underlying transitive dep / lockfile drift is identified and
either upstreamed or pinned.
Co-authored-by: Cursor <cursoragent@cursor.com>
Soup verify: mark optional Pi model/command fields .optional() so transformed undefined does not fail object parse under Zod 4.
Soup verify flake: fast in-memory updates can share millisecond with prior updatedAt; contract is monotonic (>=), not strictly greater.
…tion # Conflicts: # hub/src/fcm/fcmNotificationChannel.test.ts
…ation # Conflicts: # hub/src/socket/handlers/cli/sessionHandlers.ts # shared/src/schemas.ts
# Conflicts: # hub/src/store/index.ts # hub/src/store/types.ts
Persist overseer events in SQLite v11 (events, event_links, FTS5), record from assistant notify summaries with hub fallbacks, expose GET /api/system-events, and add a read-only settings debug pane. Includes db-prep v11→v10 downgrade and Playwright fixture smoke. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Soup stacking: scratchlist and FCM layers also claim v11. Idempotent CREATE IF NOT EXISTS for all three substrates avoids rerere dropping tables when overseer merges last.
Events tables must not own SCHEMA_VERSION — ensureOverseerEventsSchema runs on every Store boot so v11-stamped soup DBs self-heal. Add events/event_links to REQUIRED_TABLES, fix content-storing FTS delete/update triggers, and extend db-prep with full soup v11 downgrade plus --drop-overseer-events. Co-authored-by: Cursor <cursoragent@cursor.com>
events.related_session_id FK blocked DELETE /sessions and reopen merge (delete old row). Detach on intentional delete; repoint to new session id on mergeSessions so overseer audit trail survives reopen/resume id swap. Co-authored-by: Cursor <cursoragent@cursor.com>
…tone (#22) Events embed payload.session (id, tag, name, project, flavor) at write time so audit rows stay self-describing after hard-delete. deleteSession snapshots identity into init-gated deleted_sessions before detach. Co-authored-by: Cursor <cursoragent@cursor.com>
Add init-gated inbox_items schema, per-session promotion from attention events, coarse-rank/oldest-within ordering, operator-action logging, REST + settings debug pane stacked on the #22 events substrate. Co-authored-by: Cursor <cursoragent@cursor.com>
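One way to read "coarse-rank/oldest-within ordering" is a two-key sort: by rank first, then by age inside each rank. The sketch below assumes that a higher `rank` means more urgent and that items carry a numeric `createdAt` timestamp; the field names and the comparator are hypothetical, not the inbox_items schema.

```typescript
// Hypothetical item shape for the sketch.
interface InboxItemSketch {
  id: string;
  rank: number; // assumed: higher = more urgent
  createdAt: number; // assumed: epoch milliseconds
}

// Higher rank first; within the same rank, the oldest item first.
function compareInboxItemsSketch(a: InboxItemSketch, b: InboxItemSketch): number {
  if (a.rank !== b.rank) return b.rank - a.rank;
  return a.createdAt - b.createdAt;
}

function orderInboxSketch(items: readonly InboxItemSketch[]): InboxItemSketch[] {
  // Copy before sorting so the caller's array is not mutated.
  return [...items].sort(compareInboxItemsSketch);
}
```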
💡 Codex Review
- hapi/hub/src/web/routes/systemEvents.ts, lines 33 to 36 in 20efb03: When an authenticated token belongs to another namespace, this endpoint still calls …
- hapi/hub/src/web/routes/inboxItems.ts, lines 35 to 39 in 20efb03: In a multi-namespace hub, this endpoint lists …
- Lines 395 to 397 in 20efb03: After an SSE reconnect, if a REST refetch has already populated the detail cache with a newer …
- hapi/hub/src/cursor/cursorImporter.ts, line 318 in 20efb03: When a Cursor import cannot infer …
- hapi/shared/src/sessionSummary.ts, lines 151 to 155 in 20efb03: Pi sessions store their native resume id as …
Stacked on the events substrate (Step 2). Adds the inbox_items table + promotion job + v0 hand-tuned priority scorer + explain_priority provenance string + read-only inbox viewer. This is the Step 2.5 layer in the Overseer build sequence; it builds on Step 2 (events) and is built on by fix/overseer-inbox-stale-noise (PR #54) and the Step 2.75 replay harness. Soup integrates the full stack; clean upstream PR awaits the stack-wide rebase onto upstream/main (drops garden) via the integration soup process.
Made with Cursor