feat(overseer): read-only Overseer entity + 7 query tools + voice route #56
heavygee wants to merge 1 commit into
Overseer is now a real conversational entity in the hub with a stable identity, system prompt, and a dedicated voice surface (consumes the existing stt/tts voice substrate; does not reimplement it). It is inform-only at this stage - no dispatch, no confirm, no state mutation.

7 read-only tools, all unit-tested against the live substrate:
- query_events
- query_inbox
- get_session_state
- get_session_recent_output
- get_worker_health
- explain_priority (reuses stored reason_for_priority)
- list_active_workers

Worker health derives reported/observed/inferred state per contracts §2. Voice conversation turns are written back to the events stream as event_type=convo_turn for provenance.

Store: additive queryEvents (project/sourceKind/severity/time filters) and inbox status/category filters - no changes to existing query shapes.

Protocol layer (shared/src/overseerEntity.ts): identity, tool catalog with zod arg schemas, system prompt, worker-state derivation helpers, convo_turn builder.

24 new tests pass.

Co-authored-by: Cursor <cursoragent@cursor.com>
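The additive store filters described in this commit message can be illustrated with a minimal in-memory sketch. The event shape, the `queryEventsSketch` name, and the default cap of 50 are assumptions for illustration, not the hub's actual store API. The only point is that every new filter is optional, so a call that passes none behaves as it did before.

```typescript
// Hypothetical event shape; the real store schema may differ.
interface SketchEvent {
  id: number;
  sessionId: string;
  project: string;
  sourceKind: "worker" | "hub" | "overseer";
  severity: "info" | "warn" | "error";
  createdAt: number; // epoch milliseconds
}

// All filters are optional so callers that pass none see unchanged results.
interface SketchEventFilter {
  sessionId?: string;
  project?: string;
  sourceKind?: SketchEvent["sourceKind"];
  severity?: SketchEvent["severity"];
  since?: number; // inclusive lower bound on createdAt
  until?: number; // inclusive upper bound on createdAt
  limit?: number; // assumed default of 50
}

function queryEventsSketch(all: SketchEvent[], f: SketchEventFilter = {}): SketchEvent[] {
  const matches = all.filter(
    (e) =>
      (f.sessionId === undefined || e.sessionId === f.sessionId) &&
      (f.project === undefined || e.project === f.project) &&
      (f.sourceKind === undefined || e.sourceKind === f.sourceKind) &&
      (f.severity === undefined || e.severity === f.severity) &&
      (f.since === undefined || e.createdAt >= f.since) &&
      (f.until === undefined || e.createdAt <= f.until),
  );
  // `matches` is a fresh array, so sorting in place is safe: newest first, then cap.
  return matches.sort((a, b) => b.createdAt - a.createdAt).slice(0, f.limit ?? 50);
}

const sketchEvents: SketchEvent[] = [
  { id: 1, sessionId: "s1", project: "alpha", sourceKind: "worker", severity: "info", createdAt: 1000 },
  { id: 2, sessionId: "s1", project: "alpha", sourceKind: "worker", severity: "error", createdAt: 2000 },
  { id: 3, sessionId: "s2", project: "beta", sourceKind: "hub", severity: "error", createdAt: 3000 },
];

// Combining filters narrows the result; omitting them returns everything.
const alphaErrors = queryEventsSketch(sketchEvents, { project: "alpha", severity: "error" });
```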
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e21a3c584c
    getSession: (sessionId) => this.getSession(sessionId),
    getSessions: () => this.getSessions()
Scope Overseer reads to the caller namespace
When an authenticated client calls any /api/overseer tool, the entity resolves sessions through the global getSession()/getSessions() accessors instead of the JWT namespace. In a multi-namespace hub, list_active_workers enumerates every tenant's sessions, and the returned IDs can then be passed to get_session_recent_output or get_worker_health to read another namespace's transcript and state. Pass c.get('namespace') into tool dispatch and use namespace-scoped accessors or reject cross-namespace session IDs.
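One way to read this suggestion is to hand the entity accessors that are already bound to the caller's namespace. The sketch below assumes each session record carries a `namespace` field and uses a hypothetical `scopeSessionsTo` helper; neither is confirmed by the diff.

```typescript
// Hypothetical session record; a per-session `namespace` field is an assumption.
interface ScopedSession {
  id: string;
  namespace: string;
}

interface SessionAccessors {
  getSession(sessionId: string): ScopedSession | undefined;
  getSessions(): ScopedSession[];
}

// Wraps the global session list so tools only ever see the caller's namespace.
function scopeSessionsTo(namespace: string, all: ScopedSession[]): SessionAccessors {
  return {
    getSessions: () => all.filter((s) => s.namespace === namespace),
    getSession: (sessionId) => {
      const found = all.find((s) => s.id === sessionId);
      // A session from another namespace is treated exactly like a missing one.
      return found !== undefined && found.namespace === namespace ? found : undefined;
    },
  };
}

const allSessions: ScopedSession[] = [
  { id: "a1", namespace: "tenant-a" },
  { id: "b1", namespace: "tenant-b" },
];

// In a route handler the namespace would come from something like c.get('namespace').
const tenantA = scopeSessionsTo("tenant-a", allSessions);
```

With accessors built this way, list_active_workers and the per-session tools would share one scoping rule instead of each tool checking ownership separately.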
    const event = engine.getOverseer().recordConvoTurn({
      operatorText: parsed.data.operatorText,
      overseerText: parsed.data.overseerText,
      relatedSessionId: parsed.data.relatedSessionId ?? null,
Validate convo-turn session ownership
When a caller includes relatedSessionId, this write path records it without resolving that session against c.get('namespace'), unlike the existing session routes that use requireSessionFromParam. If a namespace A user knows a namespace B session ID, the FK accepts it and the convo_turn is stored under B's session history, polluting another tenant's audit context. Validate ownership before recording the relation or drop it.
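A minimal sketch of the ownership check, assuming a lookup function that returns a session with its namespace. `checkRelatedSession`, the result shape, and the 404 choice are hypothetical; the repo's own requireSessionFromParam may behave differently.

```typescript
// Hypothetical session record with its owning namespace.
interface OwnedSession {
  id: string;
  namespace: string;
}

type RelatedSessionCheck =
  | { ok: true; relatedSessionId: string | null }
  | { ok: false; status: 404; error: string };

function checkRelatedSession(
  relatedSessionId: string | null | undefined,
  callerNamespace: string,
  findSession: (id: string) => OwnedSession | undefined,
): RelatedSessionCheck {
  // No relation supplied: record the turn without one.
  if (relatedSessionId === null || relatedSessionId === undefined) {
    return { ok: true, relatedSessionId: null };
  }
  const session = findSession(relatedSessionId);
  if (session === undefined || session.namespace !== callerNamespace) {
    // Same answer for "missing" and "foreign" so session IDs cannot be probed.
    return { ok: false, status: 404, error: "session not found" };
  }
  return { ok: true, relatedSessionId: session.id };
}

const ownedSessions = new Map<string, OwnedSession>([
  ["a1", { id: "a1", namespace: "tenant-a" }],
  ["b1", { id: "b1", namespace: "tenant-b" }],
]);
const findOwned = (id: string): OwnedSession | undefined => ownedSessions.get(id);
```

The route would call this before recordConvoTurn and return early on a failed check; silently dropping the relation instead is the other option the comment mentions.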
    return {
      items,
      candidates: items.filter((item) => item.status === 'new'),
      surfaced: items.filter((item) => item.status === 'surfaced'),
Exclude Overseer turns from worker activity
When an Overseer conversation turn is saved with relatedSessionId, it creates a latest event for that session with sourceKind: 'overseer'. These health/state calculations fetch the latest event without filtering to worker or hub-observed activity, so asking the Overseer about a stale session makes lastActivityAt become the conversation timestamp and silenceMs near zero, masking stale workers in get_session_state, get_worker_health, and list_active_workers. Exclude convo_turn or overseer-sourced events from activity calculations.
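The filter this comment asks for could look like the sketch below. The event shape and the `silenceMsSketch` name are assumptions; only the `convo_turn` event type and the `overseer` source kind come from the PR.

```typescript
// Hypothetical minimal event shape for activity calculations.
interface ActivityEvent {
  eventType: string;
  sourceKind: "worker" | "hub" | "overseer";
  createdAt: number; // epoch milliseconds
}

// Overseer conversation turns describe the operator talking about a session,
// not the worker doing anything, so they do not count as activity.
function isWorkerActivity(e: ActivityEvent): boolean {
  return e.sourceKind !== "overseer" && e.eventType !== "convo_turn";
}

// Returns null when the session has no qualifying activity at all.
function silenceMsSketch(sessionEvents: ActivityEvent[], now: number): number | null {
  const timestamps = sessionEvents.filter(isWorkerActivity).map((e) => e.createdAt);
  if (timestamps.length === 0) return null;
  return now - Math.max(...timestamps);
}

const staleSessionEvents: ActivityEvent[] = [
  { eventType: "tool_result", sourceKind: "worker", createdAt: 1_000 },
  // The operator asked the Overseer about this session ten minutes later.
  { eventType: "convo_turn", sourceKind: "overseer", createdAt: 600_000 },
];
```

Without the filter, the second event would reset the silence clock and the stale worker would look freshly active.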
    const lastToolCall = this.events.query({ sessionId, eventType: 'tool_call', limit: 1 })[0]
      ?? this.events.query({ sessionId, eventType: 'tool_result', limit: 1 })[0]
Compare tool call and result timestamps
When a session has any tool_call event, this expression never looks at tool_result, even if the result is newer than the call it completed. lastToolCallAgeMs can therefore report an old start time immediately after fresh tool activity, misleading the Overseer about recency during long-running tools or after a just-finished command. Query both event types together or compare the two timestamps before selecting one.
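A small sketch of the timestamp comparison, assuming each query still returns at most one event with a `createdAt` value in milliseconds. The helper names are hypothetical.

```typescript
// Hypothetical minimal shape of the two event kinds being compared.
interface ToolEvent {
  eventType: "tool_call" | "tool_result";
  createdAt: number; // epoch milliseconds
}

// Picks whichever of the two latest events is newer instead of
// preferring tool_call whenever one exists.
function latestToolEvent(
  lastCall: ToolEvent | undefined,
  lastResult: ToolEvent | undefined,
): ToolEvent | undefined {
  if (lastCall === undefined) return lastResult;
  if (lastResult === undefined) return lastCall;
  return lastResult.createdAt >= lastCall.createdAt ? lastResult : lastCall;
}

function lastToolCallAgeMsSketch(
  lastCall: ToolEvent | undefined,
  lastResult: ToolEvent | undefined,
  now: number,
): number | null {
  const latest = latestToolEvent(lastCall, lastResult);
  return latest === undefined ? null : now - latest.createdAt;
}
```

For a call started at t=1000 whose result arrived at t=9000, the age at t=10000 is measured from the result rather than from the call's start.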
    getSessionRecentOutput(sessionId: string, n = 10): OverseerRecentOutputChunk[] {
      const limit = Math.min(Math.max(n, 1), 50)
      const messages = this.messages.getMessages(sessionId, limit)
Fetch enough rows before filtering recent output
When the last n stored messages are non-text tool-call/tool-result records, getMessages(sessionId, limit) returns only those rows and the subsequent plain-text filter drops them, so get_session_recent_output returns fewer chunks or an empty list even though earlier recent transcript text exists. Fetch a larger window and then slice after filtering so the tool fulfills its “last N transcript chunks” contract.
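The fetch-then-slice order could be sketched as follows. The message shape, the assumption that getMessages returns the last `limit` rows oldest first, and the overfetch factor of 5 are all illustrative and not taken from the diff.

```typescript
// Hypothetical stored message shape; only text rows carry transcript text.
interface StoredMessage {
  kind: "text" | "tool_call" | "tool_result";
  text?: string;
}

// Assumed contract: returns the last `limit` messages, oldest first.
type GetMessages = (sessionId: string, limit: number) => StoredMessage[];

function recentTextChunks(
  getMessages: GetMessages,
  sessionId: string,
  n = 10,
  overfetch = 5, // assumed window multiplier
): string[] {
  const limit = Math.min(Math.max(n, 1), 50);
  // Fetch a wider window first so trailing tool records cannot crowd out text.
  const rows = getMessages(sessionId, limit * overfetch);
  const texts: string[] = [];
  for (const m of rows) {
    if (m.kind === "text" && typeof m.text === "string") texts.push(m.text);
  }
  // Slice after filtering so the caller gets up to `limit` text chunks.
  return texts.slice(-limit);
}

const storedMessages: StoredMessage[] = [
  { kind: "text", text: "plan" },
  { kind: "text", text: "step 1 done" },
  { kind: "tool_call" },
  { kind: "tool_result" },
  { kind: "tool_call" },
];
const getStored: GetMessages = (_sessionId, limit) => storedMessages.slice(-limit);
```

A fixed multiplier is still a heuristic: a long enough run of tool records can exhaust the window, so paging until enough text rows are found would be the more thorough variant.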
    if (record?.role === 'user' && typeof record.content === 'string') {
      return record.content
Parse structured user messages as transcript text
When operator messages use the normal web shape written by MessageService.sendMessage ({ role: 'user', content: { type: 'text', text } }), this branch only accepts string content and returns null, so get_session_recent_output drops the operator prompts from context. That leaves the Overseer seeing worker output without the instruction that produced it; extract content.text for text records before skipping the row.
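A sketch of an extractor that accepts both shapes named in this comment: plain string content and `{ type: 'text', text }`. The function names are hypothetical, and any other content shape is simply skipped.

```typescript
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Returns the operator's text for user records, or null for anything else.
function extractUserText(record: unknown): string | null {
  if (!isRecord(record) || record["role"] !== "user") return null;
  const content = record["content"];
  // Legacy shape: content is the text itself.
  if (typeof content === "string") return content;
  // Structured shape: { type: 'text', text: '...' }.
  if (isRecord(content) && content["type"] === "text" && typeof content["text"] === "string") {
    return content["text"];
  }
  return null;
}

const plainPrompt = extractUserText({ role: "user", content: "run the tests" });
const structuredPrompt = extractUserText({
  role: "user",
  content: { type: "text", text: "run the tests" },
});
```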
    if (reported === 'blocked' || reported === 'failed' || reported === 'complete') {
      return { state: reported, confidence: 0.9, note: `worker self-reported ${reported}` }
Let live pending requests override stale completion reports
When a session has a current permission request, deriveObservedWorkerState returns waiting_on_operator, but an older worker completed event still makes this branch return complete with 0.9 confidence. In that scenario get_worker_health tells the Overseer the worker is complete while there is an active operator decision pending; treat observed === 'waiting_on_operator' as a conflicting current signal before terminal self-reports.
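The reordering could be sketched as below. Only `blocked`, `failed`, `complete`, `waiting_on_operator`, and the 0.9 confidence appear in the PR; the remaining state names, the other confidence values, and the function name are placeholders.

```typescript
// State names beyond blocked/failed/complete/waiting_on_operator are placeholders.
type ReportedState = "working" | "blocked" | "failed" | "complete" | null;
type ObservedState = "active" | "idle" | "waiting_on_operator";

interface InferredState {
  state: string;
  confidence: number;
  note: string;
}

function inferWorkerStateSketch(reported: ReportedState, observed: ObservedState): InferredState {
  // A live pending request is a current signal, so it is checked before
  // any terminal self-report, which may be stale.
  if (observed === "waiting_on_operator") {
    const note =
      reported === null
        ? "pending operator request"
        : `pending operator request overrides self-reported ${reported}`;
    return { state: "waiting_on_operator", confidence: 0.8, note }; // 0.8 is a placeholder
  }
  if (reported === "blocked" || reported === "failed" || reported === "complete") {
    return { state: reported, confidence: 0.9, note: `worker self-reported ${reported}` };
  }
  // Placeholder fallback: trust what the hub observed, with low confidence.
  return { state: observed, confidence: 0.5, note: "hub-observed state" };
}
```

With this order, a session that reported `complete` earlier but now has an open permission request is described as waiting on the operator.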
    let body: unknown
    try {
      body = await c.req.json()
    } catch {
      body = {}
Reject malformed tool request bodies
When JSON parsing fails here, optional-argument tools run with {} instead of rejecting the request. A client that sends a malformed filtered query_events or query_inbox body receives the default unfiltered result set (up to 50 rows), which is surprising and can expose more operational context than intended. Return 400 on invalid JSON as the convo-turn route does.
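A framework-free sketch of the stricter parsing. `parseToolBody` and its result shape are hypothetical, and treating a completely empty body as "no arguments" is an assumption the route may not want; malformed JSON is what gets rejected.

```typescript
type ParsedToolBody =
  | { ok: true; body: unknown }
  | { ok: false; status: 400; error: string };

function parseToolBody(raw: string): ParsedToolBody {
  // Assumption: an empty body means the caller passed no arguments.
  if (raw.trim() === "") return { ok: true, body: {} };
  try {
    const body: unknown = JSON.parse(raw);
    return { ok: true, body };
  } catch {
    // Malformed JSON is rejected instead of silently becoming {}.
    return { ok: false, status: 400, error: "invalid JSON body" };
  }
}

// A truncated filter no longer degrades into an unfiltered query.
const truncatedBody = parseToolBody('{"severity":');
```

In the route, a failed parse would map to a 400 response before any tool runs, and a successful one would continue into the existing zod argument validation.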
Step 3 of the Overseer build sequence: read-only entity + tools + voice route

Stacked on fix/overseer-inbox-stale-noise (events #22 + inbox #23 + stale-noise fix). PR diff is exactly the 12 files of this step.

What this adds

- query_events - events stream by session/project/type/severity/time/status/attention_candidate
- query_inbox - candidates + surfaced + held
- get_session_state - hub-observed state + last activity + tool-call recency + worker_reported_state
- get_session_recent_output - last N transcript chunks
- get_worker_health - combined reported/observed/inferred (contracts §2)
- explain_priority - provenance trail, reuses the stored reason_for_priority (no reverse-engineering)
- list_active_workers - roster by project/state/age
- convo_turn writeback: voice/text conversation turns recorded to the events stream for provenance.
- Voice route (GET /api/overseer/voice): dedicated Overseer surface that consumes the existing stt/tts voice substrate (feat/overseer-stt-tts-endpoints, feat/overseer-voice-persistence) - does not reimplement it. Chrome-button relocation is Step 5 (out of scope).

Store changes (additive only)

- queryEvents adds project/sourceKind/severity/time filters; existing query shapes unchanged.
- Inbox list gains statuses[] + category filters.

Protocol layer

shared/src/overseerEntity.ts: identity, tool catalog with zod arg schemas, system prompt, worker-state derivation (mapNotifyStatusToWorkerState / mapEventTypeToWorkerState / deriveObservedWorkerState / inferWorkerState), convo_turn builder.

Tests

24 new tests pass (protocol unit + OverseerEntity tools + routes). No regressions in existing substrate tests (systemEvents, inboxItems, store).

Known: pre-existing base breakage (not from this PR)

The souped base carries unrelated breakage from other peers' incomplete merges - hub/src/fcm/fcmNotificationChannel.ts imports a missing ../notifications/modelErrorCopy, plus a web/src/hooks/useSSE.ts type error. These are the only bun typecheck / bun run test failures and are outside this PR's surface. My 12 files are type-clean and fully tested.

Gating note

Persona + voice-answer-quality tuning is intentionally not in this PR - it gates on the replay harness (now landed on feat/overseer-replay-harness). This PR is the independent plumbing.

Made with Cursor