Release v1.0.0 — Live Sessions, Sharing & Skill Packs by DIodide · Pull Request #154 · DIodide/Harness

DIodide · 2026-06-23T05:38:45Z

Promotes staging → main as the v1.0.0 major release. Covers the entire unreleased span since the last tagged release (v0.2.1 → v1.0.0, PRs #81–#153). Full notes in CHANGELOG.md (entry added in this PR).

⚠️ Do not merge yet — opened for review (Ibraheem will run the code-review skill first). See the pre-merge devops checklist below.

Release highlights

Live session following — in-flight agent output fans out to every viewer of a conversation (other tabs, /workspaces, shared read-only pages) via a Redis Streams bus; interactive prompts, owner-only infra signals, and cost are never relayed to followers.
Rewind & fork — rewind / rewind-and-fork under any user message, mid-message rewind at step "seams" (toggle in Settings → Display), compaction summaries + clone-from-summary.
Chat & harness sharing + collaboration — shareable links (read-only view + fork), viewer/editor roles, real-time editor collaboration that runs on the owner's harness/credentials, a shared-page side panel, harness share-by-link/email + Clone, and a Manage Sharing page.
Skill Packs — reusable skill bundles with optional AGENTS.md/CLAUDE.md sandbox context, a full-page catalog/editor, and one-click owner/repo import (GSAP, Anthropic, Superpowers, Vercel templates).
Per-workspace agent sandboxes — a harness's ACP agent runs inside the workspace's own (reused) sandbox; default workspace per account; drag-and-drop reordering; bounded sandbox lifetime with self-heal.
Workspace credentials — named secrets (e.g. GITHUB_TOKEN) injected as env vars into a workspace's sandboxes; write-only, AES-256-GCM, Manage Credentials view.
Usage — per-credential agent usage by agent, authoritative cost + cache-token accounting, real Claude session/weekly rate-limit surfacing, honest budget labels.
Claude Code config — pre-session Mode/Model/Effort, harness-level defaults, opus[1m], Bypass Permissions mode, effort slider + background-agents panel, live workflow/subagent observability.
Reliability & integrity — send-path/stream-lifecycle hardening, faithful saved content, resilient auth guards (no more sign-in redirect loop), stale-chunk auto-recovery, list/fork index-backfill tolerance.

Plus a Security section: strict redacted projections for shared content, credential plaintext that never leaves the server, reserved-name protection, host-pinned skill-repo import.

Versioning

Cut as git tag v1.0.0 on the main merge commit after this PR is approved + merged (consistent with v0.2.0 / v0.2.1). Both package.json files already read 1.0.0; the tag is the release marker.
GitHub release notes = the new CHANGELOG.md entry.

Pre-merge devops checklist (NOT in the changelog — infra only)

These set up the Redis Streams backend for live-following on prod. The bus is fail-soft: with REDIS_URL unset, live-following silently no-ops and single-instance streaming works exactly as today — so the release is safe to merge before this is done; Redis only gates whether live-following actually fans out on prod.

Provision a Redis reachable from the prod FastAPI host (ElastiCache or a Redis service on the prod instance); ensure the security group allows the FastAPI host.
Set REDIS_URL in the prod FastAPI env (/opt/harness-api/.env) and restart harness-api.
Verify /api/chat/follow fan-out works on prod (owner tab + second viewer).
Any other prod env parity vs. staging (sweep before cutting the tag).

Scope notes

Changelog is user-facing only by request — Redis/AWS prod setup, CI, deploy plumbing, internal refactors, and test-only changes are intentionally excluded (verified by an adversarial coverage/leakage pass).

In-flight tokens previously rendered only in the tab that started the turn (local React state). Now every chat/agent turn ALSO tees its display events into a per-conversation Redis Stream, and any passive viewer (the owner's other tabs, a sharee watching, a late joiner) opens GET /api/chat/follow to replay the current turn and tail it live, rendering through the same ChatMessages props.
- stream_bus.py: XADD display events (token/thinking/tool_call/tool_result/done/error/plan/status...) with MAXLEN trim; interactive events (permission/question) are NOT relayed. follow() replays from the live turn marker then BLOCK-tails. FAIL-SOFT: when REDIS_URL is unset every function is a no-op and turns stream only to the initiator exactly as before (no regression, safe deploy).
- chat.py/agents.py: wrap the SSE generators with stream_bus.tee; new GET /api/chat/follow authorized like a shared read (owner JWT OR editor/viewer grant token, incl. anonymous-with-token via optional auth).
- web: useFollowStream hook (isolated reducer faithful to the provider) wired into chat/index (owner multi-tab + owner watching a sharee) and the share page (editor + read-only viewer). Initiator keeps its token-perfect local stream; never both.
Solves owner<->sharee AND multi-tab same-account. Shared Redis also makes fan-out work across multiple FastAPI workers/boxes.
FastAPI 236, web 181; types/biome clean.

MUST-FIX:
- M2 (crash): hoist useFollowStream above chat/index's early return. A hook after an early return crashed /chat on every load (rules-of-hooks).
- M1/S1/S2 (secret leak): drop mcp_error + sandbox_status + agent_usage from FOLLOW_EVENTS (owner MCP url / sandbox id / agent cost) and strip usage/cost from the relayed 'done' frame, so followers see only the transcript they can already see persisted.
- M3 (handoff): prefer follow else local (covers the post-done window so the finished bubble never flickers); onStreamSynced now clears local state AND the follow bubble + drains the queue (was orphaning local state + stalling queued messages every turn).
- M4 (liveness): time-box each bus op (1s) with a per-turn breaker so a hung Redis can never stall the initiator's turn.
SHOULD-FIX:
- S3: /follow now 403s a fully anonymous caller with no token (don't reach the Convex dev fail-open).
- S4: follow() leads every replay with a synthetic turn_start so a reconnect after MAXLEN trim can't double-append.
- S5: max_connections=256 for concurrent follower XREADs.
- S6: 6h stream TTL so a long turn can't self-expire its key.
FastAPI 238, web 181; types/biome clean.
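
The secret-leak fix (M1/S1/S2) amounts to an allowlist plus a scrub of the final frame. The following is a minimal sketch of that rule, assuming frames are plain dicts with a `type` key. The event names come from the commit text, while `relay_frame`, `DONE_PRIVATE_KEYS`, and the exact contents of `FOLLOW_EVENTS` are illustrative assumptions rather than the actual stream_bus.py code.

```python
from typing import Any, Optional

# Display events a follower may receive. Names are taken from the commit text;
# the real allowlist in stream_bus.py may differ.
FOLLOW_EVENTS = frozenset(
    {"token", "thinking", "tool_call", "tool_result", "done", "error", "plan", "status"}
)

# Keys removed from the relayed 'done' frame (assumed field names).
DONE_PRIVATE_KEYS = ("usage", "cost")


def relay_frame(frame: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the frame a follower may see, or None if it must not be relayed."""
    if frame.get("type") not in FOLLOW_EVENTS:
        # Covers permission/question prompts and owner-only signals such as
        # mcp_error, sandbox_status and agent_usage.
        return None
    if frame["type"] == "done":
        # Build a copy so the initiator's own frame keeps usage/cost.
        return {k: v for k, v in frame.items() if k not in DONE_PRIVATE_KEYS}
    return dict(frame)
```

Filtering by allowlist rather than denylist means a newly added owner-only event stays private unless someone deliberately adds it to `FOLLOW_EVENTS`.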

feat(stream): live token fan-out to all viewers via Redis Streams

Two operations under every user message, on both the normal Harness (OpenRouter) and Claude Code (ACP), via one thin generic seam:
- Rewind (in place): truncate the thread to that user message (keep it, delete everything below). Destructive, so it confirms inline on first click; does NOT auto-stream (re-sending is the user's choice).
- Rewind & fork: branch a new conversation at that point, original intact.
~80% reuse:
- messages.removeAfter: new EXCLUSIVE truncation (keeps the target) vs the existing inclusive removeFrom used by regenerate. Same owner/editor-token auth. + tests.
- Rewind-and-fork reuses conversations.fork (already copies [0..target]).
- Claude Code session rewind reuses the existing forget-and-recreate path: resetAgentSessionForRewind() forgets the cached session so the next prompt opens a fresh one that re-seeds from the truncated history. No new gateway endpoint. No-op for the stateless OpenRouter loop. This one seam is where future per-agent rewind handling (Codex/Cursor) lands.
- MessageActions/ChatMessages gain onRewind/onRewindFork (every user message, owner-only for v1); both routes wire them.
v1 scope notes: owner-only (collaborator rewind via token = follow-up); rewind is conversational-only (the Daytona sandbox filesystem is not rewound).

Adversarial review (1 critical, 2 major, 1 minor), all fixed:
- CRITICAL: client-only forgetAgentSession did NOT rewind Claude Code. The gateway dedups sessions by (user, conversation, agent) and returns the warm session (non-empty transcript → seed-from-history skipped; ACP session in the sandbox still holds the rewound turns), so the agent kept acting on deleted turns. Fixed with a real server reset: AgentSessionManager.reset_conversation + POST /sessions/by-conversation/{id}/reset tears the session down (parking the runtime warm) so the next prompt opens a fresh session that re-seeds from the truncated history.
- MAJOR + MINOR (agent-key): the reset is keyed by CONVERSATION, not agent, so a session-only agent override can't be missed; the client also forgets ALL cached sessions for the conversation (forgetAllAgentSessions).
- MAJOR: removeAfter now deletes by POSITION in the canonical by_conversation order (findIndex + slice) instead of a `>` _creationTime compare, so same-millisecond siblings can't leak.
resetAgentSessionForRewind is now async (server call + cache clear); both routes await it with the Clerk token.
Tests: reset_conversation (gateway), position-based removeAfter (convex).
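
The position-based fix for removeAfter can be illustrated in a few lines. This is a sketch in Python rather than the Convex mutation itself; `remove_after` and the `id`/`created` keys are hypothetical stand-ins, and the input list is assumed to already be in the canonical conversation order.

```python
from typing import Any


def remove_after(
    messages: list[dict[str, Any]], target_id: str
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Exclusive truncation: keep everything up to and including the target.

    Returns (kept, removed). Cutting by list position means messages that
    share the target's creation timestamp are still removed.
    """
    index = next((i for i, m in enumerate(messages) if m["id"] == target_id), -1)
    if index == -1:
        raise ValueError(f"message {target_id!r} is not in this conversation")
    return messages[: index + 1], messages[index + 1 :]


# Three messages written in the same millisecond, then one later.
thread = [
    {"id": "u1", "created": 100},
    {"id": "a1", "created": 100},
    {"id": "u2", "created": 100},
    {"id": "a2", "created": 101},
]
kept, removed = remove_after(thread, "u1")
```

A timestamp comparison (`created > 100`) would have removed only `a2` here, leaving `a1` and `u2` behind, which is the leak the review describes.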

…ed into CI
Real-Redis integration tests (skip cleanly when no Redis; CI provides one):
- fan-out to two followers (byte-identical), live tail, late-joiner replay, cross-conversation isolation.
- the FOLLOW_EVENTS allowlist (owner mcp/sandbox/cost + interactive events never reach a follower) and done.usage stripping.
- tee() fail-soft: a RAISING bus still delivers the complete turn to the initiator; a HUNG bus is bounded by the 1s breaker (no per-event stall).
- MAXLEN-trim replay still leads with the synthetic turn_start reset.
- /api/chat/follow auth: anonymous-no-token + invalid-bearer → 403; an authorized viewer gets 200 text/event-stream with frames relayed.
scripts/stream_smoke.py: a runnable fan-out demo (two watchers + a late joiner) that prints tokens streaming down and asserts everyone reconstructs the message.
CI: the FastAPI job now runs a redis:6.2-alpine service (matches prod redis6.2) with REDIS_URL_TEST so the integration tests run there.
Addresses the test-quality review (circuit-breaker + SSE-relay coverage gaps, hardened the one timing-flaky test to an Event gate).
FastAPI 252, 5x stable.
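
The fail-soft behaviour these tests exercise (a raising bus and a hung bus both leave the initiator's turn intact) might look roughly like the sketch below. It assumes the bus exposes an async `publish` callable and that the per-turn breaker trips on the first failure; the signature and the trip policy are assumptions for illustration, not the real stream_bus.tee.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

BUS_OP_TIMEOUT_S = 1.0  # the 1s per-op time box mentioned in the commit text


async def tee(
    events: AsyncIterator[dict[str, Any]],
    publish: Callable[[dict[str, Any]], Awaitable[None]],
    timeout_s: float = BUS_OP_TIMEOUT_S,
) -> AsyncIterator[dict[str, Any]]:
    """Yield every event to the initiator and mirror it to the bus best-effort."""
    breaker_open = False  # per turn: once tripped, the bus is skipped until the turn ends
    async for event in events:
        if not breaker_open:
            try:
                await asyncio.wait_for(publish(event), timeout=timeout_s)
            except Exception:
                # A raising bus or a timed-out (hung) bus trips the breaker,
                # so at most one event pays the timeout.
                breaker_open = True
        yield event
```

Because the `yield` sits outside the `try`, a bus failure can delay one event by at most `timeout_s` but can never drop it.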

test(stream): Redis fan-out integration tests + smoke client + CI

The workspaces-first chat view (the primary UX) renders its own ChatMessages but never subscribed to the fan-out, so a passive tab there showed only persisted messages, not the in-flight stream. Mirror the /chat wiring: hoist useFollowStream above the early return, prefer the follow feed (else local) for the active conversation's stream state, and clear both on sync. All three chat-rendering routes (chat, workspaces, share) now follow.

fix(stream): wire live-follow into the /workspaces route

…sponse (re-review)
Re-review (0 critical, 1 major, 2 minor fixed; 2 minor deferred):
- MAJOR: reset_conversation could tear down a session mid-turn. Now only resets IDLE sessions (status=="ready", turn_guard==0, lock free), the same guard the reaper/_claim_parked use. The client already blocks rewind while streaming; this is the server-side backstop.
- MINOR: resetServerAgentSessions now checks response.ok (api() resolves on 4xx/5xx) and retries once, returning success. A swallowed reset error would otherwise leave the stale session reusable with the rewound turns.
Deferred (documented, narrow): a sub-second send-during-rewind TOCTOU race (needs composer-send gating); the replay preamble's harness-switch wording shown for a rewind (cosmetic).
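
The idle guard from the MAJOR fix is small enough to sketch. The three conditions come from the commit text; the `AgentSession` dataclass, its field names, and the list-based `reset_conversation` below are hypothetical simplifications of whatever AgentSessionManager actually holds.

```python
from dataclasses import dataclass


@dataclass
class AgentSession:
    conversation_id: str
    status: str
    turn_guard: int = 0      # number of in-flight turns (assumed meaning)
    lock_held: bool = False  # stands in for "lock free" in the commit text


def is_idle(session: AgentSession) -> bool:
    """Only a ready session with no turn in flight and no lock held may be reset."""
    return session.status == "ready" and session.turn_guard == 0 and not session.lock_held


def reset_conversation(sessions: list[AgentSession], conversation_id: str) -> list[AgentSession]:
    """Return the sessions that survive a reset of one conversation.

    Busy sessions for the conversation are left alone instead of being torn
    down mid-turn.
    """
    return [
        s for s in sessions
        if not (s.conversation_id == conversation_id and is_idle(s))
    ]
```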

Rewind (in-place truncate) + rewind-and-fork (branch) for normal Harness + Claude Code, via a server-side conversation reset. Adversarially reviewed twice; all critical/major findings fixed.

…es parity The rewind + fork-at-message handlers (handleRewind, fork-at-message, the removeAfter/fork mutation bindings, the in-flight guard, the agent-session reset) were duplicated verbatim in both routes. Extract them into hooks/use-rewind.ts so the two routes share one implementation and stay at parity. The hook reads the stream context + Clerk token itself; callers pass only the active conversation id + a navigate callback. forkAtMessage now backs both the assistant "Fork" and the user "Rewind & fork" (one handler). No behavior change.

…ting (#118)
Owner-redirect, an undeletable Default workspace, and a fork workspace picker, so shared chats open in the right place under workspaces mode.
A. Owner opening their OWN share link now respects workspacesMode: it lands in /workspaces?workspaceId&convoId (the conversation's own workspace) instead of the deprecated /chat?convoId. Legacy workspace-less conversations are adopted into the Default workspace (conversations.ensureInWorkspace, which also re-stamps the messages' workspaceId so workspace-scoped content search finds them). Falls back to /chat for users whose workspacesMode !== "workspaces".
B. Every account gets an undeletable "Default" workspace (isDefault flag). getOrCreateDefaultWorkspace returns the flagged default, else adopts an existing "Default"-named workspace, else creates a fresh one. It never force-flags an arbitrarily-named workspace, so existing custom workspaces stay deletable. workspaces.remove rejects the Default; the management UI hides its delete button. Its harness/sandbox stay editable.
C. Forking a shared chat opens a workspace picker; the sharee chooses which of THEIR workspaces to fork into (default = their Default). forkSharedConversation validates the chosen workspace belongs to the actor and stamps the fork + copied messages with it (owner usage/cost stripped). Landing is mode-aware (/workspaces vs /chat).
The /workspaces route gained a convoId deep-link param. The deep-linked conversation is now mirrored into the URL (so refresh/bookmark reopens it) and is no longer wiped by the workspace-init effect while the workspaces list is still loading.
Includes Convex tests for the Default lifecycle, message re-stamping, and fork workspace placement. Addresses findings from an adversarial review of the diff.
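
The three-step resolution in part B can be sketched as follows. This is an in-memory Python illustration of the described order (flagged default, then a workspace named "Default", then a fresh one); the `Workspace` dataclass and both functions are hypothetical and do not mirror the Convex schema.

```python
from dataclasses import dataclass


@dataclass
class Workspace:
    name: str
    is_default: bool = False


def get_or_create_default_workspace(workspaces: list[Workspace]) -> Workspace:
    # 1. An already-flagged default always wins.
    for ws in workspaces:
        if ws.is_default:
            return ws
    # 2. Otherwise adopt an existing workspace that is literally named "Default".
    for ws in workspaces:
        if ws.name == "Default":
            ws.is_default = True
            return ws
    # 3. Otherwise create one. Arbitrarily named workspaces are never flagged.
    fresh = Workspace(name="Default", is_default=True)
    workspaces.append(fresh)
    return fresh


def remove_workspace(workspaces: list[Workspace], ws: Workspace) -> None:
    """Reject deletion of the Default workspace; anything else is removable."""
    if ws.is_default:
        raise PermissionError("the Default workspace cannot be deleted")
    workspaces.remove(ws)
```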

Extract shared useRewind hook so /chat and /workspaces stay at rewind/fork parity. No behavior change; 185 tests pass.

… + /workspaces
Rewind / rewind-&-fork into the MIDDLE of an assistant message at part boundaries (text / reasoning / tool_call), not just whole messages.
Backend:
- messages.truncatePart: keep the first N flat parts of an assistant message, recompute `content` from kept text parts (mirrors the gateway's "".join(text_parts)), clear legacy reasoning/toolCalls, delete every later message, with patch + delete in one transaction.
- conversations.fork gains truncateLastPartCount: copy the boundary assistant message TRUNCATED. Non-destructive (original untouched, no live session to reset), which makes it the safe primary for mid-message.
- Shared contentFromParts helper so the two paths can't drift.
Frontend:
- message-seams.ts (pure, unit-tested): seam geometry over the FLAT parts array, mirroring organizeParts' top-level numbering so a kept tool call keeps its whole subagent subtree (no orphans). summarizeDropped flags whether the agent's context actually changes (text dropped) vs view-only (only trailing reasoning/tool_call dropped).
- AssistantParts + Seam: hover-revealed seams in the gaps between rendered blocks; hovering dims the blocks below (preview); clicking opens an inline confirm with Rewind & fork (primary) / Rewind (destructive) / Cancel and honest consequence copy.
- useRewind gains handleRewindToPart (in-place, resets agent) and forkAtPart; wired through ChatMessages into both /chat and /workspaces.
Tests: 7 convex (truncatePart + fork-truncate), 9 frontend (seam geometry).
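
A rough Python rendering of the truncatePart idea: keep the first N flat parts and recompute `content` from the kept text parts. The message shape (`parts` entries with `type`/`text` keys), the valid range for `keep`, and clearing the legacy fields by setting them to None are all assumptions made for this sketch, and deleting later messages is left out.

```python
from typing import Any


def truncate_part(message: dict[str, Any], keep: int) -> dict[str, Any]:
    """Return a copy of an assistant message cut down to its first `keep` parts."""
    parts = message["parts"]
    # Assumed range: a cut must keep at least one part and drop at least one.
    if not 0 < keep < len(parts):
        raise ValueError(f"keep={keep} is out of range for {len(parts)} parts")
    kept = parts[:keep]
    # Same join the commit describes: text parts only, no separator.
    content = "".join(p.get("text", "") for p in kept if p.get("type") == "text")
    return {**message, "parts": kept, "content": content, "reasoning": None, "toolCalls": None}


message = {
    "id": "a1",
    "content": "HelloWorld",
    "parts": [
        {"type": "text", "text": "Hello"},
        {"type": "tool_call", "name": "ls"},
        {"type": "text", "text": "World"},
    ],
}
truncated = truncate_part(message, keep=2)
```

Returning a copy leaves the original message untouched, which matches the non-destructive fork path; an in-place rewind would persist the returned value instead.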

Adversarial review found 1 critical + 3 major + 4 minor. Fixes: CRITICAL/MAJOR — degenerate seam (keep === parts.length, e.g. an interleaved background subagent whose child is the last flat part) was rendered but threw in-place (truncatePart out-of-range) and silently no-op'd on fork. Now gate seams on hasRenderableAfter(parts, keep) — only show a seam that actually drops a visible block; never the last block, a no-op cut, or an empty-parts-only cut. Also wrap handleRewindToPart/forkAtPart in try/catch with an error toast. MAJOR — "agent's context unchanged" copy could lie. (a) Seams now render ONLY on the last message in the thread, so truncation never silently deletes later turns. (b) contentChanges is now computed by COMPARISON (recomputed content !== stored content) instead of "did a text part drop", so it is honest even on the default OpenRouter path, which stores only the last agentic iteration's text while parts[] holds one text part per iteration (chat.py). Added a frontend contentFromParts mirror + a convex test for the divergent-content case. MINOR — describeDropped pluralizes on != 1 ("0 blocks"); empty/non-rendering trailing parts no longer offer a seam (hasRenderableAfter); inline confirm now moves focus to its primary button on open and dismisses on Escape; open-confirm identity + dim preview lifted into AssistantParts (one open at a time, dim pins to the open seam). Doc comment in messageParts.ts clarifies the OpenRouter divergence. NOTE (follow-up, not in this PR): chat.py persists last-iteration-only content for multi-iteration OpenRouter turns — a pre-existing fidelity gap independent of rewind. Recommend making contentFromParts the single source of truth there. Tests: convex 40 (+1), frontend 200 (+6 seam). tsc baseline, biome clean.

…change Verification re-review found a regression from lifting open/hover state into AssistantParts: clicking a seam unmounts the trigger button while the cursor is over it, so its onMouseLeave never fires and hoverIdx stays pinned — after onClose (which only cleared openIdx) the dim fell back to hoverIdx and stuck after every Cancel/Fork/Rewind. - Clear hoverIdx on both onOpen and onClose so the dim never sticks. - Clamp activeKeep against topCount and reset openIdx/hoverIdx when parts[] change in place (background subagent append), so block-index state can't outlive the geometry it indexes.

Root cause of the deep-link bug fixed earlier: the /workspaces route had no single owner for selection state. activeConvoId/activeWorkspaceId were two bare useState cells written by ~6 independent effects + 2 referee refs, with the precedence rule (URL deep-link > explicit selection > most-recent restore) living only in prose comments and emergent effect ordering. That tangle is how a load-order race shipped (a URL-seeded conversation got nulled while workspaces.list was still loading). It was a regression-via-reimplementation — the deprecated /chat route already had the safe pattern, but /workspaces reimplemented the contract divergently. Extract the tangle into cohesive, unit-tested units: - hooks/use-workspace-selection.ts — owns activeWorkspaceId/activeConvoId, workspace resolution, and selectWorkspace, with the precedence made explicit and documented. (8 tests, incl. the original-bug regression: a URL-seeded conversation survives workspaces.list loading.) - hooks/use-recent-chat-restore.ts — owns the restore arm/apply handshake. Adds cancelRestore() so an explicit "New chat" cancels an armed-but-unapplied restore (fixes a real latent bug: a just-dismissed chat could silently reopen), plus a defensive workspace-match guard. (6 tests.) - lib/navigate-to-conversation.ts — openConversation() centralizes the mode-aware (/workspaces vs /chat) routing so convoId is ALWAYS carried; this rule was copy-pasted and a copy that dropped convoId was the second half of the original bug. (5 tests.) workspaces/index.tsx shrinks ~110 lines; share/$token.tsx uses openConversation at the owner-redirect and fork-confirm sites. Behavior-preserving otherwise (verified by an adversarial diff review: 0 real regressions; every confirmed finding was a positive equivalence check or a deliberate fix). 
Follow-ups identified by the review, deferred to keep this PR scoped to the selection root cause: extract useMcpHealthCheck, useMessageQueue, and a unified fork flow (each duplicated near-verbatim in the deprecated /chat route).

Rewind / rewind-&-fork into the middle of an assistant message at part boundaries. Two adversarial review rounds; CI green.

…nRouter save paths
The default OpenRouter path stored a non-faithful assistant `content`: only the LAST agentic iteration's text on a normal finish, and "" at max-iterations, while parts[] held one text part per iteration. This broke the invariant the mid-message-rewind feature relies on (content == contentFromParts(parts)) and meant prior assistant turns were represented to the model, and indexed for search, by only their last paragraph.
- Add content_from_parts(parts) helper mirroring the TS contentFromParts (convex/messageParts.ts) and the ACP gateway "".join(text_parts) (session_manager.py): text parts only, joined with no separator.
- Normal save persists content_from_parts(all_parts) (the full multi-iteration join) instead of last-iteration collected_content.
- _save_interrupted is now self-reconciling: it appends the in-flight text as a trailing part IFF it isn't already the last text part, then derives BOTH the persisted content and parts from that reconciled list, so content == contentFromParts(parts) holds with no text lost or duplicated at every interrupted site (mid-stream exception, truncation abort, max-iterations).
- The done event now sends the SAME faithful_content. Required, not cosmetic: the frontend's convexHasMessage handshake clears the streaming bubble only when lastMsg.content === pendingDoneContent (chat-messages.tsx:743-746); if the persisted join and done.content diverged, a multi-iteration bubble would never clear.
Unchanged (correct as-is): streaming delta events and the intra-turn messages.append that build this turn's running OpenRouter message list.
Implemented + adversarially verified via workflow (8 lenses: multi-iteration, single/tool-only, mid-stream in-progress-text, truncation/max-iter, the content==parts invariant, done-event/client, consumers, tests/lockfile).
Follow-up (not in this PR): conversations/messages.saveInterruptedMessage trusts client-supplied content without recomputing from parts — the one remaining path not covered by the invariant. Tests: +11 (content_from_parts ×7, _save_interrupted reconcile ×4). Full fastapi suite 263 passed / 11 skipped; ruff clean; uv.lock untouched.
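
The invariant and the reconcile step can be sketched as below. `content_from_parts` follows the rule stated in the commit (text parts only, joined with no separator); the part shape and the `reconcile_interrupted` helper are assumptions for illustration, not the body of `_save_interrupted`.

```python
from typing import Any


def content_from_parts(parts: list[dict[str, Any]]) -> str:
    """Join the text of text parts only, with no separator."""
    return "".join(p.get("text", "") for p in parts if p.get("type") == "text")


def reconcile_interrupted(
    parts: list[dict[str, Any]], in_flight_text: str
) -> tuple[str, list[dict[str, Any]]]:
    """Append in-flight text unless it is already the last text part.

    Content and parts are both derived from the same reconciled list, so
    content == content_from_parts(parts) holds by construction.
    """
    reconciled = list(parts)
    text_parts = [p for p in reconciled if p.get("type") == "text"]
    last_text = text_parts[-1].get("text") if text_parts else None
    if in_flight_text and in_flight_text != last_text:
        reconciled.append({"type": "text", "text": in_flight_text})
    return content_from_parts(reconciled), reconciled


saved_parts = [
    {"type": "text", "text": "A"},
    {"type": "tool_call", "name": "search"},
    {"type": "text", "text": "B"},
]
content, parts = reconcile_interrupted(saved_parts, "C")
```

Here `content` is `"ABC"` and a fourth text part is appended; passing `"B"` instead would leave the list unchanged, since that text is already the last text part.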

content_from_parts as single source of truth in chat.py; self-reconciling _save_interrupted; done-event handshake preserved. CI green.

…rruptedMessage The last persistence path that trusted client-supplied content. When parts are present, recompute content from them (contentFromParts) instead of storing the raw arg — so the invariant the mid-message-rewind feature relies on holds even if a caller sends mismatched content. Falls back to the raw content only when no parts were captured. Safe w.r.t. the convexHasMessage handshake (lastMsg.content === pendingDoneContent, chat-messages.tsx:743): the streaming client keeps state.content in lockstep with its text parts (onToken appends to both; onThinking/onToolCall never touch content), so contentFromParts(parts) equals the content the client sent and the bubble still clears. Tests: +3 (recompute ignores divergent client content; fallback with no parts; auth rejection). convex messages suite 24 passed.

…rruptedMessage Closes the last persistence path not covered by the content/parts invariant. Safe vs the streaming handshake. CI green.

…om-summary Gives developers observability into context compaction and agency over how to continue: see each compaction summary inline, and on a compacted conversation choose "continue full chat" (today's behavior) or "new session from summary" (a fresh clone seeded with the summary instead of the bloated transcript). Capture (verified live on claude-agent-acp@0.44.0 via a standalone ACP probe): - emitRawSDKMessages already opted-in; add `compact_boundary` + `user` filters. compact_boundary carries metadata (trigger, pre/post tokens); the summary prose arrives as a synthetic user message (string content) that the adapter drops from session/update but forwards raw — detected by Claude Code's canonical "This session is being continued…" preamble. - parse_sdk_compaction + _handle_sdk_compaction merge boundary+summary into one `compaction` SSE event, gated on a real boundary to avoid false positives. - Persisted mid-turn (survives SSE disconnect) via save_compaction → compactions:record (internal mutation; owner derived server-side). Data model: new append-only `compactions` table (kept out of messages.parts so forks never copy it) + `seededFromCompactionId` on conversations. Clone: compactions.cloneFromCompaction forks a conversation carrying the harness/workspace forward (reusing fork lineage) and seeds one summary message; _build_replay_preamble detects the single summary seed and replays it in full with summary framing (not the 4000-char harness-switch truncation). UI: CompactionPanel (query-driven, no schema part change) renders each compaction as a collapsible card + the continue-vs-clone banner, in both the chat and workspaces routes. Tests: parse_sdk_compaction + replay-preamble branch (11 new). All suites green: pytest 263, convex 161, frontend 218.

Self-review (the workflow review was down on API 529s): - CompactionPanel is keyed by conversationId so its local 'dismissed' state resets when switching conversations (ChatMessages isn't remounted per convo). - Clear session.pending_compaction at turn start so a boundary with no following summary can't be paired with a later turn's user message that merely echoes the compaction preamble.

Capture Claude Code compaction summaries over ACP (verified live), persist to a new compactions table, surface them inline, and offer continue-full-chat vs new-session-from-summary. Required CI green; adversarial self-review done (workflow review blocked by API 529 overload).

Adds a `rewindSeams` user setting (default on) controlling whether the mid-message rewind seams render. Gated in chat-messages.tsx (`seamsEnabled` prop) and wired from userSettings in both /chat and /workspaces. Toggle lives in the settings dialog under Display. Gates ONLY the seams — whole-message rewind/fork are unaffected. - schema: userSettings.rewindSeams (optional bool; absent = on) - userSettings get/update + DEFAULTS - settings-dialog: "Mid-message rewind" checkbox - chat-messages: seamsEnabled gate; both routes pass userSettings.rewindSeams Tests: userSettings round-trip + updated default-shape assertions (10 pass). Frontend 218 pass; tsc baseline; biome clean.

…indings) Exhaustive audit of seams across 87 cases found 3 issues (all in the rewind action handlers); the geometry, gating, and the new setting verified solid. - MAJOR: destructive in-place Rewind silently swallowed agent-session-reset failure. resetServerAgentSessions returns false (never throws) on a genuine network/5xx failure, so the try/catch couldn't see it — Convex truncated but the warm ACP session kept the rewound turns (silent view↔agent desync). resetAgentSessionForRewind now returns that boolean; both in-place paths warn the user (and suggest fork) when the reset fails. No false alarm for the stateless OpenRouter case (200, 0 sessions → ok). - MINOR: forkAtPart had no isBusy() guard — in the post-stream pendingDone window the seam targets the prior turn, so a fork could silently omit the just-finished turn. Added the guard (parity with handleRewindToPart). - MINOR: in-place Rewind silently no-op'd when busy. Both paths now toast "Can't rewind/fork while the turn is finishing." instead of returning silently. Frontend 218 pass; tsc baseline; biome clean.

…upted-turn hooks (#124) Three blocks were copy-pasted (near-)byte-identically in both the /chat and /workspaces routes. Extract them into shared, testable hooks so the two routes stop drifting and ~270 lines of duplication per route are removed: - hooks/use-mcp-health-check.ts — was byte-identical in both routes. Owns the mcpHealthStatuses state + runHealthCheck + refreshHealth + the URL-keyed effect; reads useAuth internally; returns { mcpHealthStatuses, refreshHealth }. - hooks/use-persist-interrupted-turn.ts — the persistInterruptedTurn body. Reads the chat-stream context; takes a getFallbackModel callback so it stays harness-agnostic (model fallback = state.model ?? getFallbackModel()). - hooks/use-message-queue.ts — the send-while-streaming queue (enqueue/dequeue, send-now interrupt+flush, drain-after-turn, post-sync processing, post-stream drain effect). The route keeps the route-specific sendQueuedMessage (passed in) and a residual effect that clears MCP-failure banners on convo switch (split out of the old combined clear effect). 7 unit tests cover the queue mechanics. Behavior-preserving: identical logic relocated; only the convo-switch clear was split (queue vs MCP banners) and handleStreamSynced moved after the hook call so it can consume processQueuedAfterSync. Full frontend suite green (225); zero new type errors. Header/SandboxSelector dedup and fork unification stay deferred (real structural drift / different backends).

User setting to toggle seams (default on, gates only seams); 87-case audit + 3 action-handler fixes (agent-reset desync warning, busy-window guards). CI green.

Mirrors the chat-share architecture for harnesses, and adds a Manage Sharing page (reached from the bottom-right sidebar rail) for all of a user's shares. Share Harness: - harnessShareGrants table (public link | email invite | bound user), mirrors shareGrants 1:1 (auth ALWAYS via an active grant; secrets never denormalized). Lock is a single `sharedLocked` flag on the harness. shares.ts helpers (isActiveGrant generic, clamp*, token-min, avatar hosts) exported + reused. - harnessShares.ts: owner mgmt (ensure/rotate public link, invite-by-email, role, lock, revoke, unshare, listings); a chromeless public viewer query (getSharedHarness) behind a REDACTED projection (no authToken / mcp url / agentCredentialId / sandbox ids / ownerUserId — a test asserts the denylist); cloneSharedHarness (drops every secret); editSharedHarness (editor + unlocked only, non-secret fields); listIncomingSharedHarnesses for the recipient. - Email "bind later": invite stores granteeEmail (an invite POINTER, never an auth key); FastAPI POST /api/harness-shares/claim resolves the caller's Clerk-VERIFIED emails (server-side) and binds via bindHarnessGrantsInternal. - Frontend: /share-harness/$token chromeless viewer (redacted config + clone, signed-in/out), HarnessShareDialog (public link + email + lock + roles), a "Share" card action, a "Shared with you" section on /harnesses (clone + an editor-only edit dialog the lock gates), claim-on-mount. Manage Sharing: - /manage-sharing route + a "Sharing" MANAGE_TABS entry (auto-adds the header tab AND the bottom-right rail icon). shares.listMySharedConversations (new by_owner index, backfill-tolerant) lists shared chats with revoke / change-role / stop-sharing; shared harnesses listed too. Live-run on the owner's account is DEFERRED (the plan showed agent-mode is structurally incompatible with the current session-ownership model and default-loop live-run needs its own focused PR); clone fully covers using a shared harness today. 
Tests: +11 Convex (redaction denylist, owner gating, clone secret-drop, lock/role on editSharedHarness, email bind-later, listings), +3 FastAPI (claim). Convex 206, FastAPI 326, web 235; biome clean, tsc 21/21 baseline.
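
One way to picture the redacted projection and its denylist test is an allowlist projection checked against a set of forbidden keys. In this sketch `authToken`, `agentCredentialId`, and `ownerUserId` are named in the commit text, while `mcpUrl`, `sandboxId`, the public field list, and both function names are assumed for illustration.

```python
from typing import Any

# Fields a shared viewer must never receive. mcpUrl and sandboxId are assumed
# spellings for the "mcp url" and "sandbox ids" mentioned in the commit.
SECRET_FIELDS = frozenset(
    {"authToken", "mcpUrl", "agentCredentialId", "sandboxId", "ownerUserId"}
)

# Hypothetical public fields; the real projection may expose a different set.
PUBLIC_FIELDS = ("name", "model", "skills")


def project_shared_harness(harness: dict[str, Any]) -> dict[str, Any]:
    """Build the viewer payload from an explicit allowlist of fields."""
    return {key: harness[key] for key in PUBLIC_FIELDS if key in harness}


def assert_redacted(payload: dict[str, Any]) -> None:
    """Denylist check of the kind a test could run against the projection."""
    leaked = SECRET_FIELDS.intersection(payload)
    if leaked:
        raise AssertionError(f"secret fields leaked: {sorted(leaked)}")
```

Projecting by allowlist and testing by denylist gives two independent guards: a new secret field is excluded by default, and the test still fails loudly if someone adds a known secret to the public list.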

Restructures the README screenshots per review: a single characteristic shot of the chat app (an agent doing real work — MCP doc lookup, terminal build with an "exit 0", file edit, a subagent, and a Workflow card, in a colored workspace) sits at the top; everything else moves into a collapsed <details> gallery (subagent + workflow card, Context7 MCP connected, harnesses grouped by status, the harness editor, the share dialog). Drops the older chat-view.png. Captured loginless via dev-auth against a real deployment with seeded data.

No secret-leak or cross-tenant holes: the redaction + authz model held. Fixes:
- [MED] listIncomingSharedHarnesses dedup was role-blind: a user holding both a viewer and an editor grant on one harness could be shown the viewer card (Edit hidden) though resolveHarnessRole grants editor. Sort editor-first before dedup so the card reflects the strongest active grant.
- [MED] Owners couldn't change an email/bound recipient's role (only the public link had a toggle). Wire setHarnessShareRole on recipient rows in both the dialog and the Manage Sharing harness section. bindHarnessGrantsInternal now MERGES instead of duplicating when a user is invited twice (keeps the stronger role, one grant per (harness,user)).
- [LOW] sharedLocked persisted after unshareHarness → a later re-share silently started locked. Clear it on unshare.
- [LOW] Reconcile the clone-vs-public-view URL policy: clone keeps the MCP url (the recipient's own copy needs it); fix the viewer copy to claim only "Credentials stay private" (the anonymous view still withholds urls).
- [LOW] Clear the clone-resume intent before the attempt so a failed resumed clone can't silently re-fire on reload.
- [LOW] Remove a stale "don't invite yourself" comment (enforced at bind) and update "three manage screens" comments after the 4th (Sharing) tab.
+1 Convex test (viewer+editor → editor, merged to one grant). Convex 207, FastAPI 326, web 235; biome clean, tsc 21/21.
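
The editor-first dedup from the first [MED] fix comes down to sorting by role strength before keeping one grant per harness. A minimal sketch, assuming grants are dicts with `harness_id` and `role` keys (hypothetical names) and only the viewer and editor roles exist:

```python
from typing import Any

ROLE_RANK = {"viewer": 0, "editor": 1}


def strongest_grant_per_harness(grants: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep one grant per harness, preferring the strongest role."""
    best: dict[str, dict[str, Any]] = {}
    # Strongest first, so the first grant seen for a harness is the one to keep.
    for grant in sorted(grants, key=lambda g: ROLE_RANK[g["role"]], reverse=True):
        best.setdefault(grant["harness_id"], grant)
    return list(best.values())


incoming = [
    {"harness_id": "h1", "role": "viewer"},
    {"harness_id": "h1", "role": "editor"},
    {"harness_id": "h2", "role": "viewer"},
]
cards = strongest_grant_per_harness(incoming)
```

With the role-blind version, `h1` could surface as a viewer card depending on iteration order; after sorting it always surfaces as editor.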

The /harnesses email-claim relay keyed on a component-local useRef, so it re-fired a Clerk lookup + bindHarnessGrantsInternal on every navigation back to the page. Gate it on a sessionStorage key instead (cleared on failure so it retries next visit) — once-per-session, matching the clone-intent pattern.

…files

A Skill Pack bundles a set of skills with optional AGENTS.md / CLAUDE.md context. Attach multiple packs to a harness (instead of, or alongside, loose skills). For agentic (ACP) harnesses the gateway writes the context to the sandbox root and materializes each skill's SKILL.md so Claude Code loads it. New 'Skill Packs' manage screen (sidebar icon + /skill-packs route) to create/edit/delete packs; a pack picker in the harness create + edit flows.

Convex:
- skillPacks table + CRUD (skillPacks.ts); harnesses carry skillPackIds (create/update/duplicate/resolveForCollab); remove() detaches harnesses.
- internalQuery resolveForGateway: concatenates AGENTS.md/CLAUDE.md per pack, unions skills (de-duped), joins cached SKILL.md; owner-scoped.

FastAPI:
- HarnessConfig.skill_pack_ids; resolve_skill_pack_context (Convex client).
- session_manager._attach_skill_pack_context writes AGENTS.md (all agentic), CLAUDE.md + optional @AGENTS.md import (claude-code), and ~/.claude/skills/<slug>/SKILL.md (claude-code), on create AND revive. Path-slug sanitized; best-effort (never blocks a session).
- Re-provision PRUNES previously Harness-managed context (sentinel-marked) before writing, so removed skills / detached packs clear from persistent sandboxes without touching user-authored files.
- Default OpenRouter loop (chat.py) unions pack skills into its skill manifest.

Frontend:
- manage-tabs Skill Packs nav; /skill-packs route (list + editor reusing the skill picker + AGENTS.md/CLAUDE.md + @import checkbox); SkillPackPicker; harness-stream sends skill_pack_ids; onboarding + harness-edit attach packs.

Adversarially verified (multi-agent): cross-user access + path traversal are defended; fixed the stale-context lifecycle bug found in review.

Tests: Convex 203, FastAPI 341 (incl. new skillPacks/context-injection/prune tests), frontend 235; tsc/biome/ruff clean on changed files.

Drove a live Claude Code agent (in a real Daytona sandbox) and captured genuine output, replacing the previously-seeded transcript mockups: - hero: a real session exploring a sandbox — actual tool calls, a terminal run with exit 0, and the real result (node v22.22.3, primes 2..29). - background-agents panel with two real subagents running in parallel. - a multi-step run showing the plan, edits, and a passing test suite. - an approval card from a real run (the agent asking before a write). Keeps the MCP / harnesses / harness-edit / share shots. Hero stays at the top; the rest remain in the collapsed gallery.

… files Adds Skill Packs: a creatable entity bundling skills + optional AGENTS.md/CLAUDE.md context, attachable to harnesses. For agentic harnesses the gateway writes the context to the sandbox root and materializes SKILL.md files (sentinel-guarded pruning clears removed skills/packs). New /skill-packs manage screen + harness-flow picker; default loop unions pack skills. Adversarially verified; cross-user access + path traversal defended.

The code review already re-runs on every push (synchronize), but there was no way to request a fresh review WITHOUT pushing. Add a workflow_dispatch trigger (pr_number input) so it can be re-run on demand from the Actions tab / `gh workflow run claude-code-review.yml -f pr_number=<n>`; the job derives the PR number and checks out the PR head for either event.

Cut the Features section to ~a third — name each feature, drop the specifics.

docs: launch-ready README + GPLv3 LICENSE

- [LOW] The claim-once sessionStorage gate was cleared only on a fetch reject (network error), but the endpoint returns 200 on soft-failures and 401 on the post-sign-in token race, so a failed claim was suppressed for the whole session. Now: _verified_emails returns None on a TRANSIENT Clerk failure (vs [] for genuine no-emails) → the endpoint reports {ok:false}; the client clears the session key on any non-success (res not ok OR body.ok false) so it retries next visit, while a real success (incl. ok:true/bound:0) stays claimed.
- [LOW] editSharedHarness skipped the systemPrompt 4000-char cap the owner's harnesses.update enforces, letting a less-trusted editor write unbounded data into the owner's doc. Export + apply assertSystemPromptLength, and clamp name.

+1 Convex test (oversized systemPrompt rejected) +1 FastAPI test (transient → ok:false, no bind). Convex 208, FastAPI 327, web 235; biome clean, tsc 21/21.
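The None-versus-empty-list distinction in the first fix can be sketched roughly as follows. This is an illustration under assumptions, not the endpoint's code: `claim_response`, `should_clear_session_key`, and the `bind` callback are hypothetical names, and the client check is written in Python only to keep one language across examples.

```python
from typing import Callable, Dict, List, Optional


def claim_response(
    verified_emails: Optional[List[str]],
    bind: Callable[[List[str]], int],
) -> Dict[str, object]:
    """Map the three lookup outcomes onto the endpoint's response body."""
    if verified_emails is None:
        # Transient lookup failure: report a soft failure and bind nothing.
        return {"ok": False}
    # An empty list is a genuine "no emails" result: success with zero binds.
    bound = bind(verified_emails) if verified_emails else 0
    return {"ok": True, "bound": bound}


def should_clear_session_key(http_ok: bool, body: Dict[str, object]) -> bool:
    """Client-side rule: any non-success clears the gate so the claim retries."""
    return (not http_ok) or (body.get("ok") is not True)
```

Keeping `None` separate from `[]` is what lets a real success with `bound: 0` stay claimed while a transient failure is retried on the next visit.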

… manage-tabs conflict

feat(sharing): share harnesses + Manage Sharing page

…port, reliable materialization

UX: the skill-pack editor was a modal with a NESTED catalog modal. It's now a full page (routes/skill-packs/new + $packId render a shared SkillPackEditor: left = form, right = embedded catalog panel, with no nested modals). The list page is list-only and navigates to those routes.

Bulk import: convex/skills.ts importSkillRepo lists a repo's skills via the GitHub trees API, fetches + caches each SKILL.md (+ the repo's AGENTS.md / CLAUDE.md), indexes them, and returns them to drop into a pack. The editor adds an 'import a GitHub repo' input (e.g. greensock/gsap-skills) and four pre-built templates. Repo input is validated/host-pinned (no SSRF); './..' rejected.

Reliability (the deep dive): the ACP gateway used to SILENTLY SKIP skills whose SKILL.md wasn't cached, with no fallback, so a freshly-added skill didn't reach Claude Code until ensureSkillDetails happened to finish. Fix: extracted the GitHub fetch into app/services/skill_content.py (fetch_skill_md), refactored chat.py to share it, and the gateway now back-fills missing SKILL.md on demand, bounded by an 8s budget + a 20-skill cap so it can never stall provisioning, and authenticated via GITHUB_TOKEN (5000/hr vs 60/hr). Distinct skills sharing a trailing id no longer collide on one ~/.claude/skills dir.

Adversarially verified (multi-agent): chat.py refactor is faithful; nested-modal gone; gsap path confirmed against the live repo; fixed the two operational majors (unbounded back-fill, unauthenticated fetch) the review found.

Tests: new test_skill_content.py + slug/cap tests; Convex 203, FastAPI 350, frontend 235; tsc clean (no new errors).
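One way to bound a back-fill by both a time budget and a skill cap is sketched below. It is not the gateway's implementation: `backfill_missing` and the injected `fetch` coroutine are hypothetical, and only the 8s / 20-skill figures come from the notes above.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Optional

BACKFILL_BUDGET_S = 8.0   # overall time budget from the notes above
BACKFILL_MAX_SKILLS = 20  # per-provision cap from the notes above


async def backfill_missing(
    skill_ids: Iterable[str],
    fetch: Callable[[str], Awaitable[Optional[str]]],
    budget_s: float = BACKFILL_BUDGET_S,
    max_skills: int = BACKFILL_MAX_SKILLS,
) -> Dict[str, str]:
    """Fetch missing SKILL.md bodies concurrently, never exceeding the budget."""
    capped = list(skill_ids)[:max_skills]
    if not capped:
        return {}
    tasks = {sid: asyncio.create_task(fetch(sid)) for sid in capped}
    done, pending = await asyncio.wait(tasks.values(), timeout=budget_s)
    # Anything still running at the deadline is dropped, not awaited further.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    results: Dict[str, str] = {}
    for sid, task in tasks.items():
        if task in done and not task.cancelled() and task.exception() is None:
            body = task.result()
            if body is not None:
                results[sid] = body
    return results
```

Failures and timeouts are simply omitted from the result, which matches the best-effort intent: provisioning proceeds with whatever arrived inside the budget.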

…t, reliable materialization Replaces the nested-modal skill-pack editor with full-page routes + embedded catalog; adds importSkillRepo (bulk import an owner/repo + its AGENTS.md/CLAUDE.md) and pre-built templates; and makes skills reliably reach Claude Code (gateway back-fills uncached SKILL.md from GitHub, bounded + GITHUB_TOKEN-authed, via a shared skill_content.py also used by the default loop). Adversarially verified.

…ct GitHub rate limit

Clicking a skill-pack template/import showed a cryptic '[CONVEX A(skills:importSkillRepo)] Server Error' on staging.

Root cause: repo skill discovery uses the api.github.com git/trees endpoint, which is 60/hr per IP when unauthenticated. With no GITHUB_TOKEN set on the deployment (shared IP), that call is rate-limited -> discovery returns empty -> the action threw a plain Error, which Convex masks as a generic 'Server Error'. (ensureSkillDetails works token-less because it hits raw.githubusercontent.com first; only discovery needs the rate-limited API.)

The real fix is operational: set GITHUB_TOKEN on the Convex deployment (the code already reads process.env.GITHUB_TOKEN). This commit makes the failure diagnosable instead of cryptic:
- listRepoSkillIds now reports rateLimited (403 w/ x-ratelimit-remaining=0, or 429) vs notFound (404/403).
- importSkillRepo throws ConvexError (Convex surfaces these; plain Errors are masked) with actionable messages: invalid repo / rate-limited+set-GITHUB_TOKEN / repo-not-found / no-skills.
- the editor's import catch reads ConvexError.data so the real reason reaches the toast.

Verified (multi-agent): root cause confirmed; the with-token happy path has no other throw and breaches no Convex limit (8 or 60 skills); ConvexError surfaces correctly. convex tsc clean; no new web type errors.
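The rateLimited / notFound split can be illustrated with a small classifier. The actual listRepoSkillIds is TypeScript in Convex and is not reproduced here; `classify_discovery_failure` and the `"other"` bucket are assumptions added for the sketch.

```python
from typing import Mapping


def classify_discovery_failure(status: int, headers: Mapping[str, str]) -> str:
    """Classify a failed GitHub trees call by status code and rate-limit header."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining")
    if status == 429 or (status == 403 and remaining == "0"):
        return "rateLimited"
    if status in (403, 404):
        return "notFound"
    # Assumption: anything else is reported generically.
    return "other"
```

The rate-limit check runs first because a 403 is ambiguous on its own: only the exhausted `x-ratelimit-remaining` header separates a throttled request from a repo that is missing or inaccessible.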

…te limit importSkillRepo's repo discovery hits the unauthenticated api.github.com rate limit when no GITHUB_TOKEN is set, and the plain Error was masked as a generic 'Server Error'. Now throws ConvexError with actionable messages (rate-limited -> set GITHUB_TOKEN / repo-not-found / no-skills); the editor surfaces ConvexError.data. The operational fix is to set GITHUB_TOKEN on the deployment.

feat(usage): real Claude subscription usage bars (5h + weekly)

Follow-up fixes from a recall-mode review of the harness-sharing feature (#146).

Correctness:
- editSharedHarness: a less-trusted editor could BLANK the owner's name/model by saving an empty field (only an upper bound existed). Empty/whitespace name/model are now ignored, and the Edit dialog disables Save when either is empty.
- harnesses.remove: cascade-delete the harness's harnessShareGrants. Orphaned grants were un-revokable (revoke re-asserts ownership via the now-deleted harness) and a stale public token kept resolving to a dangling id.
- publicHarnessProjection.hasAuth: derive from authType (!= "none") instead of Boolean(authToken). oauth/tiger_junction servers DO require auth but keep their secret off the harness row, so the viewer wrongly showed "no auth".
- share-harness viewer: guard requestClone with an in-flight ref so the manual Clone button + auto-resume effect can't create two clones; clear the pending clone intent on the owner-redirect path so it doesn't linger for its TTL.
- /harnesses: wrap the synchronous sessionStorage access in the claim effect in try/catch (a sandboxed/partitioned context threw and broke the page); and don't flash EmptyState while the incoming-shares query is still in flight.

Cleanup:
- listMySharedHarnesses now sorts newest-first, matching the adjacent listMySharedConversations on the Manage Sharing page.
- editSharedHarness: drop the unused `grantId` arg (authz is resolved via the bound grant), which falsely implied the edit was grant-scoped.

Adds regression tests for the empty-field guard and the remove cascade.

Addresses findings from an xhigh review of #150:
- Scope the fetch to claude-code: `elif cred_id and session.agent_id == "claude-code"`. It was firing for ANY agent with a linked credential. _fetch_subscription_usage hardcodes "claude-code", so a codex/cursor cred raised + logged a stack trace (caught) roughly once a minute while streaming.
- Treat an empty `{}` rate-limit snapshot like an absent one (`if rl:` instead of `is not None`): an empty dict no longer blocks the fetch or clobbers a good stored snapshot.
- Dedupe the buckets write: only persist when the fetched snapshot changed (mirrors the flat path), so there is no redundant Convex write every ~60s on a long session.
- Bound `_sub_usage_fetched_at`: prune entries past the TTL once it grows large, so the per-credential debounce map can't grow unbounded on a long-lived process.
- Key the account-limit bars on a stable window id, not the human label, so two windows that fall back to the generic "Claude account" label can't collide.

Skipped by design: the /v1/messages ping (the only way to read the unified rate-limit headers; count_tokens omits them, /api/oauth/usage is scope-blocked) and the wait-out-the-TTL-on-failure debounce (intentional, avoids hammering).
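The gating, truthiness check, and debounce-map pruning above can be combined into one decision function, sketched here under assumptions. `should_fetch_subscription_usage`, the 60-second TTL, and the 1024-entry threshold are illustrative values; the notes only say "roughly once a minute" and "once it grows large".

```python
from typing import Dict, Optional

FETCH_TTL_S = 60.0  # assumption: debounce window of about a minute
MAX_TRACKED = 1024  # assumption: size at which stale entries are pruned


def should_fetch_subscription_usage(
    rl: Optional[dict],
    cred_id: Optional[str],
    agent_id: str,
    fetched_at: Dict[str, float],
    now: float,
) -> bool:
    """Decide whether to fetch subscription usage for this credential now."""
    if rl:
        # A non-empty streamed snapshot wins; {} and None both fall through.
        return False
    if not cred_id or agent_id != "claude-code":
        return False
    last = fetched_at.get(cred_id)
    if last is not None and now - last < FETCH_TTL_S:
        return False
    if len(fetched_at) > MAX_TRACKED:
        # Drop entries older than the TTL so the map cannot grow without bound.
        for key in [k for k, t in fetched_at.items() if now - t >= FETCH_TTL_S]:
            del fetched_at[key]
    # Stamp before fetching so a failed fetch also waits out the TTL.
    fetched_at[cred_id] = now
    return True
```

Using `if rl:` rather than `is not None` is the whole point of the second bullet: an empty dict carries no usable data, so it should behave like a missing snapshot.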

fix(usage): harden subscription-usage fetch (review follow-ups to #150)

…wups fix(sharing): address xhigh code-review findings on harness sharing

…153)

Persistent-sandbox unification (per-workspace boxes) created sandboxes with auto_stop but no auto_delete, so Daytona stopped → archived → kept them forever. Leaked session-owned boxes (teardown missed on a gateway restart) and abandoned workspace boxes both piled up as archived sandboxes; archived boxes take ~3 min to wake, so once enough accumulate the whole Daytona account crawls and Claude Code sessions hang on cold start.

- Set auto_delete_interval on creation (both ACP + code-exec create paths). Scratch (session-owned) boxes hold nothing durable → reclaimed in 1 day (before they even archive); persistent workspace/code-exec boxes hold the user's files → 14-day grace. The "continuously stopped" clock spans the archived period, so archived boxes do get reclaimed. Both intervals are env-tunable; non-positive clamps to "disabled" (Daytona reads 0 as "delete immediately on stop", a data-loss footgun).
- Self-heal in _provision_once: a box Daytona auto-deletes still looks "owned" to verify_sandbox_owner (it checks Convex, not Daytona), so the next session would attach a ghost and error. On DaytonaNotFoundError for an attach, drop the stale Convex link and, for a workspace-unification box, create a fresh persistent one and relink. An explicit, user-chosen harness sandbox can't be fabricated, so that surfaces (link still cleared).
- Tests: interval selection + clamp, and the missing-attach heal decision.
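The interval selection and clamp might look roughly like the sketch below. The environment variable names, the minutes unit, and the `-1` "disabled" sentinel are assumptions made for illustration; only the 1-day / 14-day defaults and the "non-positive clamps to disabled" rule come from the notes above.

```python
from typing import Mapping

DISABLED = -1  # assumption: sentinel meaning "never auto-delete"
SCRATCH_DEFAULT_MIN = 24 * 60          # 1 day, expressed in minutes
PERSISTENT_DEFAULT_MIN = 14 * 24 * 60  # 14 days, expressed in minutes

# Hypothetical variable names; the real ones are not given in the notes.
SCRATCH_ENV = "SCRATCH_AUTO_DELETE_MINUTES"
PERSISTENT_ENV = "PERSISTENT_AUTO_DELETE_MINUTES"


def auto_delete_interval(persistent: bool, env: Mapping[str, str]) -> int:
    """Pick the auto-delete interval for a new sandbox, clamping unsafe values."""
    name = PERSISTENT_ENV if persistent else SCRATCH_ENV
    default = PERSISTENT_DEFAULT_MIN if persistent else SCRATCH_DEFAULT_MIN
    raw = env.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        # Unparseable override: fall back to the safe default.
        return default
    # 0 would mean "delete immediately on stop", so never pass it through.
    return value if value > 0 else DISABLED
```

For example, `auto_delete_interval(True, os.environ)` would be called on the persistent create path; clamping `0` to the disabled sentinel trades a possible leak for guaranteed data safety.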

Release notes for the 1.0.0 major release covering the full unreleased span since v0.2.1 (PRs #81–#153): live session following, rewind/fork, chat + harness sharing & collaboration, Skill Packs, per-workspace agent sandboxes, workspace credentials, per-credential usage, Claude Code config, and reliability/integrity hardening. Devops/infra setup (Redis Streams prod provisioning, CI, deploy plumbing) intentionally excluded — user-facing changes only.

…ity)

From the xhigh code review of #154. User-facing fixes only:
- chat(collab): never inject the OWNER's GitHub token or workspace env credentials on an editor-collaborator turn. The agent runs in the owner's sandbox and a collaborator could echo them back via a sandbox command. Secrets stay server-side. (HIGH)
- credentials: extend the reserved-name denylist with the proxy / TLS-trust / package-registry families (HTTP(S)_PROXY, NO_PROXY, NODE_EXTRA_CA_CERTS, SSL_CERT_FILE, REQUESTS/CURL_CA_BUNDLE, PIP_INDEX_URL, …) and git-config injection (GIT_CONFIG_*/GIT_PROXY_COMMAND/GIT_SSH, NPM_CONFIG_*), so a credential can't MITM or hijack the sandbox's outbound traffic. (HIGH)
- usage: parse the unified rate-limit reset header robustly (int → RFC 3339 fallback) so a format variance can't drop the reset (and defeat the stale-window self-heal). (HIGH)
- harness-share: clear sharedLocked when the LAST grant is revoked one-by-one (matching unshareHarness), so a later re-share doesn't start locked.
- conversations: bound the workspace re-stamp scan with .take(8192) like fork(), so adopting/moving a very long conversation can't blow the per-transaction read/write limit.
- chat-restore: pick the genuinely most-recent chat by lastMessageAt instead of the first match in the pinned-first list.
- stream-bus: give followers a separate Redis connection pool so a crowd of viewers can't starve the latency-critical producer/tee path.
- message-queue: flush the post-sync drain via an explicit signal instead of relying on the send-callback identity changing between turns.

Tests: credential denylist families, last-grant lock-clear, deterministic post-sync drain. fastapi 361, convex 219, web 240, all green.
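A simplified stand-in for the reserved-name check, covering only the names spelled out in the credentials fix. The real denylist is longer (the notes elide part of it) and the real helper signatures are not shown, so both functions below are illustrative; case-insensitive matching is also an assumption.

```python
from typing import Dict

# Only the names listed in the notes; the full denylist is not reproduced.
RESERVED_EXACT = frozenset({
    "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
    "NODE_EXTRA_CA_CERTS", "SSL_CERT_FILE",
    "REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE",
    "PIP_INDEX_URL", "GIT_PROXY_COMMAND", "GIT_SSH",
})
RESERVED_PREFIXES = ("GIT_CONFIG_", "NPM_CONFIG_")


def is_reserved_env_name(name: str) -> bool:
    """True when a credential name would override proxy, TLS, or git settings."""
    upper = name.strip().upper()  # assumption: match case-insensitively
    return upper in RESERVED_EXACT or upper.startswith(RESERVED_PREFIXES)


def filter_injectable(env: Dict[str, str]) -> Dict[str, str]:
    """Drop reserved names at injection time, not only at creation time."""
    return {k: v for k, v in env.items() if not is_reserved_env_name(k)}
```

Applying the same predicate at injection time covers rows that were stored before a name was added to the denylist.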

Re-review (xhigh) of fa3177e surfaced gaps in the fixes themselves:
- credentials: enforce the reserved-name denylist at RESOLVE/injection time too (resolve_workspace_env), not just at creation. A row stored before the denylist was expanded (e.g. HTTP_PROXY, GIT_CONFIG_*) would otherwise still be decrypted + injected. Shared is_reserved_env_name() helper. (MED)
- usage: the RFC-3339 reset fallback parsed an offset-less timestamp as naive (local-tz), skewing resetsAt by the UTC offset; assume UTC when tzinfo is absent.
- message-queue: armPendingSend no longer clobbers an already-armed send. It requeues the colliding message at the front so neither is dropped (handleSendNow / processQueuedAfterSync lacked drainQueueAfterTurn's guard).
- harness-share: the last-grant lock-clear now keys on ACTIVE grants (isActiveGrant), like the rest of the module, so a future soft-revoke / expiry can't leave an inactive row that skips the unlock.
- conversations: extract MESSAGE_SCAN_CAP constant for the four .take(8192) sites and document the partial-restamp bound (a >cap-message conversation adopts/moves fine but its oldest messages stay out of workspace search).
- stream-bus tests: reset/close the new follower client globals in the fixtures (teardown symmetry after the producer/follower pool split).

Tests added: reserved-name skipped at resolve; no-drop on arm collision. fastapi 362, convex 219, web 241, all green.
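The reset-header parsing (integer first, RFC 3339 fallback, UTC when no offset is present) can be sketched as below. `parse_reset` is a hypothetical name, and treating the integer form as epoch seconds is an assumption; the notes do not state the header's unit.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_reset(value: Optional[str]) -> Optional[datetime]:
    """Parse a reset header as epoch seconds, else as an RFC 3339 timestamp."""
    if not value:
        return None
    text = value.strip()
    try:
        # Assumption: an integer value is seconds since the Unix epoch.
        return datetime.fromtimestamp(int(text), tz=timezone.utc)
    except (ValueError, OverflowError, OSError):
        pass
    if text.endswith(("Z", "z")):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # An offset-less timestamp is assumed to be UTC, never local time.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

Returning an aware datetime in every successful branch means downstream comparisons against "now" cannot silently mix naive and aware values.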

DIodide and others added 30 commits June 19, 2026 19:12

Merge PR #113: live token fan-out via Redis Streams

3d763db

feat(stream): live token fan-out to all viewers via Redis Streams

Merge PR #115: Redis fan-out integration tests + CI

4ca9bf2

test(stream): Redis fan-out integration tests + smoke client + CI

Merge PR #116: wire live-follow into /workspaces

30827d7

fix(stream): wire live-follow into the /workspaces route

Merge PR #114: rewind + rewind-and-fork under every user message

eefb31e

Rewind (in-place truncate) + rewind-and-fork (branch) for normal Harness + Claude Code, via a server-side conversation reset. Adversarially reviewed twice; all critical/major findings fixed.

Merge PR #117: shared useRewind hook (/chat + /workspaces parity)

2513232

Extract shared useRewind hook so /chat and /workspaces stay at rewind/fork parity. No behavior change; 185 tests pass.

Merge PR #119: mid-message rewind at part boundaries (seams)

0eb2e99

Rewind / rewind-&-fork into the middle of an assistant message at part boundaries. Two adversarial review rounds; CI green.

Merge PR #121: persist faithful content on all OpenRouter save paths

c89e05d

content_from_parts as single source of truth in chat.py; self-reconciling _save_interrupted; done-event handshake preserved. CI green.

Merge PR #122: enforce content == contentFromParts(parts) in saveInte…

35945cb

…rruptedMessage Closes the last persistence path not covered by the content/parts invariant. Safe vs the streaming handshake. CI green.

Merge PR #125: mid-message rewind seam toggle setting + audit fixes

45b23eb

User setting to toggle seams (default on, gates only seams); 87-case audit + 3 action-handler fixes (agent-reset desync warning, busy-window guards). CI green.

DIodide and others added 25 commits June 21, 2026 16:43

docs: tighten Features to one-line bullets

bf67dcc

Cut the Features section to ~a third — name each feature, drop the specifics.

Merge pull request #144 from DIodide/docs/readme-license

231650e

docs: launch-ready README + GPLv3 LICENSE

Merge origin/staging into feat/harness-sharing (Skill Packs); resolve…

0c45bb8

… manage-tabs conflict

feat(sharing): share harnesses + Manage Sharing page (#146)

0526ef9

feat(sharing): share harnesses + Manage Sharing page

feat(usage): real Claude subscription usage bars (5h + weekly)

072d4ae

Merge pull request #150 from DIodide/feat/agent-usage-real

9e54b77

feat(usage): real Claude subscription usage bars (5h + weekly)

Merge pull request #152 from DIodide/fix/agent-usage-review

b7418e8

fix(usage): harden subscription-usage fetch (review follow-ups to #150)

Merge pull request #151 from DIodide/fix/harness-sharing-review-follo…

0420e7b

…wups fix(sharing): address xhigh code-review findings on harness sharing

DIodide self-assigned this Jun 23, 2026

DIodide temporarily deployed to staging June 23, 2026 13:24 — with GitHub Actions Inactive

DIodide deployed to staging June 23, 2026 14:15 — with GitHub Actions Active

DIodide wants to merge 100 commits into main from staging

DIodide commented Jun 23, 2026
