[poc] Demo Persistent sessions/directories across harnesses/sandboxes #4813
junaway wants to merge 7 commits into
    async function sh(sbx, cmd, opts = {}) {
      const r = await sbx.commands.run(cmd, { timeoutMs: 180000, ...opts });
      console.log(`$ ${cmd}\n exit=${r.exitCode}${r.stderr ? "\n stderr: " + r.stderr.slice(0, 400) : ""}`);

    // GET /files?session_id=&path= -> proxy sandbox-agent fs listing (live view)
    app.get("/files", async (req, res) => {
      const { session_id: sid, path = "" } = req.query;
      const abs = `/work/${sid}/${path}`.replace(/\/+$/, "") || `/work/${sid}`;
Pull request overview
Adds a self-contained “persistent sessions” demo that persists a per-session working directory in SeaweedFS (S3 API) via geesefs, and reuses/resumes coding-agent sessions across local and cloud sandboxes via a Node sidecar + FastAPI UI.
Changes:
- Introduces a Node sidecar that mounts per-session S3 prefixes into sandboxes and streams sandbox-agent ACP events as NDJSON.
- Adds a FastAPI service + static UI to create/resume sessions, persist transcripts to Postgres, and browse durable files from SeaweedFS.
- Adds docker-compose wiring plus multiple provider-specific smoke tests (E2B/Modal/Daytona) and an E2B template Dockerfile.
Reviewed changes
Copilot reviewed 33 out of 37 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| sessions/demo/sidecar/server.js | Sidecar HTTP API for mounting, running prompts, streaming events, and proxying sandbox filesystem listing. |
| sessions/demo/sidecar/provider-modal.js | Modal provider integration via a Python bridge subprocess. |
| sessions/demo/sidecar/provider-e2b.js | E2B provider integration (provision/reconnect, mount geesefs, start sandbox-agent). |
| sessions/demo/sidecar/package.json | Sidecar Node dependencies and start script. |
| sessions/demo/sidecar/modal_bridge.py | Modal sandbox provisioning/mounting/agent bootstrap logic (Python SDK). |
| sessions/demo/sidecar/Dockerfile | Sidecar container image (Node + Python + Modal SDK). |
| sessions/demo/seaweedfs/s3.json | SeaweedFS S3 identity configuration for the demo bucket. |
| sessions/demo/sandbox/entrypoint.sh | Sandbox container entrypoint (seed agent auth/trust files; start sandbox-agent). |
| sessions/demo/sandbox/Dockerfile | Local sandbox image (sandbox-agent + harness installs + geesefs + entrypoint). |
| sessions/demo/modal-smoketest/smoke.py | Modal FUSE/geesefs write round-trip smoke test. |
| sessions/demo/matrix_test.py | End-to-end harness×sandbox durability matrix smoke script. |
| sessions/demo/e2b-template/e2b.Dockerfile | E2B template image build (manual agent layout + geesefs). |
| sessions/demo/e2b-smoketest/smoke.mjs | E2B FUSE/geesefs mount+write smoke test. |
| sessions/demo/e2b-smoketest/package.json | E2B smoketest dependency manifest. |
| sessions/demo/e2b-smoketest/package-lock.json | E2B smoketest dependency lockfile. |
| sessions/demo/docker-compose.yml | Demo stack orchestration (postgres, seaweedfs, ngrok, sandbox, sidecar, fastapi). |
| sessions/demo/daytona-smoketest/write.mjs | Daytona geesefs write probe. |
| sessions/demo/daytona-smoketest/smoke.mjs | Daytona viability gate (egress + geesefs mount/write). |
| sessions/demo/daytona-smoketest/puttest.mjs | Daytona raw S3 PUT probe via tunnel (no geesefs). |
| sessions/demo/daytona-smoketest/package.json | Daytona smoketest dependency manifest. |
| sessions/demo/daytona-smoketest/l4probe.mjs | Daytona TCP-connect vs TLS/egress probe. |
| sessions/demo/daytona-smoketest/ipv4write.mjs | Daytona detached mount+write probe with IPv4 pinning. |
| sessions/demo/daytona-smoketest/ipv4.mjs | Daytona IPv4-only tunnel test + geesefs write. |
| sessions/demo/daytona-smoketest/final.mjs | Daytona consolidated curl+geesefs experiment script. |
| sessions/demo/daytona-smoketest/egress.mjs | Daytona basic egress probe to common hosts + tunnel. |
| sessions/demo/daytona-smoketest/curltest.mjs | Daytona GET/PUT curl probes to tunnel (including larger payload). |
| sessions/demo/daytona-smoketest/allowlist2.mjs | Daytona domainAllowList widening probe using example.com + tunnel. |
| sessions/demo/daytona-smoketest/allowlist.mjs | Daytona domainAllowList probe focusing on tunnel reachability. |
| sessions/demo/api/static/index.html | Static UI for session list, invoking runs, transcript view, and file browsing. |
| sessions/demo/api/requirements.txt | FastAPI service Python dependencies. |
| sessions/demo/api/main.py | FastAPI endpoints for invoke, session management, transcript persistence, and file reads. |
| sessions/demo/api/Dockerfile | FastAPI container build. |
| sessions/demo/api/db.py | Postgres schema + session/transcript persistence helpers. |
| sessions/demo/.env.example | Example environment variables for running the demo stack. |
Files not reviewed (1)
- sessions/demo/e2b-smoketest/package-lock.json: Generated file
    ) as resp:
        async for line in resp.aiter_lines():
            line = line.strip()
            if not line:
                continue
            evt = json.loads(line)
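The loop above calls `json.loads` on every non-empty line, so a single torn or non-JSON line would raise and end the stream. A minimal sketch of a more tolerant reader follows; `parse_ndjson` is a hypothetical helper, and skipping bad lines (rather than failing the turn) is an assumption about the desired behaviour, not something the PR states.

```python
import json
from typing import Iterable, Iterator


def parse_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed NDJSON line; skip everything else."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank separator or keep-alive line
        try:
            evt = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: one bad line should not abort the whole stream.
            continue
        if isinstance(evt, dict):  # events are expected to be JSON objects
            yield evt
```

In the async handler the same body would sit inside the `async for` over `resp.aiter_lines()`; logging the skipped line instead of dropping it silently may be preferable.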
    resp = await s3.list_objects_v2(Bucket=S3_BUCKET, Prefix=f"{sid}/")
    keys = [{"Key": o["Key"]} for o in resp.get("Contents", [])]
    if keys:
        await s3.delete_objects(Bucket=S3_BUCKET, Delete={"Objects": keys})
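`list_objects_v2` returns at most 1000 keys per call, so the single call above would leave objects behind for a session with a larger working directory. A sketch of a paginated variant, assuming an async S3 client with the same keyword interface as the one in the excerpt (`delete_prefix` is a hypothetical name):

```python
import asyncio
from typing import Any, Optional


async def delete_prefix(s3: Any, bucket: str, prefix: str) -> int:
    """Delete every object under `prefix`, one listing page at a time."""
    deleted = 0
    token: Optional[str] = None
    while True:
        kwargs = {"Bucket": bucket, "Prefix": prefix}
        if token:
            kwargs["ContinuationToken"] = token
        resp = await s3.list_objects_v2(**kwargs)
        keys = [{"Key": o["Key"]} for o in resp.get("Contents", [])]
        if keys:
            # A page holds at most 1000 keys, which is also the DeleteObjects limit.
            await s3.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
        if not resp.get("IsTruncated"):
            return deleted
        token = resp["NextContinuationToken"]


# Usage (client construction omitted):
# asyncio.run(delete_prefix(client, "demo", "some-session-id/"))
```

Whether SeaweedFS's S3 gateway applies the same page size as AWS S3 is not verified here; the loop is correct for any page size as long as `IsTruncated` and `NextContinuationToken` are returned.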
    async def set_sandbox_id(sid: str, sandbox_id: str):
        async with (await pool()).acquire() as con:
            await con.execute(
                "UPDATE sessions SET sandbox_id = $2 WHERE id = $1", sid, sandbox_id
            )

    async def append_event(sid: str, event_index, sender, session_update, payload: dict):
        # seq = current count of this session's events (dense, monotonic per session).
        async with (await pool()).acquire() as con:
            seq = await con.fetchval(
                "SELECT count(*) FROM session_transcripts WHERE session_id = $1", sid
            )
            await con.execute(
                """INSERT INTO session_transcripts
                   (id, session_id, seq, event_index, sender, session_update, payload)
                   VALUES ($1, $2, $3, $4, $5, $6, $7)
                   ON CONFLICT (session_id, seq) DO NOTHING""",
                uuid7(),
                sid,
                seq,
                event_index,
                sender,
                session_update,
                json.dumps(payload),
            )
            await con.execute("UPDATE sessions SET updated_at = now() WHERE id = $1", sid)
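`append_event` reads `count(*)` and inserts in two separate statements, so two concurrent writers for the same session can compute the same `seq`, and `ON CONFLICT ... DO NOTHING` then drops one event silently. One possible direction is to compute `seq` inside the insert and retry on a unique violation. The sketch below uses the standard-library `sqlite3` module and a reduced table purely so it runs anywhere; the schema and helper names are illustrative, not the PR's.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the transcripts table (illustrative columns only)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        """CREATE TABLE session_transcripts (
               session_id TEXT NOT NULL,
               seq        INTEGER NOT NULL,
               payload    TEXT NOT NULL,
               UNIQUE (session_id, seq))"""
    )
    return con


def append_event(con: sqlite3.Connection, sid: str, payload: dict, retries: int = 3) -> int:
    """Insert one event and return the seq it received."""
    for _ in range(retries):
        try:
            with con:  # one transaction per attempt
                # seq is derived in the same statement that inserts the row.
                con.execute(
                    """INSERT INTO session_transcripts (session_id, seq, payload)
                       SELECT ?, COALESCE(MAX(seq) + 1, 0), ?
                       FROM session_transcripts WHERE session_id = ?""",
                    (sid, json.dumps(payload), sid),
                )
                row = con.execute(
                    "SELECT MAX(seq) FROM session_transcripts WHERE session_id = ?",
                    (sid,),
                ).fetchone()
                return row[0]
        except sqlite3.IntegrityError:
            continue  # another writer took this seq; recompute and retry
    raise RuntimeError("could not allocate a seq for session " + sid)
```

On Postgres under the default isolation level, two concurrent inserts can still pick the same value, which is why the sketch surfaces the conflict and retries instead of ignoring it. Serialising writers per session (for example by locking the `sessions` row first) would be an alternative.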
    app.get("/files", async (req, res) => {
      const { session_id: sid, path = "" } = req.query;
      const abs = `/work/${sid}/${path}`.replace(/\/+$/, "") || `/work/${sid}`;
      try {
        const r = await fetch(`${AGENT_URL}/v1/fs/entries?path=${encodeURIComponent(abs)}`);
        res.status(r.status).type("application/json").send(await r.text());
      } catch (e) {
        res.status(502).json({ error: String(e) });
      }
    });
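The handler above interpolates `session_id` and `path` straight into the absolute path, so a `path` such as `../other-session` resolves outside `/work/<sid>`, and a missing `session_id` becomes the literal `undefined`. A sketch of the containment check, written in Python for brevity (`resolve_session_path` is a hypothetical helper; the `/work` root comes from the excerpt):

```python
import posixpath


def resolve_session_path(sid: str, rel: str = "", root: str = "/work") -> str:
    """Return the absolute path for `rel` inside the session dir, or raise ValueError."""
    if not sid or "/" in sid or sid in (".", ".."):
        raise ValueError("invalid session id")
    base = posixpath.join(root, sid)
    # normpath collapses "..", "." and duplicate or trailing slashes.
    target = posixpath.normpath(posixpath.join(base, rel.lstrip("/")))
    if target != base and not target.startswith(base + "/"):
        raise ValueError("path escapes the session directory")
    return target
```

The sidecar equivalent would use Node's `path.posix.normalize` with the same prefix test and answer 400 when the check fails.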
    // Auto-approve permission backstop (modes above usually skip prompts).
    function autoApprove(session) {
      session.onPermissionRequest?.((reqEvt) => {
        const opts = reqEvt?.options || [];
        const pick =
          opts.find((o) => /allow.*always|always/i.test(o.kind || o.name || "")) ||
          opts.find((o) => /allow/i.test(o.kind || o.name || "")) ||
          opts.find((o) => !/reject|deny/i.test(o.kind || o.name || "")) ||
          opts[0];
        session.respondPermission?.(reqEvt.id, { optionId: pick?.optionId ?? pick?.id, allow: true });
      });
    }
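Two details of the selection above look risky: the first pattern also matches a bare `always`, so an option whose kind is something like `reject_always` can win if it is listed first, and the final `opts[0]` fallback can be a reject option. A sketch of a stricter selection order, in Python for brevity; it assumes option kinds follow an `allow_*` / `reject_*` naming scheme, which should be checked against the agent protocol actually in use.

```python
import re
from typing import Optional


def pick_permission_option(options: list) -> Optional[dict]:
    """Prefer allow-always, then any allow, then any non-reject option; else None."""

    def label(o: dict) -> str:
        return str(o.get("kind") or o.get("name") or "")

    # Drop reject/deny options before looking for "always".
    candidates = [o for o in options if not re.search(r"reject|deny", label(o), re.I)]
    for pattern in (r"always", r"allow"):
        for o in candidates:
            if re.search(pattern, label(o), re.I):
                return o
    return candidates[0] if candidates else None
```

Returning `None` leaves it to the caller to decline explicitly rather than approving whatever happens to be first in the list.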
    // start the sandbox-agent server as root with agent creds in its env.
    // background:true returns immediately without an exit code — do NOT run()-check it.
    const envExports = Object.entries(agentEnv).map(([k, v]) => `${k}='${v}'`).join(" ");
    sbx.commands
      .run(`sudo sh -c "${envExports} exec sandbox-agent server --no-token --host 0.0.0.0 --port ${AGENT_PORT} >/tmp/sa.log 2>&1"`,
        { background: true, timeoutMs: 0 })
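The command above wraps each value in single quotes and the whole string in double quotes, so a credential containing `'`, `"`, `$` or a backtick would break the command or be expanded by the outer shell. A sketch of quoting both layers with Python's `shlex.quote` (`build_agent_command` is a hypothetical helper, and environment variable names are assumed to be plain identifiers):

```python
import shlex


def build_agent_command(agent_env: dict, port: int) -> str:
    """Build the `sudo sh -c ...` line with every value shell-quoted."""
    # Inner layer: VAR=value assignments, each value quoted for the inner sh.
    exports = " ".join(f"{k}={shlex.quote(str(v))}" for k, v in agent_env.items())
    inner = (
        f"{exports} exec sandbox-agent server --no-token "
        f"--host 0.0.0.0 --port {int(port)} >/tmp/sa.log 2>&1"
    )
    # Outer layer: the whole script becomes a single argument to `sh -c`.
    return f"sudo sh -c {shlex.quote(inner)}"
```

In the Node sidecar the same effect could be had by passing the credentials through the command runner's environment option, if the provider SDK offers one, which avoids quoting altogether.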
    const d = document.createElement('div');
    d.className = 'sess' + (s.id === current ? ' active' : '');

      <div class="meta">
        <span class="badge b-provider" title="provider">${s.provider}</span>${modelShort(s) ? `<span class="badge b-model" title="model">${modelShort(s)}</span>` : ''}${s.reasoning && s.reasoning !== 'none' ? `<span class="badge b-reasoning" title="reasoning">${s.reasoning}</span>` : ''}
      </div>`;
    d.onclick = () => select(s.id);

    RUN apt-get update && apt-get install -y --no-install-recommends python3 python3-pip \
        && pip3 install --break-system-packages --no-cache-dir modal \
        && rm -rf /var/lib/apt/lists/*
…rollbars
Daytona's egress tier was lifted, so the geesefs-over-tunnel durability gate
(real S3 PUT that used to hang) now passes — host-verified durable in SeaweedFS.
Built it out as a first-class provider, mirroring the E2B path:
- sidecar/provider-daytona.js: provision from snapshot, mount geesefs via ngrok,
start the agent server, connect through the preview link (url + token header).
Long-lived server uses a process session (runAsync): executeCommand always
blocks until exit, so a command backgrounded with & times out.
- sidecar/daytona_snapshot.js: builds the snapshot (template-equivalent) with
sandbox-agent + claude/codex/opencode/pi + geesefs baked in. Idempotent.
- Wire daytona into server.js /run + /kill, compose env, UI dropdown (first of
the non-local sandboxes), and matrix_test.py defaults.
- Matrix: daytona x {claude,codex,opencode,pi} all PASS (durable).
- Cleanup: drop 11 dead egress-investigation probe scripts, keep durability.mjs.
- UI: hide scrollbar chrome on the session list / main pane while keeping scroll.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…terface
Collapse the hand-rolled per-provider glue onto rivet sandbox-agent's
SandboxProvider abstraction. SandboxAgent.start({ sandbox }) now drives the full
create/reconnect/getUrl/destroy lifecycle; a single withGeesefs() wrapper adds the
reusable geesefs-over-ngrok durable cwd (mount demo:<sid> + seed auth + start the
credentialed agent server). makeProvider(name) returns the wrapped provider.
- sidecar/sandbox-provider.js: withGeesefs wrapper + custom bases for daytona, e2b,
docker. Custom (not SDK built-in) because: the built-in e2b() calls a removed
Sandbox.betaCreate; built-in docker() clobbers HostConfig (can't add FUSE caps
without losing the port mapping); and crucially all three built-ins start a
CREDENTIAL-LESS agent server, which our restart can't reliably replace (the exec
shell is a child of that server). Our bases start no server — withGeesefs starts
the only one, with creds. Verified the server env carries ANTHROPIC_API_KEY.
- Add `docker` provider: fresh container per session via the host daemon (docker.sock
mounted into the sidecar), /dev/fuse + SYS_ADMIN for geesefs, host.docker.internal
for the agent URL. New docker-image/Dockerfile (agenta-sandbox-agent:local, built
--platform linux/amd64 to match the x64 agent/geesefs binaries). UI dropdown after
local; matrix default.
- modal stays on the Python bridge (provider-modal.js) — the Node modal SDK needs a
separately-baked Modal image. local stays the persistent compose container.
- server.js /run + /kill route daytona/e2b/docker through makeProvider; modal + local
unchanged.
- Delete the now-dead provider-daytona.js / provider-e2b.js; fix Dockerfile COPY;
drop the unused Node modal dep; bump @e2b/code-interpreter to 2.x.
Matrix (durable agent write, host-verified): local/docker/daytona/modal/e2b all PASS.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docker containers are AutoRemove, so a session's container is gone once it stops; SandboxAgent.start() doesn't auto-recreate: it 404s on the dead container id when resuming (also possible for a GC'd cloud sandbox).
On resume failure, fall back to starting a fresh sandbox, which remounts the same demo:<sid> prefix so the durable cwd in SeaweedFS (and all prior files) is preserved.
Verified: write marker in run 1 (container auto-removed), resume same session in run 2 (fresh container) reads the marker back.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
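The fallback described in this commit can be sketched as below. The provider interface (`reconnect`, `create`, `mount`) and the `SandboxGone` error are hypothetical stand-ins for whatever the sidecar's provider wrapper exposes; only the `demo:<sid>` prefix comes from the commit message.

```python
class SandboxGone(Exception):
    """Hypothetical error for a stored sandbox id that no longer exists."""


def start_or_recreate(provider, sid: str, sandbox_id=None):
    """Return (sandbox_id, resumed): reuse the old sandbox if it is alive, else start fresh."""
    if sandbox_id is not None:
        try:
            provider.reconnect(sandbox_id)
            return sandbox_id, True
        except SandboxGone:
            pass  # container auto-removed or sandbox GC'd: fall through
    new_id = provider.create()
    # Same per-session prefix as before, so earlier files reappear in the new sandbox.
    provider.mount(new_id, f"demo:{sid}")
    return new_id, False
```

Catching only the "gone" case keeps other reconnect failures, such as bad credentials or network errors, visible instead of masking them with a silent recreate.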
docker containers run `sleep infinity` + the agent server, so they never stop on their own and AutoRemove never fires: every session leaked a live container.
docker is fresh-per-turn, so tear the container down after the turn: dispose() the agent connection cleanly FIRST (killing mid-stream caused "other side closed" -> HTTP 500 on the turn), then destroy the container, then clear the persisted sandbox_id.
The cwd is durable in SeaweedFS (--fsync-on-close), so resume recreates + remounts.
Verified: a docker turn leaves 0 containers; write marker -> resume same session into a fresh container -> reads the marker back.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
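The teardown order in this commit (dispose, then destroy, then clear the stored id) can be sketched as follows. The three collaborators are hypothetical objects; using nested `try/finally` so that later steps still run when an earlier one raises is an assumption, not something the commit states.

```python
def finish_turn(agent, provider, store, sid: str, sandbox_id: str) -> None:
    """Tear down a fresh-per-turn sandbox in the order the commit describes."""
    try:
        agent.dispose()  # close the agent connection before the container goes away
    finally:
        try:
            provider.destroy(sandbox_id)  # remove the container itself
        finally:
            # Clear the persisted id even if destroy failed, so the next
            # turn does not try to resume a dead sandbox.
            store.set_sandbox_id(sid, None)
```

Any exception from `dispose()` or `destroy()` still propagates to the caller after the remaining steps have run.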
    except httpx.HTTPError as e:
        note = f"sidecar kill error (cleared anyway): {e}"
    await db.set_sandbox_id(sid, None)
    return {"killed": row["sandbox_id"], "note": note}

    // real prompt rides in data.inputs.messages[0], dims in data.parameters.
    function envelope({ sid, prompt, force }) {
      const body = { force: !!force };
      if (sid) body.session_id = sid;

    await fetch(`${API_URL}/sessions/${sid}/sandbox-id`, {
      method: "PUT",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ sandbox_id: sandboxId }),
    });

      }
      res.end();
    } catch (err) {
      console.error("[/run] error", err);
No description provided.