[poc] Demo Persistent sessions/directories across harnesses/sandboxes#4813

Draft
junaway wants to merge 7 commits into main from poc/persistent-sessions

@junaway junaway commented Jun 24, 2026

Contributor

No description provided.

Copilot AI review requested due to automatic review settings June 24, 2026 07:42
vercel Bot commented Jun 24, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jun 25, 2026 8:14am |

coderabbitai Bot commented Jun 24, 2026


Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9163debd-9e1a-4505-a9e5-3590813eb37e


Comment thread sessions/demo/daytona-smoketest/smoke.mjs Fixed

async function sh(sbx, cmd, opts = {}) {
  const r = await sbx.commands.run(cmd, { timeoutMs: 180000, ...opts });
  console.log(`$ ${cmd}\n exit=${r.exitCode}${r.stderr ? "\n stderr: " + r.stderr.slice(0, 400) : ""}`);
Comment thread sessions/demo/sidecar/server.js Fixed
// GET /files?session_id=&path= -> proxy sandbox-agent fs listing (live view)
app.get("/files", async (req, res) => {
  const { session_id: sid, path = "" } = req.query;
  const abs = `/work/${sid}/${path}`.replace(/\/+$/, "") || `/work/${sid}`;

Copilot AI left a comment


Pull request overview

Adds a self-contained “persistent sessions” demo that persists a per-session working directory in SeaweedFS (S3 API) via geesefs, and reuses/resumes coding-agent sessions across local and cloud sandboxes via a Node sidecar + FastAPI UI.

Changes:

  • Introduces a Node sidecar that mounts per-session S3 prefixes into sandboxes and streams sandbox-agent ACP events as NDJSON.
  • Adds a FastAPI service + static UI to create/resume sessions, persist transcripts to Postgres, and browse durable files from SeaweedFS.
  • Adds docker-compose wiring plus multiple provider-specific smoke tests (E2B/Modal/Daytona) and an E2B template Dockerfile.

Reviewed changes

Copilot reviewed 33 out of 37 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| sessions/demo/sidecar/server.js | Sidecar HTTP API for mounting, running prompts, streaming events, and proxying sandbox filesystem listing. |
| sessions/demo/sidecar/provider-modal.js | Modal provider integration via a Python bridge subprocess. |
| sessions/demo/sidecar/provider-e2b.js | E2B provider integration (provision/reconnect, mount geesefs, start sandbox-agent). |
| sessions/demo/sidecar/package.json | Sidecar Node dependencies and start script. |
| sessions/demo/sidecar/modal_bridge.py | Modal sandbox provisioning/mounting/agent bootstrap logic (Python SDK). |
| sessions/demo/sidecar/Dockerfile | Sidecar container image (Node + Python + Modal SDK). |
| sessions/demo/seaweedfs/s3.json | SeaweedFS S3 identity configuration for the demo bucket. |
| sessions/demo/sandbox/entrypoint.sh | Sandbox container entrypoint (seed agent auth/trust files; start sandbox-agent). |
| sessions/demo/sandbox/Dockerfile | Local sandbox image (sandbox-agent + harness installs + geesefs + entrypoint). |
| sessions/demo/modal-smoketest/smoke.py | Modal FUSE/geesefs write round-trip smoke test. |
| sessions/demo/matrix_test.py | End-to-end harness×sandbox durability matrix smoke script. |
| sessions/demo/e2b-template/e2b.Dockerfile | E2B template image build (manual agent layout + geesefs). |
| sessions/demo/e2b-smoketest/smoke.mjs | E2B FUSE/geesefs mount+write smoke test. |
| sessions/demo/e2b-smoketest/package.json | E2B smoketest dependency manifest. |
| sessions/demo/e2b-smoketest/package-lock.json | E2B smoketest dependency lockfile. |
| sessions/demo/docker-compose.yml | Demo stack orchestration (postgres, seaweedfs, ngrok, sandbox, sidecar, fastapi). |
| sessions/demo/daytona-smoketest/write.mjs | Daytona geesefs write probe. |
| sessions/demo/daytona-smoketest/smoke.mjs | Daytona viability gate (egress + geesefs mount/write). |
| sessions/demo/daytona-smoketest/puttest.mjs | Daytona raw S3 PUT probe via tunnel (no geesefs). |
| sessions/demo/daytona-smoketest/package.json | Daytona smoketest dependency manifest. |
| sessions/demo/daytona-smoketest/l4probe.mjs | Daytona TCP-connect vs TLS/egress probe. |
| sessions/demo/daytona-smoketest/ipv4write.mjs | Daytona detached mount+write probe with IPv4 pinning. |
| sessions/demo/daytona-smoketest/ipv4.mjs | Daytona IPv4-only tunnel test + geesefs write. |
| sessions/demo/daytona-smoketest/final.mjs | Daytona consolidated curl+geesefs experiment script. |
| sessions/demo/daytona-smoketest/egress.mjs | Daytona basic egress probe to common hosts + tunnel. |
| sessions/demo/daytona-smoketest/curltest.mjs | Daytona GET/PUT curl probes to tunnel (including larger payload). |
| sessions/demo/daytona-smoketest/allowlist2.mjs | Daytona domainAllowList widening probe using example.com + tunnel. |
| sessions/demo/daytona-smoketest/allowlist.mjs | Daytona domainAllowList probe focusing on tunnel reachability. |
| sessions/demo/api/static/index.html | Static UI for session list, invoking runs, transcript view, and file browsing. |
| sessions/demo/api/requirements.txt | FastAPI service Python dependencies. |
| sessions/demo/api/main.py | FastAPI endpoints for invoke, session management, transcript persistence, and file reads. |
| sessions/demo/api/Dockerfile | FastAPI container build. |
| sessions/demo/api/db.py | Postgres schema + session/transcript persistence helpers. |
| sessions/demo/.env.example | Example environment variables for running the demo stack. |
Files not reviewed (1)
  • sessions/demo/e2b-smoketest/package-lock.json: Generated file


Comment thread sessions/demo/api/main.py Outdated
Comment on lines +85 to +90
) as resp:
    async for line in resp.aiter_lines():
        line = line.strip()
        if not line:
            continue
        evt = json.loads(line)
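
The hunk above feeds every non-empty NDJSON line straight into `json.loads`, so one malformed line would raise and end the stream. A minimal sketch of a tolerant parser follows, assuming the stream can occasionally carry a truncated or non-JSON line. The name `parse_ndjson` and the skip-and-collect policy are illustrative, not code from this PR.

```python
import json
from typing import Iterable, Iterator, List, Optional


def parse_ndjson(lines: Iterable[str], errors: Optional[List[str]] = None) -> Iterator[dict]:
    """Yield one dict per non-empty NDJSON line; collect lines that are not JSON objects."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            evt = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: a bad line should be recorded and skipped, not abort the turn.
            if errors is not None:
                errors.append(line)
            continue
        if isinstance(evt, dict):
            yield evt
        elif errors is not None:
            # Valid JSON but not an event object (for example a bare list).
            errors.append(line)


bad: List[str] = []
events = list(
    parse_ndjson(['{"type": "text"}', "", "not json", "[1, 2]", '{"type": "done"}'], bad)
)
```

Whether to skip, log, or surface such lines to the UI is a policy choice for the demo; the sketch only shows that the loop itself can keep going.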
Comment thread sessions/demo/api/main.py
Comment on lines +168 to +171
resp = await s3.list_objects_v2(Bucket=S3_BUCKET, Prefix=f"{sid}/")
keys = [{"Key": o["Key"]} for o in resp.get("Contents", [])]
if keys:
    await s3.delete_objects(Bucket=S3_BUCKET, Delete={"Objects": keys})
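
Under S3 semantics a single `list_objects_v2` response holds at most 1000 keys, so the hunk above would leave objects behind for a session with more files than that. A sketch of a paginated delete follows, assuming an awaitable client with the same call shape as in the hunk. `delete_prefix` and `FakeS3` are hypothetical names used only to make the example runnable without a real bucket.

```python
import asyncio


async def delete_prefix(s3, bucket: str, prefix: str) -> int:
    """Delete every object under prefix, one listing page at a time; return the count."""
    deleted = 0
    token = None
    while True:
        kwargs = {"Bucket": bucket, "Prefix": prefix}
        if token:
            kwargs["ContinuationToken"] = token
        resp = await s3.list_objects_v2(**kwargs)
        keys = [{"Key": o["Key"]} for o in resp.get("Contents", [])]
        if keys:
            # One page never exceeds the 1000-key limit of a single delete_objects call.
            await s3.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
        if not resp.get("IsTruncated"):
            return deleted
        token = resp["NextContinuationToken"]


class FakeS3:
    """In-memory stand-in for the S3 client (hypothetical), paging two keys at a time."""

    def __init__(self, keys, page_size=2):
        self.keys = sorted(keys)
        self.page_size = page_size

    async def list_objects_v2(self, Bucket, Prefix, ContinuationToken=None):
        matching = [
            k for k in self.keys
            if k.startswith(Prefix) and (ContinuationToken is None or k > ContinuationToken)
        ]
        page = matching[: self.page_size]
        resp = {"Contents": [{"Key": k} for k in page], "IsTruncated": len(matching) > len(page)}
        if resp["IsTruncated"]:
            resp["NextContinuationToken"] = page[-1]
        return resp

    async def delete_objects(self, Bucket, Delete):
        doomed = {o["Key"] for o in Delete["Objects"]}
        self.keys = [k for k in self.keys if k not in doomed]


fake = FakeS3(["s1/a", "s1/b", "s1/c", "s1/d", "s1/e", "s2/x"])
removed = asyncio.run(delete_prefix(fake, "demo", "s1/"))
```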
Comment thread sessions/demo/api/db.py
Comment on lines +106 to +110
async def set_sandbox_id(sid: str, sandbox_id: str):
    async with (await pool()).acquire() as con:
        await con.execute(
            "UPDATE sessions SET sandbox_id = $2 WHERE id = $1", sid, sandbox_id
        )
Comment thread sessions/demo/api/db.py
Comment on lines +123 to +142
async def append_event(sid: str, event_index, sender, session_update, payload: dict):
    # seq = current count of this session's events (dense, monotonic per session).
    async with (await pool()).acquire() as con:
        seq = await con.fetchval(
            "SELECT count(*) FROM session_transcripts WHERE session_id = $1", sid
        )
        await con.execute(
            """INSERT INTO session_transcripts
            (id, session_id, seq, event_index, sender, session_update, payload)
            VALUES ($1, $2, $3, $4, $5, $6, $7)
            ON CONFLICT (session_id, seq) DO NOTHING""",
            uuid7(),
            sid,
            seq,
            event_index,
            sender,
            session_update,
            json.dumps(payload),
        )
        await con.execute("UPDATE sessions SET updated_at = now() WHERE id = $1", sid)
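
The hunk above reads `count(*)` and inserts in two separate statements, so two concurrent appends for one session may compute the same `seq`, and `ON CONFLICT ... DO NOTHING` would then drop one event silently. A sketch of computing `seq` inside the `INSERT` itself follows. It uses the standard-library `sqlite3` module as a stand-in for Postgres, and the reduced table shape is an assumption made to keep the example self-contained.

```python
import json
import sqlite3


def append_event(con: sqlite3.Connection, sid: str, payload: dict) -> int:
    """Insert one transcript row, deriving seq in the same statement; return that seq."""
    with con:  # commit on success, roll back on error
        con.execute(
            """INSERT INTO session_transcripts (session_id, seq, payload)
               SELECT ?, COALESCE(MAX(seq), -1) + 1, ?
               FROM session_transcripts WHERE session_id = ?""",
            (sid, json.dumps(payload), sid),
        )
        row = con.execute(
            "SELECT MAX(seq) FROM session_transcripts WHERE session_id = ?", (sid,)
        ).fetchone()
    return row[0]


con = sqlite3.connect(":memory:")
con.execute(
    """CREATE TABLE session_transcripts (
           session_id TEXT NOT NULL,
           seq INTEGER NOT NULL,
           payload TEXT NOT NULL,
           UNIQUE (session_id, seq))"""
)
seqs = [append_event(con, "s1", {"n": i}) for i in range(3)]
other = append_event(con, "s2", {"n": 0})
```

On Postgres under the default isolation level, two concurrent statements of this form can still pick the same value. The unique constraint should then be allowed to raise so the caller can retry, or appends can be serialized per session (for example by locking the session row first), rather than discarding the conflicting row.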
Comment on lines +230 to +239
app.get("/files", async (req, res) => {
  const { session_id: sid, path = "" } = req.query;
  const abs = `/work/${sid}/${path}`.replace(/\/+$/, "") || `/work/${sid}`;
  try {
    const r = await fetch(`${AGENT_URL}/v1/fs/entries?path=${encodeURIComponent(abs)}`);
    res.status(r.status).type("application/json").send(await r.text());
  } catch (e) {
    res.status(502).json({ error: String(e) });
  }
});
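
The handler above interpolates `session_id` and `path` into the absolute path without normalizing them, so a value containing `..` segments could point outside `/work/<sid>`, depending on how the agent's filesystem endpoint resolves it. A sketch of a confined path resolver follows. The session id alphabet and the name `resolve_session_path` are assumptions, not part of this PR.

```python
import posixpath
import re

WORK_ROOT = "/work"
# Assumed id alphabet: letters, digits, underscore, hyphen (covers UUID-style ids).
SESSION_ID_RE = re.compile(r"[A-Za-z0-9_-]+")


def resolve_session_path(sid: str, rel: str = "") -> str:
    """Return the normalized absolute path for rel, or raise if it leaves /work/<sid>."""
    if not SESSION_ID_RE.fullmatch(sid):
        raise ValueError("invalid session id")
    base = posixpath.join(WORK_ROOT, sid)
    # Strip leading slashes so an absolute-looking input stays relative to the session.
    candidate = posixpath.normpath(posixpath.join(base, rel.lstrip("/")))
    if candidate != base and not candidate.startswith(base + "/"):
        raise ValueError("path escapes the session directory")
    return candidate


listing_root = resolve_session_path("abc")
nested = resolve_session_path("abc", "src/main.py/")
```

The same check ports directly to the Node sidecar with `path.posix.normalize` and a prefix comparison.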
Comment on lines +124 to +135
// Auto-approve permission backstop (modes above usually skip prompts).
function autoApprove(session) {
  session.onPermissionRequest?.((reqEvt) => {
    const opts = reqEvt?.options || [];
    const pick =
      opts.find((o) => /allow.*always|always/i.test(o.kind || o.name || "")) ||
      opts.find((o) => /allow/i.test(o.kind || o.name || "")) ||
      opts.find((o) => !/reject|deny/i.test(o.kind || o.name || "")) ||
      opts[0];
    session.respondPermission?.(reqEvt.id, { optionId: pick?.optionId ?? pick?.id, allow: true });
  });
}
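
In the selection above, the first pattern also matches on a bare `always`, and the last fallback is `opts[0]`, so a reject option could be chosen in some orderings. A sketch of a stricter picker that never returns a reject or deny option follows. The option kinds in the example (`allow_once`, `allow_always`, `reject_always`) are assumptions about the payload, which this PR does not list.

```python
import re
from typing import List, Optional

REJECT_RE = re.compile(r"reject|deny", re.IGNORECASE)


def pick_option(options: List[dict]) -> Optional[dict]:
    """Prefer an 'always' allow option, then any allow option; never pick a reject."""

    def label(option: dict) -> str:
        return str(option.get("kind") or option.get("name") or "")

    allowed = [o for o in options if not REJECT_RE.search(label(o))]
    for pattern in (r"always", r"allow"):
        for option in allowed:
            if re.search(pattern, label(option), re.IGNORECASE):
                return option
    # No allow-like option: return None so the caller can decline instead of guessing.
    return None


choice = pick_option(
    [
        {"kind": "reject_always", "optionId": "r"},
        {"kind": "allow_once", "optionId": "a1"},
        {"kind": "allow_always", "optionId": "a2"},
    ]
)
only_rejects = pick_option([{"kind": "reject_once", "optionId": "r"}])
```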
Comment thread sessions/demo/sidecar/provider-e2b.js Outdated
Comment on lines +60 to +65
// start the sandbox-agent server as root with agent creds in its env.
// background:true returns immediately without an exit code — do NOT run()-check it.
const envExports = Object.entries(agentEnv).map(([k, v]) => `${k}='${v}'`).join(" ");
sbx.commands
  .run(`sudo sh -c "${envExports} exec sandbox-agent server --no-token --host 0.0.0.0 --port ${AGENT_PORT} >/tmp/sa.log 2>&1"`,
    { background: true, timeoutMs: 0 })
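
The hunk above wraps each credential in single quotes and then embeds the result in a double-quoted `sh -c` string, so a value containing a quote or `$` could break or alter the command. A sketch of quoting both layers with the standard-library `shlex.quote` follows. The port value and the function name are placeholders; the agent flags are copied from the hunk.

```python
import shlex
from typing import Mapping


def agent_start_command(agent_env: Mapping[str, str], port: int) -> str:
    """Build the sudo/sh command with every env value and the inner script shell-quoted."""
    for key in agent_env:
        # Assumption: env names are plain identifiers; anything else is rejected.
        if not key.isidentifier():
            raise ValueError(f"invalid env name: {key!r}")
    exports = " ".join(f"{k}={shlex.quote(str(v))}" for k, v in agent_env.items())
    inner = (
        f"{exports} exec sandbox-agent server --no-token "
        f"--host 0.0.0.0 --port {int(port)} >/tmp/sa.log 2>&1"
    )
    # Quote the whole inner script once more for the outer shell.
    return f"sudo sh -c {shlex.quote(inner)}"


cmd = agent_start_command({"ANTHROPIC_API_KEY": "sk-it's $HOME"}, 2468)
outer = shlex.split(cmd)
inner_tokens = shlex.split(outer[3])
```

Passing the variables through the provider's own environment option, where one exists, would avoid the quoting question entirely.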
Comment on lines +210 to +211
const d = document.createElement('div');
d.className = 'sess' + (s.id === current ? ' active' : '');
<div class="meta">
<span class="badge b-provider" title="provider">${s.provider}</span>${modelShort(s) ? `<span class="badge b-model" title="model">${modelShort(s)}</span>` : ''}${s.reasoning && s.reasoning !== 'none' ? `<span class="badge b-reasoning" title="reasoning">${s.reasoning}</span>` : ''}
</div>`;
d.onclick = () => select(s.id);
Comment on lines +4 to +6
RUN apt-get update && apt-get install -y --no-install-recommends python3 python3-pip \
    && pip3 install --break-system-packages --no-cache-dir modal \
    && rm -rf /var/lib/apt/lists/*
…rollbars

Daytona's egress tier was lifted, so the geesefs-over-tunnel durability gate
(real S3 PUT that used to hang) now passes — host-verified durable in SeaweedFS.
Built it out as a first-class provider, mirroring the E2B path:

- sidecar/provider-daytona.js: provision from snapshot, mount geesefs via ngrok,
  start the agent server, connect through the preview link (url + token header).
  Long-lived server uses a process session (runAsync) — executeCommand always
  blocks until exit, so a command backgrounded with `&` times out.
- sidecar/daytona_snapshot.js: builds the snapshot (template-equivalent) with
  sandbox-agent + claude/codex/opencode/pi + geesefs baked in. Idempotent.
- Wire daytona into server.js /run + /kill, compose env, UI dropdown (first of
  the non-local sandboxes), and matrix_test.py defaults.
- Matrix: daytona x {claude,codex,opencode,pi} all PASS (durable).
- Cleanup: drop 11 dead egress-investigation probe scripts, keep durability.mjs.
- UI: hide scrollbar chrome on the session list / main pane while keeping scroll.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…terface

Collapse the hand-rolled per-provider glue onto rivet sandbox-agent's
SandboxProvider abstraction. SandboxAgent.start({ sandbox }) now drives the full
create/reconnect/getUrl/destroy lifecycle; a single withGeesefs() wrapper adds the
reusable geesefs-over-ngrok durable cwd (mount demo:<sid> + seed auth + start the
credentialed agent server). makeProvider(name) returns the wrapped provider.

- sidecar/sandbox-provider.js: withGeesefs wrapper + custom bases for daytona, e2b,
  docker. Custom (not SDK built-in) because: the built-in e2b() calls a removed
  Sandbox.betaCreate; built-in docker() clobbers HostConfig (can't add FUSE caps
  without losing the port mapping); and crucially all three built-ins start a
  CREDENTIAL-LESS agent server, which our restart can't reliably replace (the exec
  shell is a child of that server). Our bases start no server — withGeesefs starts
  the only one, with creds. Verified the server env carries ANTHROPIC_API_KEY.
- Add `docker` provider: fresh container per session via the host daemon (docker.sock
  mounted into the sidecar), /dev/fuse + SYS_ADMIN for geesefs, host.docker.internal
  for the agent URL. New docker-image/Dockerfile (agenta-sandbox-agent:local, built
  --platform linux/amd64 to match the x64 agent/geesefs binaries). UI dropdown after
  local; matrix default.
- modal stays on the Python bridge (provider-modal.js) — the Node modal SDK needs a
  separately-baked Modal image. local stays the persistent compose container.
- server.js /run + /kill route daytona/e2b/docker through makeProvider; modal + local
  unchanged.
- Delete the now-dead provider-daytona.js / provider-e2b.js; fix Dockerfile COPY;
  drop the unused Node modal dep; bump @e2b/code-interpreter to 2.x.

Matrix (durable agent write, host-verified): local/docker/daytona/modal/e2b all PASS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
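
The wrapper idea in the commit message above can be sketched as follows: a bare provider that only creates the sandbox, and one wrapper that adds the durable mount, the auth seeding, and the single credentialed agent server. All class names here are hypothetical Python stand-ins for the Node code; only the `demo:<sid>` prefix and the `/work/<sid>` directory come from this PR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sandbox:
    id: str
    steps: List[str] = field(default_factory=list)  # records what was done, in order


class BareProvider:
    """Hypothetical base provider: creates a sandbox and starts no agent server."""

    def create(self, sid: str) -> Sandbox:
        return Sandbox(id=f"sbx-{sid}")


class WithGeesefs:
    """Wrap any provider so each new sandbox gets the durable cwd, then the agent."""

    def __init__(self, base, bucket: str = "demo") -> None:
        self.base = base
        self.bucket = bucket

    def create(self, sid: str) -> Sandbox:
        sbx = self.base.create(sid)
        sbx.steps.append(f"mount {self.bucket}:{sid} at /work/{sid}")
        sbx.steps.append("seed auth")
        sbx.steps.append("start agent server with creds")
        return sbx


sandbox = WithGeesefs(BareProvider()).create("s1")
```

Keeping the server start inside the wrapper is what lets every base stay server-free, which matches the reasoning given above for not using the SDK built-ins.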
Copilot AI review requested due to automatic review settings June 24, 2026 10:25

Copilot AI left a comment


Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

docker containers are AutoRemove, so a session's container is gone once it stops;
SandboxAgent.start() doesn't auto-recreate — it 404s on the dead container id when
resuming (also possible for a GC'd cloud sandbox). On resume failure, fall back to
starting a fresh sandbox, which remounts the same demo:<sid> prefix so the durable
cwd in SeaweedFS (and all prior files) is preserved.

Verified: write marker in run 1 (container auto-removed), resume same session in run 2
(fresh container) reads the marker back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
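
The fallback described above can be sketched as a small function: try to reconnect to the stored sandbox id, and create a fresh sandbox for the same session if that sandbox no longer exists. `SandboxGone`, `FakeProvider`, and `start_or_recreate` are hypothetical names; the real code goes through `SandboxAgent.start()`.

```python
from typing import Optional, Tuple


class SandboxGone(Exception):
    """Raised when the stored sandbox id no longer exists (hypothetical)."""


class FakeProvider:
    """In-memory provider used only to exercise the fallback."""

    def __init__(self, live) -> None:
        self.live = set(live)
        self.created = 0

    def reconnect(self, sandbox_id: str) -> str:
        if sandbox_id not in self.live:
            raise SandboxGone(sandbox_id)
        return sandbox_id

    def create(self, sid: str) -> str:
        self.created += 1
        new_id = f"sbx-{sid}-{self.created}"
        self.live.add(new_id)
        return new_id


def start_or_recreate(provider, sid: str, sandbox_id: Optional[str]) -> Tuple[str, bool]:
    """Return (sandbox id, resumed). A fresh sandbox remounts the same session prefix."""
    if sandbox_id is not None:
        try:
            return provider.reconnect(sandbox_id), True
        except SandboxGone:
            pass  # container auto-removed or cloud sandbox collected: fall through
    return provider.create(sid), False


provider = FakeProvider(live={"sbx-alive"})
resumed = start_or_recreate(provider, "s1", "sbx-alive")
recreated = start_or_recreate(provider, "s1", "sbx-dead")
```

Because the working directory lives under the session's prefix in SeaweedFS, the second call loses only the process state, not the files.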
docker containers run `sleep infinity` + the agent server, so they never stop on
their own and AutoRemove never fires — every session leaked a live container. docker
is fresh-per-turn, so tear the container down after the turn: dispose() the agent
connection cleanly FIRST (killing mid-stream caused "other side closed" -> HTTP 500 on
the turn), then destroy the container, then clear the persisted sandbox_id. The cwd is
durable in SeaweedFS (--fsync-on-close), so resume recreates + remounts.

Verified: a docker turn leaves 0 containers; write marker -> resume same session into a
fresh container -> reads the marker back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
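
The teardown order in the commit message above (dispose the connection, destroy the container, clear the stored id) can be sketched with nested `try`/`finally` so a failure in one step does not leak the container or leave a stale id. `Recorder` and `teardown_after_turn` are hypothetical names, and running the later steps even after a failed dispose is an assumption rather than something this PR states.

```python
from typing import List


class Recorder:
    """Hypothetical stand-in for the agent connection, provider, and session store."""

    def __init__(self, fail_dispose: bool = False) -> None:
        self.calls: List[str] = []
        self.fail_dispose = fail_dispose

    def dispose(self) -> None:
        self.calls.append("dispose")
        if self.fail_dispose:
            raise ConnectionError("other side closed")

    def destroy(self, sandbox_id: str) -> None:
        self.calls.append(f"destroy:{sandbox_id}")

    def clear_sandbox_id(self, sid: str) -> None:
        self.calls.append(f"clear:{sid}")


def teardown_after_turn(agent, provider, store, sid: str, sandbox_id: str) -> None:
    """Close the agent connection first, then remove the container, then forget its id."""
    try:
        agent.dispose()
    finally:
        try:
            provider.destroy(sandbox_id)
        finally:
            store.clear_sandbox_id(sid)


rec = Recorder()
teardown_after_turn(rec, rec, rec, "s1", "sbx-1")
```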
Copilot AI review requested due to automatic review settings June 24, 2026 10:37

Copilot AI left a comment


Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

Comment thread sessions/demo/api/main.py
except httpx.HTTPError as e:
    note = f"sidecar kill error (cleared anyway): {e}"
await db.set_sandbox_id(sid, None)
return {"killed": row["sandbox_id"], "note": note}
// real prompt rides in data.inputs.messages[0], dims in data.parameters.
function envelope({ sid, prompt, force }) {
  const body = { force: !!force };
  if (sid) body.session_id = sid;
Comment on lines +86 to +90
      await fetch(`${API_URL}/sessions/${sid}/sandbox-id`, {
        method: "PUT",
        headers: { "content-type": "application/json" },
        body: JSON.stringify({ sandbox_id: sandboxId }),
      });
    }
    res.end();
  } catch (err) {
    console.error("[/run] error", err);
Copilot AI review requested due to automatic review settings June 25, 2026 08:13

Copilot AI left a comment


Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.


4 participants