diff --git a/docs/design/agent-workflows/documentation/running-the-agent.md b/docs/design/agent-workflows/documentation/running-the-agent.md index 48c69e1ad0..2fa126b5c8 100644 --- a/docs/design/agent-workflows/documentation/running-the-agent.md +++ b/docs/design/agent-workflows/documentation/running-the-agent.md @@ -162,11 +162,19 @@ env file points the chat slice at These are the agent-relevant variables. The example file lists them commented out (`hosting/docker-compose/ee/env.ee.dev.example`, lines 119 onward). -- `AGENTA_AGENT_RUNNER_URL`. Where the Python service finds the runner. Default - `http://sandbox-agent:8765`. When unset, the Python service spawns the runner CLI locally - instead (see `runner_url` and `select_backend` in `services/oss/src/agent/`). +- `AGENTA_AGENT_RUNNER_URL`. Where the Python service finds the runner when no per-run `uri` + override is set. Default `http://sandbox-agent:8765`. When unset, the Python service spawns the + runner CLI locally instead (see `runner_url` and `select_backend` in `services/oss/src/agent/`). +- `AGENTA_AGENT_RUNNER_URI_ALLOWLIST`. Comma-separated list of trusted sidecar origins + (`scheme://host[:port]`). The agent config's optional `uri` field is honored only when its + origin is on this list; **default empty means every override is rejected** (the feature ships + off, only `AGENTA_AGENT_RUNNER_URL` / the local CLI work). The gate exists because the service + ships resolved provider keys and bearer tokens to whatever address it picks, so a + caller-supplied address is an SSRF / secret-exfiltration risk. See `validate_runner_uri` in + `services/oss/src/agent/config.py`. - `AGENTA_AGENT_ENABLE_MCP`. Gates MCP server resolution. Default `false`. -- `SANDBOX_AGENT_PROVIDER`. `local` or `daytona`. Default `local`. +- `SANDBOX_AGENT_PROVIDER`. `local` or `daytona`. Default `local`. Each sidecar picks its own + sandbox provider from this env; there is no per-run sandbox selector on the wire. - `SANDBOX_AGENT_DAYTONA_API_KEY`, `_API_URL`, `_TARGET`, `_SNAPSHOT`, `_IMAGE`, `_INSTALL_PI`. Daytona credentials the runner reads for the `daytona` sandbox provider. diff --git a/docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md b/docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md index d878142ce5..278646fed1 100644 --- a/docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md +++ b/docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md @@ -31,7 +31,7 @@ group by job: { // agent + placement "harness": "pi_core", // "pi_core" | "pi_agenta" | "claude"; selects the ACP agent (no engine selector) - "sandbox": "local", // "local" | "daytona" + "sandbox": "local", // constant default; the sidecar picks its own sandbox provider from its env (no per-run selector). The service routes to the sidecar by the config's `uri` instead. "sessionId": "sess_ab12...", // external id; cold runtime still gets full history // instructions diff --git a/docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md b/docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md index f42f121d00..7ee43eb196 100644 --- a/docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md +++ b/docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md @@ -11,7 +11,7 @@ the order it runs in is itself a contract. The handler (`_agent` in `app.py`) takes the workflow envelope's pieces: - `parameters`: carries the agent config under `agent`. The run-selection fields (`harness`, - `sandbox`, `permission_policy`) live on that same `agent` object. + `uri`, `permission_policy`) live on that same `agent` object. - `messages` or `inputs.messages`: the turn history (it checks `messages` first). - `stream`: batch versus streaming. - `session_id`: the external conversation id. @@ -19,7 +19,7 @@ The handler (`_agent` in `app.py`) takes the workflow envelope's pieces: ## What it does, in order 1. Parse the config: `AgentConfig.from_params(params, defaults=...)`. One parse covers - everything, including the run-selection fields (`harness`, `sandbox`, `permission_policy`). + everything, including the run-selection fields (`harness`, `uri`, `permission_policy`). 2. Convert the request messages to neutral `Message[]`. 3. Resolve tools into builtin names, runnable specs, and a tool callback. 4. Resolve MCP servers. @@ -28,7 +28,11 @@ The handler (`_agent` in `app.py`) takes the workflow envelope's pieces: connections degrade tolerantly to an empty `env` rather than failing the run. 6. Build one `SessionConfig` carrying all of the above plus trace context and session id. 7. Select the backend (`SandboxAgentBackend`) and make the harness, which validates that the - harness is supported. + harness is supported. `select_backend` routes by the config's `uri`: routing precedence is + the config's `uri` (validated against the server-side allowlist) -> `AGENTA_AGENT_RUNNER_URL` + -> the local runner CLI. A set-but-disallowed `uri` fails loud (no silent fallback). The + sidecar at the resolved address is configured local-or-Daytona by its own env, so the + service sends a constant `sandbox` default on the wire rather than a per-run selector. 8. Run: stream Vercel parts, or await one batch turn that returns `{"role": "assistant", "content": result.output}`. 9. Record usage. @@ -64,3 +68,8 @@ the instrumented handler and merges the registered interface (the passed `schema The tolerant default path is deliberate. Reordering can turn a clean reject into a leak or a hard failure. - **Batch versus streaming.** Two execution paths return two shapes. Keep them in sync. +- **Sidecar routing and the allowlist.** A caller-supplied `uri` controls where the service + ships resolved secrets and bearer tokens, so it is honored only when its origin is on + `AGENTA_AGENT_RUNNER_URI_ALLOWLIST` (default empty = every override rejected, feature off). + Loosening the gate is a security change. `resolve_runner_url` / `validate_runner_uri` live in + `services/oss/src/agent/config.py`. diff --git a/docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md b/docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md index c3037fc1f3..ba0c33fdf9 100644 --- a/docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md +++ b/docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md @@ -24,15 +24,17 @@ The fields and the full schema follow. | `tools` | `ToolConfig[]` | `[]` | Runnable tools: `builtin`, `gateway`, `code`, or `client`. See [Tool models and resolution](../in-service/tool-models-and-resolution.md). | | `mcp_servers` | `MCPServerConfig[]` | `[]` | Declared MCP servers; secret env resolved from the vault at run time. See [MCP models and resolution](../in-service/mcp-models-and-resolution.md). | | `harness` | `"pi_core" \| "claude" \| "pi_agenta"` (see slug+name note) | `"pi_core"` | The coding agent to drive. `pi_core` and `pi_agenta` both drive the `pi` ACP agent; `pi_agenta` adds Agenta's forced skills, prompt, and policy. | -| `sandbox` | `"local" \| "daytona"` | `"local"` | Where it runs. | +| `uri` | string \| null | `null` (key omitted) | Optional sidecar (agent runner) address that routes this run. When set, the server routes there (gated by `AGENTA_AGENT_RUNNER_URI_ALLOWLIST`, default empty); when unset, it falls back to its env-var runner resolution. The sidecar is configured local-or-Daytona by its own env, not per-run. Operator/testing concern, not a normal authoring field, so the playground does not surface a control for it. See [Agent service handler](../in-service/agent-service-handler.md). | | `permission_policy` | `"auto" \| "deny"` | `"auto"` | How a gating harness (Claude Code) handles tool-use prompts in a headless run. | | `sandbox_permission` | `SandboxPermission \| null` | `null` (form pre-fills one) | The declared network and filesystem boundary. See [Sandbox permission](../in-service/sandbox-permission.md). | | `skills` | `(SkillConfig \| EmbedRef)[]` | one embedded default skill | Inline SKILL.md packages, or `@ag.embed` references the backend inlines before the runner sees them. | -Note that `harness`, `sandbox`, and `permission_policy` are the run-selection fields. They +Note that `harness`, `uri`, and `permission_policy` are the run-selection fields. They live on `AgentConfig` itself, under `data.parameters.agent`, and the handler reads them in the one `AgentConfig.from_params(...)` parse along with the rest of the config. There is one agent -config, not a config plus a separate selection object. +config, not a config plus a separate selection object. `uri` replaced the old `sandbox` +selector: the sidecar address drives routing, and each sidecar picks its own sandbox provider +(local or Daytona) from its own environment, so there is no per-run sandbox field anymore. ### Harness as a slug + display name @@ -75,7 +77,6 @@ every field in its default state: "tools": [], "mcp_servers": [], "harness": "pi_core", - "sandbox": "local", "permission_policy": "auto", "sandbox_permission": { "network": { "mode": "on", "allowlist": [] }, diff --git a/docs/design/agent-workflows/projects/sidecar-uri-config/README.md b/docs/design/agent-workflows/projects/sidecar-uri-config/README.md new file mode 100644 index 0000000000..8bde5e113b --- /dev/null +++ b/docs/design/agent-workflows/projects/sidecar-uri-config/README.md @@ -0,0 +1,54 @@ +# Sidecar URI in the agent config + +Index for the workspace that adds an **optional `uri`** to the agent config, pointing at the +address of the sidecar (the agent runner). When set, the service routes the `/run` request to +that address (gated by a server-side allowlist); when unset, it falls back to today's +environment-variable resolution (`AGENTA_AGENT_RUNNER_URL`, else the local runner directory). + +Spun out of PR [#4821](https://github.com/Agenta-AI/agenta/pull/4821) review comment +[3469613625](https://github.com/Agenta-AI/agenta/pull/4821#discussion_r3469613625): + +> *"instead we should have an optional uri that points to sidecar and provide an address of +> the thing (the sandbox should probably use this uri to determine where to route the +> request). if the uri is not set then we use the environment variables"* + +**Status: IMPLEMENTED.** The reviewer decided (D7) that `uri` **replaces** the `sandbox` field +outright — the sidecar address drives routing, and each sidecar picks its own sandbox provider +(local or Daytona) from its own environment. POC / pre-production, so there is **no back-compat +constraint** and no migration: the env-var path stays as the fallback purely because it is the +sensible default. Built on [#4840](https://github.com/Agenta-AI/agenta/pull/4840) +(config-structure cleanup), which retired `RunSelection` and moved the run-selection fields onto +`AgentConfig`. + +## Files + +- [context.md](context.md) — why this exists, the reviewer's ask, goals, non-goals. +- [research.md](research.md) — where routing is decided today, what `RunSelection` vs + `AgentConfig` carry, how `select_backend` / `runner_url()` / `runner_dir()` work, and the + exact seam a `uri` would slot into. +- [plan.md](plan.md) — the proposed change: the field, where it lives on the wire and in the + schema, how routing prefers it, the env fallback, tests, and rollout. +- [security.md](security.md) — the one real concern: a caller-supplied sidecar address. Threat + model, why it is dangerous, and the proposed restriction (server-side allowlist, default-off), + cross-linked to the sidecar-trust project. +- [status.md](status.md) — current state, decisions, and the open questions for the reviewer. + +## What landed + +Routing is decided in exactly one place — `select_backend(agent_config)` in +`services/oss/src/agent/app.py`, which now builds +`SandboxAgentBackend(url=resolve_runner_url(agent_config.uri), ...)`. `uri` is a **run-selection** +field on `AgentConfig` (it says *where* a run goes); it **replaced** `sandbox`. Precedence: +`uri` (validated) -> `AGENTA_AGENT_RUNNER_URL` -> the local CLI. It is **not a new `/run` wire +field**: it never crosses the service→runner boundary; it only decides which runner the service +opens that boundary to. The single load-bearing decision is the security one: a client-supplied +sidecar address is an SSRF / secret-exfiltration risk because the service ships resolved +provider keys and bearer tokens in every `/run` body, so a caller-supplied address is +**restricted server-side** by `AGENTA_AGENT_RUNNER_URI_ALLOWLIST` (default empty = feature off); +a disallowed `uri` fails loud (no silent fallback) — see [security.md](security.md). + +The old per-run `sandbox` selector is gone; each sidecar picks its sandbox provider from its own +env (`SANDBOX_AGENT_PROVIDER`). The wire still carries a constant `sandbox: "local"` so the +unchanged runner defaults correctly; **removing the wire-level `sandbox` field is a deferred +runner-branch follow-up** (it would force a `protocol.ts` / golden change, out of scope for this +config/service/FE slice). diff --git a/docs/design/agent-workflows/projects/sidecar-uri-config/context.md b/docs/design/agent-workflows/projects/sidecar-uri-config/context.md new file mode 100644 index 0000000000..1c28b70dbf --- /dev/null +++ b/docs/design/agent-workflows/projects/sidecar-uri-config/context.md @@ -0,0 +1,72 @@ +# Context + +## Why this exists + +Today the agent service decides which runner (sidecar) to call from environment variables +alone. The agent service (`services/oss/src/agent/`) is the control plane; the Node runner +sidecar (`services/agent/`) is the execution plane. The control plane reaches the execution +plane at `POST /run`. Which sidecar it reaches is fixed at deploy time by +`AGENTA_AGENT_RUNNER_URL` (HTTP) or, when that is unset, a local TypeScript runner spawned from +`AGENTA_AGENT_RUNNER_DIR`. + +The reviewer on PR #4821 asked for a per-run override: an **optional `uri`** in the agent +config that names the sidecar address, and routing that **prefers** that address when present +and **falls back** to the env vars when it is absent. + +The original review thread sits on the agent-config schema doc +(`docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md`, line 27), which is +why the field is being designed as part of the config surface rather than, say, a deployment +knob. + +## The reviewer's ask, verbatim + +> instead we should have an optional uri that points to sidecar and provide an address of the +> thing (the sandbox should probably use this uri to determine where to route the request). if +> the uri is not set then we use the environment variables + +Two parts: + +1. **An optional `uri`** in the config that gives the sidecar address. +2. **Routing uses it** to decide where `/run` goes; **unset → env-var resolution** as today. + +## What "the config" means here, precisely + +The agent surface has two distinct config models (see the agent-config-schema interface doc): + +- **`AgentConfig`** (neutral) — *what the agent is*: instructions, model, tools, MCP servers, + skills, harness options. It is harness-agnostic and is the thing that would, one day, be + stored as a versioned artifact. +- **`RunSelection`** — *where and how a run executes*: `harness`, `sandbox`, + `permission_policy`. The handler reads it from the same `parameters` object but keeps it + deliberately out of the neutral `AgentConfig`. + +A sidecar address is a **routing** fact (where the run goes), exactly like `sandbox`. So this +design places `uri` in `RunSelection`, not in `AgentConfig`. The field is surfaced on the +playground-facing `AgentConfigSchema`, which already carries the run-selection trio +(`harness`/`sandbox`/`permission_policy`) for editing. See [research.md](research.md) for the +evidence and [plan.md](plan.md) for the exact placement. + +## Goals + +- Let a run name its sidecar address and have routing honor it. +- Keep the env-var path as the fallback when `uri` is unset (the default, unchanged). +- Make the change in the single routing seam (`select_backend`) without touching the + service→runner `/run` wire contract or the golden fixtures. +- Decide and document the security posture of a caller-supplied address before any code lands. + +## Non-goals + +- **No `/run` wire field.** `uri` decides which runner the service opens the boundary to; it + does not cross that boundary. The runner never receives it. The golden wire fixtures stay + byte-identical. +- **No back-compat machinery.** Pre-production; the env-var fallback is kept because it is the + right default, not for compatibility. +- **No new transport.** Reuses the existing HTTP delivery (`deliver_http` / + `deliver_http_stream`). A `uri` only changes the URL, not how bytes move. +- **No sidecar trust/transport hardening here.** mTLS, a `/run` shared token, scoped tokens — + those belong to the [sidecar-trust-and-sandbox-enforcement](../sidecar-trust-and-sandbox-enforcement/README.md) + project. This project only adds the *address selection* and the *restriction* that a + caller-supplied address requires (see [security.md](security.md)). +- **No deployment-shape changes.** The env contract, Compose/Helm/Railway wiring, and the + `sandbox-agent` naming are the [sidecar-deployment-proposal](../sidecar-deployment-proposal/README.md) + project's scope. This `uri` is the per-run override layered *on top* of that default contract. diff --git a/docs/design/agent-workflows/projects/sidecar-uri-config/plan.md b/docs/design/agent-workflows/projects/sidecar-uri-config/plan.md new file mode 100644 index 0000000000..054613e4b1 --- /dev/null +++ b/docs/design/agent-workflows/projects/sidecar-uri-config/plan.md @@ -0,0 +1,143 @@ +# Plan + +The change is small and localized: one new optional field on `RunSelection` (and the schema it +is edited through), one preference inside `select_backend`, a server-side restriction on the +value, and tests. No `/run` wire change, no golden-fixture change. + +## 1. The field + +Add an optional `uri` to `RunSelection` in `sdks/python/agenta/sdk/agents/dtos.py`: + +```python +class RunSelection(BaseModel): + harness: str = "pi_core" + sandbox: str = "local" + permission_policy: PermissionPolicy = "auto" + uri: Optional[str] = None # sidecar (runner) address; unset → env-var fallback + + @classmethod + def from_params(cls, params, *, default_harness="pi_core", default_sandbox="local"): + agent = params.get("agent") + source = agent if isinstance(agent, dict) else params + return cls( + harness=str(source.get("harness") or default_harness).lower(), + sandbox=str(source.get("sandbox") or default_sandbox).lower(), + permission_policy=str(source.get("permission_policy") or "auto").lower(), + uri=_clean_uri(source.get("uri")), + ) +``` + +`_clean_uri` trims and returns `None` for empty/blank, mirroring `runner_url()`'s +trim-or-`None` behavior so an empty string never counts as "set". + +**Why `RunSelection`, not `AgentConfig`:** the address is *where a run goes*, the same category +as `sandbox`. The neutral `AgentConfig` is *what the agent is* and stays address-free. This +matches the existing split (see research §3). + +## 2. The schema (playground editor) + +Add `uri` to `AgentConfigSchema` in `sdks/python/agenta/sdk/utils/types.py`, next to the +run-selection trio, **optional, default unset**, and likely marked advanced/hidden so the +playground does not surface a routing knob to ordinary users: + +```python +uri: Optional[str] = Field( + default=None, + description="Optional sidecar (agent runner) address. When set, the run is routed here; " + "when unset, the server falls back to AGENTA_AGENT_RUNNER_URL / the local runner.", +) +``` + +`build_agent_v0_default(...)` is **not** changed: `uri` has no default key, so the shipped +default config is byte-identical and the env-var path stays the out-of-the-box behavior. The +agent-config-schema interface doc gets the new row (kept-in-sync requirement). + +> Open question O3 (see status): whether the field is exposed in the playground at all, or kept +> API-only / behind an advanced toggle. A routing override is an operator/testing concern, not a +> normal authoring field. The reviewer should decide. + +## 3. Routing prefers it + +`select_backend` in `services/oss/src/agent/app.py` prefers `selection.uri`, then `runner_url()`, +then the local CLI — and runs the value through the security check (step 4) before trusting it: + +```python +def select_backend(selection: RunSelection) -> Backend: + url = resolve_runner_url(selection.uri) # allowlist-checked override, else env, else None + return SandboxAgentBackend( + sandbox=selection.sandbox, + url=url, + cwd=str(runner_dir()), + ) +``` + +`resolve_runner_url(override)` (new, in `services/oss/src/agent/config.py`): + +```python +def resolve_runner_url(override: Optional[str]) -> Optional[str]: + """Routing precedence: a validated request override, else AGENTA_AGENT_RUNNER_URL, + else None (local CLI). An override must pass the allowlist check (see security.md); + a rejected override raises rather than silently falling back.""" + if override: + return validate_runner_uri(override) # raises on a disallowed address + return runner_url() +``` + +Precedence, explicitly: **`selection.uri` (validated) → `AGENTA_AGENT_RUNNER_URL` → local CLI in +`AGENTA_AGENT_RUNNER_DIR`.** Unset `uri` is exactly today's behavior. + +A rejected override **errors** (it does not silently fall back to the env var). Falling back +would let a caller probe the allowlist by difference and would mask a misconfiguration; failing +loud is the not-implemented/unsupported pattern the codebase already uses for capability gates. + +## 4. The security restriction (load-bearing — see security.md) + +A caller-supplied address is an SSRF / secret-exfiltration risk because the service ships +resolved provider keys and bearer tokens to whatever URL it picks. So the override is gated +**server-side**, default-off: + +- New env var `AGENTA_AGENT_RUNNER_URI_ALLOWLIST` (comma-separated origins/hosts), added to + `api/oss/src/utils/env.py` and read via the shared `env` object (never raw `os.getenv` for app + config, per repo convention). +- `validate_runner_uri(uri)` accepts the override **only** when its origin is in the allowlist; + otherwise it raises a typed error (`UnsupportedRunnerUriError` or similar), surfaced as a 4xx. +- **Default: allowlist empty → every override rejected.** Out of the box, only the env-var path + works. An operator opts in by listing trusted sidecar origins. + +The exact allowlist shape (origins vs hosts, scheme/port handling, whether loopback is +implicitly allowed) is detailed in [security.md](security.md). The deeper transport hardening +(auth token, TLS) is the sidecar-trust project's scope, not this one. + +## 5. Tests + +Extend `services/oss/tests/pytest/unit/agent/test_select_backend.py` (the existing precedence +suite) plus an SDK `RunSelection` parse test: + +- `uri` set + allowlisted → backend `_url` is the override (override beats the env var). +- `uri` set + allowlisted + `AGENTA_AGENT_RUNNER_URL` also set → override wins. +- `uri` unset → falls back to `AGENTA_AGENT_RUNNER_URL` (unchanged behavior). +- `uri` unset + env unset → local CLI subprocess (unchanged behavior). +- `uri` set but NOT allowlisted → raises (no silent fallback). +- `uri` set + empty allowlist (default) → raises (default-off). +- `RunSelection.from_params({"uri": " "})` → `uri is None` (blank treated as unset). +- Golden wire fixture test stays green untouched (proves `uri` is not on the `/run` wire). + +## 6. Docs to keep in sync (same PR as the eventual implementation) + +Per the `keep-docs-in-sync` skill, the implementing PR must also update: + +- `docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md` — add the `uri` + row and a note that it is run-selection, parsed by `RunSelection`, and *not* a `/run` wire + field. +- `docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md` — the + `select_backend` precedence (`uri` → env → local) and the allowlist gate. +- The interface inventory index + `api/oss/src/utils/env.py` doc for the new env var. +- `../sidecar-deployment-proposal/` — note the per-run override on top of the default contract. +- `../sidecar-trust-and-sandbox-enforcement/` — note the allowlist as the control that keeps the + trusted-network assumption intact under a caller-supplied address. + +## 7. Rollout + +One slice; the surface is tiny. Pre-production, so no migration and no compat shim. The only +gate is the security decision (O1/O2 in status): ship the field **with** the allowlist default-off +in the same change, so a caller-supplied address is never trusted before the restriction exists. diff --git a/docs/design/agent-workflows/projects/sidecar-uri-config/research.md b/docs/design/agent-workflows/projects/sidecar-uri-config/research.md new file mode 100644 index 0000000000..fef224606a --- /dev/null +++ b/docs/design/agent-workflows/projects/sidecar-uri-config/research.md @@ -0,0 +1,166 @@ +# Research + +Everything below is read from the current tree. File paths and line references were verified +against the working copy on 2026-06-24. + +## 1. Where routing is decided today (the one seam) + +Routing lives in exactly one function, `select_backend`, in +`services/oss/src/agent/app.py`: + +```python +def select_backend(selection: RunSelection) -> Backend: + """Pick the backend for a run. + + The service always uses the sandbox-agent-backed runner. `AGENTA_AGENT_RUNNER_URL` + selects HTTP transport in deployed containers. When it is unset, local development + spawns the TypeScript runner CLI from the runner dir. + """ + return SandboxAgentBackend( + sandbox=selection.sandbox, + url=runner_url(), + cwd=str(runner_dir()), + ) +``` + +It is called once, in the handler `_agent`: + +```python +harness = make_harness(selection.harness, Environment(select_backend(selection))) +``` + +So the seam already receives the parsed `RunSelection`. Adding a `uri` to `RunSelection` and +preferring it inside `select_backend` is a localized change — no new plumbing, no new call site. + +## 2. The env-var resolution it would fall back to + +`services/oss/src/agent/config.py`: + +```python +def runner_dir() -> Path: + """Directory of the TypeScript agent runner (where the CLI command runs).""" + override = os.getenv("AGENTA_AGENT_RUNNER_DIR") + return Path(override) if override else _DEFAULT_AGENT_DIR + +def runner_url() -> Optional[str]: + """HTTP URL for the deployed agent runner service, when configured.""" + value = os.getenv("AGENTA_AGENT_RUNNER_URL") + return value.strip() if value and value.strip() else None +``` + +`SandboxAgentBackend` (`sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py`) then branches +on `url`: + +- `url` set → HTTP delivery (`deliver_http` / `deliver_http_stream`) to that URL. +- `url` unset → spawn the runner CLI as a subprocess from `cwd` (`deliver_subprocess*`), + resolving `src/cli.ts` under `runner_dir()`; missing assets raise + `AgentRunnerConfigurationError`. + +So today's precedence is simply: `AGENTA_AGENT_RUNNER_URL` if set, else the local CLI in +`AGENTA_AGENT_RUNNER_DIR`. The `uri` slots in *above* `runner_url()`: +**`selection.uri` → `runner_url()` → local CLI.** + +The existing behavior is pinned by `services/oss/tests/pytest/unit/agent/test_select_backend.py` +(`test_runner_url_selects_http_transport`, `test_no_runner_url_uses_subprocess_transport`, +`test_no_runner_url_requires_runner_assets`). Those tests are the template for the new ones. + +## 3. `RunSelection` vs `AgentConfig` — where the field belongs + +`sdks/python/agenta/sdk/agents/dtos.py`: + +```python +class RunSelection(BaseModel): + """The run-time choices stored next to the agent config: which harness, which sandbox, + the permission policy. Read by the caller to pick a backend and harness class; + deliberately not part of the neutral AgentConfig.""" + + harness: str = "pi_core" + sandbox: str = "local" + permission_policy: PermissionPolicy = "auto" + + @classmethod + def from_params(cls, params, *, default_harness="pi_core", default_sandbox="local"): + agent = params.get("agent") + source = agent if isinstance(agent, dict) else params + return cls( + harness=str(source.get("harness") or default_harness).lower(), + sandbox=str(source.get("sandbox") or default_sandbox).lower(), + permission_policy=str(source.get("permission_policy") or "auto").lower(), + ) +``` + +This is the natural home: `uri` is *where a run goes*, the same category of fact as `sandbox`. +The class docstring even says these choices are "stored next to the agent config" and +"deliberately not part of the neutral `AgentConfig`." A sidecar address is exactly that — a +deployment/routing choice, never *what the agent is*. + +`AgentConfig` (same file) is the neutral model (instructions/model/tools/mcp_servers/skills/ +harness_options/sandbox_permission). A sidecar address does not belong there. + +## 4. The schema the playground renders, and the parse path + +`AgentConfigSchema` in `sdks/python/agenta/sdk/utils/types.py` is the strict schema shipped on +`/inspect` as the `agent_config` catalog type. It carries the run-selection trio for editing: + +```python +harness: str = Field(...) +sandbox: Literal["local", "daytona"] = Field(...) +permission_policy: Literal["auto", "deny"] = Field(...) +sandbox_permission: Optional[SandboxPermission] = Field(...) +``` + +The agent-config-schema interface doc confirms the split (line 32): + +> Note that `harness`, `sandbox`, and `permission_policy` are the run selection. The handler +> reads them from the same `parameters` object via `RunSelection.from_params(...)`, not just +> from `AgentConfig`. + +The default config has a single source, `build_agent_v0_default(...)` in +`sdks/python/agenta/sdk/utils/types.py`, consumed by both the SDK builtin interface and the +service `/inspect`. A new field that needs a default would be added there once. **`uri` should +default to unset (no key emitted)**, so the default config does not change and the env-var path +stays the out-of-the-box behavior. + +## 5. The `/run` wire — `uri` is NOT a wire field + +`request_to_wire` in `sdks/python/agenta/sdk/agents/utils/wire.py` builds the `/run` body. It +carries `harness`, `sandbox`, messages, config (tools/prompt/permission/harness-files), +`secrets`, `trace`, `session_id`. It does **not** carry a runner URL — the URL is the transport +target, decided *before* the wire is built, when `SandboxAgentBackend` is constructed. + +So `uri` is consumed entirely on the service side, in `select_backend`. It never reaches the +runner. The golden wire fixtures +(`sdks/python/oss/tests/pytest/unit/agents/golden/`) stay byte-identical. This is the cleanest +property of the design: a routing override with zero wire-contract cost. + +## 6. What crosses to the chosen sidecar (the security input) + +This is why the address matters. The `/run` body the service POSTs to whatever URL it picks can +carry (per the sidecar-trust research and `wire.py`): + +- **`secrets`** — resolved provider credentials (OpenAI/Anthropic keys, full AWS/GCP/Azure + credential groups), assembled server-side by the connection resolver. +- **`toolCallback.authorization`** — a bearer token the runner uses to call gateway tools back + on the platform. +- **`trace.authorization`** — the caller's `Authorization` header value, reused to export OTel + spans. + +The transport sets **no auth and does no TLS** (`deliver_http` does a plain +`client.post(url, json=payload)`). So whatever address `select_backend` resolves, the service +will ship real credentials and reusable bearer tokens to it in plaintext. **If a caller can +choose that address, a caller can choose where the service sends its secrets.** That is the +load-bearing security fact; see [security.md](security.md). + +## 7. Related work this sits next to + +- **sidecar-deployment-proposal** (`../sidecar-deployment-proposal/`) — owns the *default* + runner deployment contract: `AGENTA_AGENT_RUNNER_URL` defaulting to + `http://sandbox-agent:8765`, the Compose/Helm/Railway wiring, the `sandbox-agent` service + naming. This `uri` is the *per-run override* layered on that default. The two are + complementary: the proposal sets the fallback; this sets the optional override. +- **sidecar-trust-and-sandbox-enforcement** (`../sidecar-trust-and-sandbox-enforcement/`) — + documents that `/run` is unauthenticated plaintext on an assumed-trusted network, and that the + near-term hardening is (1) loopback/in-cluster binding and (2) an optional shared `/run` + token. A caller-supplied `uri` directly stresses assumption (1): it lets the chosen address + leave the trusted network. The restriction in [security.md](security.md) is designed to keep + that assumption intact. diff --git a/docs/design/agent-workflows/projects/sidecar-uri-config/security.md b/docs/design/agent-workflows/projects/sidecar-uri-config/security.md new file mode 100644 index 0000000000..ac61a2fea6 --- /dev/null +++ b/docs/design/agent-workflows/projects/sidecar-uri-config/security.md @@ -0,0 +1,85 @@ +# Security + +The whole design hinges on one fact: **a sidecar address chosen by the caller is an address the +service will send its secrets to.** This page is the threat model and the proposed restriction. +It is the load-bearing decision; the field itself is trivial without it. + +## Why a caller-supplied address is dangerous + +The service builds every `/run` body server-side and POSTs it, over plain HTTP with no auth, to +whatever URL `select_backend` resolves. That body can carry (verified in research §6 and the +sidecar-trust project): + +- **`secrets`** — resolved provider credentials: OpenAI / Anthropic API keys and the full + AWS / GCP / Azure credential groups, picked by the connection resolver from the project vault. +- **`toolCallback.authorization`** — a bearer token the runner uses to call gateway tools back + on the platform. +- **`trace.authorization`** — the caller's own `Authorization` header value, reused for OTel + export. + +If the address comes from the request, then a request controls **where real provider keys and +two reusable bearer tokens are shipped, in plaintext**. Concretely: + +| Threat | What an attacker gets | +| --- | --- | +| **Secret exfiltration** | Point `uri` at an attacker-controlled host; the service POSTs the vault-resolved provider keys + bearer tokens straight to it. | +| **SSRF** | Point `uri` at an internal address (`http://169.254.169.254/...`, a cluster service, `http://localhost:`); the service makes the request from inside the trust boundary, with the bearer token attached. | +| **Trusted-network bypass** | The sidecar-trust project's near-term defense is "the sidecar only lives on a trusted network." A free-form caller `uri` lets the run leave that network entirely, voiding the assumption. | + +This is why the reviewer's "the sandbox should probably use this uri to route" must be paired +with a server-side gate. The convenience (per-run runner selection) and the risk (per-run +secret destination) are the same mechanism. + +## The restriction: a server-side allowlist, default-off + +The override is honored **only** when its address is on a server-configured allowlist. The +caller proposes; the server disposes. + +- **`AGENTA_AGENT_RUNNER_URI_ALLOWLIST`** — a comma-separated list of trusted sidecar origins, + added to `api/oss/src/utils/env.py` and read via the shared `env` object (repo rule: no raw + `os.getenv` for app config). +- **Default empty → every override rejected.** Out of the box, only `AGENTA_AGENT_RUNNER_URL` / + the local CLI work; the `uri` field is inert until an operator opts in. This is the safe + default: the feature ships off. +- `validate_runner_uri(uri)` compares the override's **origin** (scheme + host + port) against + the allowlist. A match is honored; a miss raises a typed error surfaced as a 4xx. It does + **not** silently fall back to the env var (that would let a caller probe the allowlist by + difference and would mask a misconfiguration). + +### Allowlist matching details to settle + +- **Match on origin, not substring.** Parse the URL and compare scheme+host+port exactly, so + `http://evil.com/?x=http://trusted` cannot smuggle a trusted substring past the check. +- **Block link-local / metadata by default.** Even with an empty allowlist this is moot (all + rejected), but if an operator adds a broad entry, reject `169.254.0.0/16`, + `100.64.0.0/10` (and the IPv6 equivalents) unless explicitly listed. SSRF defense in depth. +- **Loopback handling.** Decide whether `127.0.0.1` / `localhost` is implicitly allowed (it is + the common single-host dev case) or must be listed like any other origin. Leaning: require it + to be listed too, so the default is uniformly "nothing is trusted." (Open question O2.) +- **Scheme.** Restrict to `http`/`https`; reject `file:`, `gopher:`, etc. + +## Relationship to the sidecar-trust project + +This restriction is a **complement**, not a replacement, for the +[sidecar-trust-and-sandbox-enforcement](../sidecar-trust-and-sandbox-enforcement/README.md) +near-term hardening: + +- That project's **step 1** (loopback / in-cluster-only binding) assumes the *set* of reachable + sidecars is fixed by the network. A caller `uri` would break that assumption — the allowlist + is exactly what re-establishes it at the application layer: the run can only target an address + an operator pre-approved. +- That project's **step 2** (an optional shared `/run` token, `AGENTA_AGENT_RUNNER_TOKEN`) + protects *any* address the service calls, including an allowlisted override. The two stack: the + allowlist decides *which* sidecars are legal destinations; the token authenticates the service + *to* that sidecar. +- The heavier items there (TLS / mTLS, short-lived scoped tokens, payload encryption) reduce the + damage if an allowlisted address is ever compromised. They remain that project's deferred + scope; this project does not design them. + +## Recommendation + +Ship the `uri` field and the allowlist gate **together, in the same change**, with the allowlist +**default-empty (feature off)**. A caller-supplied address must never be trusted before the +restriction exists. If the reviewer prefers the field to be operator-only (not playground- +editable) the risk surface shrinks further but the server-side allowlist is still required, +because a direct API caller bypasses the playground entirely. diff --git a/docs/design/agent-workflows/projects/sidecar-uri-config/status.md b/docs/design/agent-workflows/projects/sidecar-uri-config/status.md new file mode 100644 index 0000000000..7fc734f7d6 --- /dev/null +++ b/docs/design/agent-workflows/projects/sidecar-uri-config/status.md @@ -0,0 +1,69 @@ +# Status + +## State + +- **IMPLEMENTED.** Spun out of PR #4821 review comment + [3469613625](https://github.com/Agenta-AI/agenta/pull/4821#discussion_r3469613625); design + reviewed in PR #4836, then built per the reviewer's D7 decision. +- Built on [#4840](https://github.com/Agenta-AI/agenta/pull/4840) (config-structure cleanup), + which retired `RunSelection` and put the run-selection fields on `AgentConfig`. The original + plan/research below were written against `RunSelection`; the field now lives on `AgentConfig`. + +## Decisions taken + +- **D1 — `uri` lives on `AgentConfig`** (was `RunSelection` in the original design; #4840 retired + that class). It is *where a run goes*, parsed in the one `AgentConfig.from_params`. +- **D2 — Not a `/run` wire field.** `uri` is consumed entirely service-side in `select_backend`; + the runner never receives it. Golden wire fixtures stay byte-identical. +- **D3 — Precedence: `agent_config.uri` (validated) → `AGENTA_AGENT_RUNNER_URL` → local CLI.** + Unset `uri` is exactly today's behavior. +- **D4 — Allowlist gate, default-off.** `AGENTA_AGENT_RUNNER_URI_ALLOWLIST` (default empty = every + override rejected). A rejected override fails loud (no silent fallback). Origin match (not + substring), `http`/`https` only. See [security.md](security.md). +- **D5 — No back-compat / migration.** Pre-production POC. +- **D7 (reviewer) — `uri` REPLACES `sandbox`.** The `sandbox` field is removed from `AgentConfig`, + `AgentConfigSchema`, `build_agent_v0_default`, and the FE control. The sidecar address drives + routing; each sidecar picks its own sandbox provider (local/Daytona) from its own env + (`SANDBOX_AGENT_PROVIDER`). No per-run sandbox selector remains. + +## What changed (this PR) + +- **SDK** (`sdks/python/agenta/sdk/agents/dtos.py`): removed `AgentConfig.sandbox`, added + `uri: Optional[str]`; `_parse_run_selection` parses `uri` (trim-or-`None` via `_clean_uri`, + not case-folded); docstrings updated. +- **SDK schema** (`sdks/python/agenta/sdk/utils/types.py`): removed the `sandbox` field + + `_DEFAULT_SANDBOX` from `AgentConfigSchema`; added an optional `uri` field; `build_agent_v0_default` + no longer emits `sandbox` (and omits `uri`, so the default config carries no routing override). +- **Service** (`services/oss/src/agent/config.py`): `AGENTA_AGENT_RUNNER_URI_ALLOWLIST`, + `runner_uri_allowlist()`, `validate_runner_uri()`, `resolve_runner_url()`, and the typed + `UnsupportedRunnerUriError`. +- **Service handler** (`services/oss/src/agent/app.py`): `select_backend` routes by + `resolve_runner_url(agent_config.uri)` and sends a constant `sandbox="local"` default; + `RuntimeAuthContext.backend` no longer set from `sandbox`. +- **FE**: removed the sandbox `EnumSelectControl` from `AgentConfigControl.tsx`; dropped the + `sandbox` default in the playground request builder (`agentRequest.ts`) and the dead `sandbox` + arm of the agent-mode heuristic (`selectors.ts`). No `uri` control is surfaced (operator/testing + concern). +- **Tests**: SDK `test_dtos_agent_config.py`, service `test_select_backend.py` (full allowlist + + precedence matrix) and `test_default_agent_config.py`, FE `agentMode.test.ts` / + `agentRequest.test.ts`. Wire-contract golden untouched (proves `uri` is not on the wire). +- **Docs**: `agent-config-schema.md`, `agent-service-handler.md`, `service-to-agent-runner.md`, + `running-the-agent.md` (new env var), and this workspace. + +## Deferred + +- **Wire-level `sandbox` removal.** The `/run` wire still carries a constant `sandbox: "local"` + the unchanged runner defaults correctly on. Fully removing the field would force a + `protocol.ts` / golden / runner-branch change, so it is left as a runner-branch follow-up + (this slice is config/service/FE only and keeps the golden unchanged, per POC scope). +- The deeper transport hardening (auth token, TLS, scoped tokens) stays in the + [sidecar-trust-and-sandbox-enforcement](../sidecar-trust-and-sandbox-enforcement/) project. + +## Resolved open questions + +- **O1 (allowlist as the gate)** → yes, `AGENTA_AGENT_RUNNER_URI_ALLOWLIST`, default-off. +- **O2 (loopback implicit?)** → no implicit loopback; even `127.0.0.1` must be listed + (uniform "nothing trusted" default). +- **O3 (playground-visible?)** → API-only; the playground surfaces no `uri` control. +- **O4 (reject vs silent fallback)** → reject (fail loud), confirmed. +- **O5 (field/env name)** → `uri` on the config, `AGENTA_AGENT_RUNNER_URI_ALLOWLIST` for the gate. diff --git a/sdks/python/agenta/sdk/agents/dtos.py b/sdks/python/agenta/sdk/agents/dtos.py index 16395bd421..a48104a1ca 100644 --- a/sdks/python/agenta/sdk/agents/dtos.py +++ b/sdks/python/agenta/sdk/agents/dtos.py @@ -407,12 +407,16 @@ class AgentConfig(BaseModel): ``AGENTS.md``. ``tools`` are provider-agnostic references; resolving them into runnable specs is the caller's job (the Agenta service does it server-side). - ``harness`` / ``sandbox`` / ``permission_policy`` are the run-selection fields: which - coding agent to drive, where it runs, and how a permission-gating harness answers tool-use - prompts in a headless run. They live on ``AgentConfig`` (under ``data.parameters.agent``) - rather than a separate object — there is one agent definition, not an agent plus a sidecar - selection. ``sandbox`` is a backend/environment concern the caller reads to pick a backend; - it never enters ``SessionConfig`` or the neutral run. + ``harness`` / ``uri`` / ``permission_policy`` are the run-selection fields: which coding + agent to drive, which sidecar (agent runner) address routes the run, and how a + permission-gating harness answers tool-use prompts in a headless run. They live on + ``AgentConfig`` (under ``data.parameters.agent``) rather than a separate object — there is + one agent definition, not an agent plus a sidecar selection. ``uri`` is a routing concern + the caller reads to pick a backend; it never enters ``SessionConfig`` or the neutral run. + The sidecar at ``uri`` is configured (local or Daytona) by its OWN environment, so the + service no longer carries a per-run sandbox selector. ``uri`` unset is the default: the + service falls back to its env-var runner resolution (``AGENTA_AGENT_RUNNER_URL`` / the local + CLI). A caller-supplied ``uri`` is gated server-side by an allowlist before it is trusted. ``harness_kwargs`` is the per-harness escape hatch: a map keyed by harness name (``"pi_core"``, ``"claude"``) whose value is a free-form bag of knobs only that harness @@ -437,12 +441,14 @@ class AgentConfig(BaseModel): skills: List[SkillConfig] = Field(default_factory=list) harness_kwargs: Dict[str, Dict[str, Any]] = Field(default_factory=dict) sandbox_permission: Optional[SandboxPermission] = None - # The run-selection fields (formerly the separate ``RunSelection``): the coding agent to - # drive, where it runs, and the headless permission policy. The caller reads ``harness`` / - # ``sandbox`` to pick a harness class and backend; ``permission_policy`` is the sidecar - # action-permission a gating harness (Claude) consults. + # The run-selection fields: the coding agent to drive, the sidecar address that routes the + # run, and the headless permission policy. The caller reads ``harness`` to pick a harness + # class and ``uri`` to pick the runner backend; ``permission_policy`` is the sidecar + # action-permission a gating harness (Claude) consults. ``uri`` unset (the default) means the + # service falls back to its env-var runner resolution; a set ``uri`` is allowlist-gated + # server-side before it is trusted (it ships secrets to that address). harness: str = "pi_core" - sandbox: str = "local" + uri: Optional[str] = None permission_policy: PermissionPolicy = "auto" @model_validator(mode="before") @@ -478,12 +484,12 @@ def from_params( playground ``prompt`` prompt-template (system message -> instructions, ``llm_config`` -> model + tools), and a flat ``{model, agents_md, tools}``. Unset fields fall back to ``defaults``. ``harness_kwargs`` and the run-selection fields - (``harness`` / ``sandbox`` / ``permission_policy``) are read from the ``agent`` element + (``harness`` / ``uri`` / ``permission_policy``) are read from the ``agent`` element (or the flat request) when present. """ base = defaults or cls() instructions, model, tools = _parse_agent_fields(params, base) - harness, sandbox, permission_policy = _parse_run_selection(params, base) + harness, uri, permission_policy = _parse_run_selection(params, base) return cls( instructions=instructions, model=model, @@ -493,7 +499,7 @@ def from_params( harness_kwargs=_parse_harness_kwargs(params, base), sandbox_permission=_parse_sandbox_permission(params, base), harness=harness, - sandbox=sandbox, + uri=uri, permission_policy=permission_policy, ) @@ -781,9 +787,9 @@ class SessionConfig(BaseModel): ``agent`` is the agent definition. ``secrets`` are provider keys injected as harness env, never written to the agent filesystem. The ``builtin_tools`` / ``custom_tools`` / ``tool_callback`` triple is the resolved tool delivery (Agenta produces it server-side; - empty for a bare standalone run). The agent config's ``sandbox`` field is a - backend/environment concern: the caller reads it to pick a backend BEFORE the session is - built, and the run itself never consumes it (no adapter reads ``agent.sandbox``).""" + empty for a bare standalone run). The agent config's ``uri`` field is a routing concern: + the caller reads it to pick a backend BEFORE the session is built, and the run itself never + consumes it (no adapter reads ``agent.uri``).""" model_config = ConfigDict(populate_by_name=True) @@ -931,23 +937,36 @@ def _parse_harness_kwargs( return options +def _clean_uri(value: Any) -> Optional[str]: + """Trim a sidecar URI to a non-empty string, else ``None``. + + Mirrors the service's ``runner_url()`` trim-or-``None`` behavior so a blank string never + counts as "set" and an unset ``uri`` cleanly takes the env-var fallback path.""" + if not isinstance(value, str): + return None + trimmed = value.strip() + return trimmed or None + + def _parse_run_selection( params: Dict[str, Any], defaults: AgentConfig, -) -> Tuple[str, str, "PermissionPolicy"]: - """Pull the run-selection trio (harness / sandbox / permission_policy) from a request dict. +) -> Tuple[str, Optional[str], "PermissionPolicy"]: + """Pull the run-selection trio (harness / uri / permission_policy) from a request dict. Reads from the ``agent`` element when present, else the flat request, falling back to - ``defaults``. Each value is lower-cased so a playground-supplied ``"Claude"`` / ``"Daytona"`` - matches the bare :class:`HarnessType` / sandbox values the caller selects on.""" + ``defaults``. ``harness`` / ``permission_policy`` are lower-cased so a playground-supplied + ``"Claude"`` / ``"Deny"`` matches the bare values the caller selects on. ``uri`` is trimmed + (blank -> ``None``) but never case-folded (an address is case-sensitive).""" agent = params.get("agent") source = agent if isinstance(agent, dict) else params harness = str(source.get("harness") or defaults.harness).lower() - sandbox = str(source.get("sandbox") or defaults.sandbox).lower() + raw_uri = source.get("uri") + uri = _clean_uri(raw_uri) if raw_uri is not None else defaults.uri permission_policy = str( source.get("permission_policy") or defaults.permission_policy ).lower() - return harness, sandbox, permission_policy + return harness, uri, permission_policy def _parse_sandbox_permission( diff --git a/sdks/python/agenta/sdk/utils/types.py b/sdks/python/agenta/sdk/utils/types.py index 59db5cb391..cce1769ba8 100644 --- a/sdks/python/agenta/sdk/utils/types.py +++ b/sdks/python/agenta/sdk/utils/types.py @@ -1068,7 +1068,6 @@ def _model_catalog_type() -> dict: # `AgentConfig.from_params` falls back to) both consume these via `build_agent_v0_default`, so a new # default changes one place. The harness default also seeds `AgentConfigSchema.harness`. _DEFAULT_HARNESS = "pi_core" -_DEFAULT_SANDBOX = "local" _DEFAULT_PERMISSION_POLICY = "auto" # The schema key carrying each harness option's versioned slug identity (the contract identity in @@ -1104,7 +1103,7 @@ class AgentConfigSchema(AgSchemaMixin): field shapes live in Pydantic (single source of truth) instead of a hand-written literal. It composes every editable field the control surfaces — the definition (``agents_md``/``model``/``tools``/``mcp_servers``) and the run-selection fields - (``harness``/``sandbox``/``permission_policy``), all one config — and types + (``harness``/``uri``/``permission_policy``), all one config — and types ``tools``/``mcp_servers`` with the real tool-def models so the playground gets typed editors. The runtime ``AgentConfig`` stays permissive (``List[Any]``) because its job is to coerce the loose shapes the playground emits; this model is strict because its job is to describe them. @@ -1153,10 +1152,16 @@ class AgentConfigSchema(AgSchemaMixin): ), json_schema_extra=_harness_field_schema_extra(), ) - sandbox: Literal["local", "daytona"] = Field( - default="local", - title="Sandbox", - description="Where the agent runs: local daemon or a Daytona sandbox.", + uri: Optional[str] = Field( + default=None, + title="Sidecar URI", + description=( + "Optional sidecar (agent runner) address that routes this run. When set, the " + "server routes the run to this address (gated by a server-side allowlist); when " + "unset, the server falls back to its env-var runner resolution. The sidecar is " + "configured local or Daytona by its own environment, not per-run. Operator/testing " + "concern, not a normal authoring field." + ), ) permission_policy: Literal["auto", "deny"] = Field( default="auto", @@ -1194,9 +1199,11 @@ def build_agent_v0_default( """The default `agent_config` value, shared by the builtin interface and the service. Base shape (always): instructions, model, empty tools/MCP, the run selection - (harness/sandbox/permission policy). ``include_sandbox_permission`` adds the declared - Layer-2 boundary the playground pre-fills (network egress on, strict). ``skill_slug`` adds - one ``@ag.embed`` reference to a stored skill the backend inlines before the runner sees it + (harness/permission policy). ``uri`` is deliberately omitted from the default (unset means + the env-var runner fallback, the out-of-the-box behavior), so the shipped default config + carries no routing override. ``include_sandbox_permission`` adds the declared Layer-2 + boundary the playground pre-fills (network egress on, strict). ``skill_slug`` adds one + ``@ag.embed`` reference to a stored skill the backend inlines before the runner sees it (the service passes the reserved platform default skill; the SDK builtin passes none).""" config: Dict[str, Any] = { "agents_md": _DEFAULT_AGENTS_MD, @@ -1204,7 +1211,6 @@ def build_agent_v0_default( "tools": [], "mcp_servers": [], "harness": _DEFAULT_HARNESS, - "sandbox": _DEFAULT_SANDBOX, "permission_policy": _DEFAULT_PERMISSION_POLICY, } if include_sandbox_permission: diff --git a/sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py b/sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py index d0913e03df..13154387ed 100644 --- a/sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py +++ b/sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py @@ -2,8 +2,8 @@ The handler parses whatever the playground or a stored config sends into one ``AgentConfig``. This file locks the three accepted shapes, the defaults fall-through, the ``harness_kwargs`` -escape hatch, and the run-selection parsing (``harness`` / ``sandbox`` / ``permission_policy``, -which now live on ``AgentConfig`` rather than a separate ``RunSelection``). +escape hatch, and the run-selection parsing (``harness`` / ``uri`` / ``permission_policy``, +which live on ``AgentConfig`` rather than a separate ``RunSelection``). """ from __future__ import annotations @@ -186,9 +186,9 @@ def test_harness_kwargs_explicit_empty_clears_defaults(): def test_run_selection_defaults(): config = AgentConfig.from_params({}) - assert (config.harness, config.sandbox, config.permission_policy) == ( + assert (config.harness, config.uri, config.permission_policy) == ( "pi_core", - "local", + None, "auto", ) @@ -198,25 +198,36 @@ def test_run_selection_reads_agent_subdict_and_lowercases(): { "agent": { "harness": "Claude", - "sandbox": "Daytona", + "uri": "http://Sidecar:8765", "permission_policy": "Deny", } } ) - assert (config.harness, config.sandbox, config.permission_policy) == ( + # harness/permission_policy are case-folded; uri is not (an address is case-sensitive). + assert (config.harness, config.uri, config.permission_policy) == ( "claude", - "daytona", + "http://Sidecar:8765", "deny", ) def test_run_selection_honors_defaults(): - defaults = AgentConfig(harness="claude", sandbox="daytona") + defaults = AgentConfig(harness="claude", uri="http://default-sidecar:8765") config = AgentConfig.from_params({}, defaults=defaults) assert config.harness == "claude" - assert config.sandbox == "daytona" + assert config.uri == "http://default-sidecar:8765" def test_run_selection_reads_flat_request(): config = AgentConfig.from_params({"harness": "claude"}) assert config.harness == "claude" + + +def test_uri_blank_is_treated_as_unset(): + config = AgentConfig.from_params({"agent": {"uri": " "}}) + assert config.uri is None + + +def test_uri_trimmed(): + config = AgentConfig.from_params({"agent": {"uri": " http://sidecar:8765 "}}) + assert config.uri == "http://sidecar:8765" diff --git a/services/oss/src/agent/app.py b/services/oss/src/agent/app.py index 54bf3e136b..49205e6bbd 100644 --- a/services/oss/src/agent/app.py +++ b/services/oss/src/agent/app.py @@ -8,8 +8,9 @@ from the config's run-selection fields, and records the run's usage. The sandbox-agent-backed backend is the production path. The transport is a deployment -choice: HTTP to `AGENTA_AGENT_RUNNER_URL`, or a local runner CLI in a source checkout. -The harness, sandbox, and permission policy are editable fields on the agent config. +choice: the agent config's ``uri`` (allowlist-gated) routes the run, else HTTP to +`AGENTA_AGENT_RUNNER_URL`, else a local runner CLI in a source checkout. The harness, the +sidecar ``uri``, and the permission policy are editable fields on the agent config. """ from typing import Any, Dict, List, Optional @@ -55,7 +56,7 @@ from agenta.sdk.utils.logging import get_module_logger -from oss.src.agent.config import load_config, runner_dir, runner_url +from oss.src.agent.config import load_config, resolve_runner_url, runner_dir from oss.src.agent.schemas import AGENT_SCHEMAS from oss.src.agent.tools import resolve_mcp_servers, resolve_tools from oss.src.agent.tracing import record_usage, trace_context @@ -183,17 +184,25 @@ async def _resolve_session_connection( return resolved -def select_backend(agent_config: AgentConfig) -> Backend: - """Pick the backend for a run from the agent config's run-selection fields. +# The wire still carries a ``sandbox`` field the unchanged runner reads; the per-run sandbox +# selector is gone (each sidecar is configured local-or-Daytona by its own env), so the service +# sends this safe default. Removing the wire field entirely is a runner-branch follow-up. +_DEFAULT_SANDBOX = "local" + - The service always uses the sandbox-agent-backed runner. `AGENTA_AGENT_RUNNER_URL` - selects HTTP transport in deployed containers. When it is unset, local development - spawns the TypeScript runner CLI from the runner dir. Only ``sandbox`` is read here; - it is a backend/environment concern that never enters ``SessionConfig``. +def select_backend(agent_config: AgentConfig) -> Backend: + """Pick the runner backend for a run, routing by the agent config's ``uri``. + + Routing precedence: the config's ``uri`` (validated against the server-side allowlist) wins + when set; otherwise the env var ``AGENTA_AGENT_RUNNER_URL`` selects HTTP transport in deployed + containers; otherwise local development spawns the TypeScript runner CLI from the runner dir. + A set-but-disallowed ``uri`` raises (no silent fallback). The sidecar at the resolved address + is configured local-or-Daytona by its OWN environment, so the service sends a constant + ``sandbox`` default rather than a per-run selector. """ return SandboxAgentBackend( - sandbox=agent_config.sandbox, - url=runner_url(), + sandbox=_DEFAULT_SANDBOX, + url=resolve_runner_url(agent_config.uri), cwd=str(runner_dir()), ) @@ -223,9 +232,9 @@ async def _agent( resolved_connection: Optional[ResolvedConnection] = None secrets: Dict[str, str] = {} if model_ref is not None: - ctx = RuntimeAuthContext( - harness=agent_config.harness, backend=agent_config.sandbox - ) + # ``backend`` (sandbox provider) is no longer a per-run selector; the capability check is + # harness-only, so it is left unset here. + ctx = RuntimeAuthContext(harness=agent_config.harness) resolved_connection = await _resolve_session_connection(model_ref, ctx) secrets = resolved_connection.env diff --git a/services/oss/src/agent/config.py b/services/oss/src/agent/config.py index 0f5814f6a9..a9f2f0cdbc 100644 --- a/services/oss/src/agent/config.py +++ b/services/oss/src/agent/config.py @@ -10,6 +10,7 @@ from dataclasses import dataclass, field from pathlib import Path from typing import Any, List, Optional +from urllib.parse import urlsplit # services/oss/src/agent/config.py -> parents[3] == services/ _SERVICES_DIR = Path(__file__).resolve().parents[3] @@ -49,6 +50,68 @@ def runner_url() -> Optional[str]: return value.strip() if value and value.strip() else None +class UnsupportedRunnerUriError(Exception): + """A caller-supplied sidecar ``uri`` is not on the server-side allowlist. + + Raised (rather than silently falling back to the env var) so a disallowed address fails + loud: a silent fallback would let a caller probe the allowlist by difference and would mask + a misconfiguration. The handler surfaces this as a 4xx.""" + + def __init__(self, uri: str) -> None: + self.uri = uri + super().__init__( + f"sidecar uri {uri!r} is not allowed; " + "add its origin to AGENTA_AGENT_RUNNER_URI_ALLOWLIST" + ) + + +def runner_uri_allowlist() -> List[str]: + """The allowlist of trusted sidecar origins (``scheme://host[:port]``), default empty. + + Read from ``AGENTA_AGENT_RUNNER_URI_ALLOWLIST`` (comma-separated). Empty (the default) + means the feature is OFF: every caller-supplied ``uri`` is rejected and only the env-var / + local-CLI path runs. An operator opts in by listing trusted sidecar origins.""" + raw = os.getenv("AGENTA_AGENT_RUNNER_URI_ALLOWLIST") or "" + return [_origin(entry) for entry in raw.split(",") if entry.strip()] + + +def _origin(value: str) -> str: + """The ``scheme://host[:port]`` origin of a URL, lower-cased scheme+host (port kept as-is). + + Matching is on origin, not substring, so ``http://evil.com/?x=http://trusted`` cannot + smuggle a trusted substring past the check.""" + parts = urlsplit(value.strip()) + netloc = parts.netloc or parts.path # tolerate a bare host with no scheme + return f"{parts.scheme.lower()}://{netloc.lower()}" + + +def validate_runner_uri(uri: str) -> str: + """Return ``uri`` when its origin is on the allowlist; raise otherwise. + + SSRF / secret-exfiltration gate: the service ships resolved provider keys and bearer tokens + to whatever address it picks, so a caller-supplied address is honored only when an operator + pre-approved its origin. Restricts the scheme to ``http``/``https`` (rejects ``file:`` etc.) + and matches the parsed origin exactly against the allowlist.""" + parts = urlsplit(uri.strip()) + if parts.scheme.lower() not in ("http", "https") or not parts.netloc: + raise UnsupportedRunnerUriError(uri) + if _origin(uri) not in runner_uri_allowlist(): + raise UnsupportedRunnerUriError(uri) + return uri.strip() + + +def resolve_runner_url(override: Optional[str]) -> Optional[str]: + """Routing precedence: a validated request override, else the env var, else ``None``. + + ``override`` (the agent config's ``uri``) wins when set and allowlisted; a rejected override + raises (``UnsupportedRunnerUriError``) rather than falling back. When ``override`` is unset + this is exactly today's behavior: ``AGENTA_AGENT_RUNNER_URL`` if set, else ``None`` (the + local runner CLI in ``AGENTA_AGENT_RUNNER_DIR``).""" + if override: + return validate_runner_uri(override) + return runner_url() + + def config_dir() -> Path: """Directory holding AGENTS.md and agent.json.""" override = os.getenv("AGENTA_AGENT_CONFIG_DIR") diff --git a/services/oss/tests/pytest/unit/agent/test_default_agent_config.py b/services/oss/tests/pytest/unit/agent/test_default_agent_config.py index aa1e495327..858724f302 100644 --- a/services/oss/tests/pytest/unit/agent/test_default_agent_config.py +++ b/services/oss/tests/pytest/unit/agent/test_default_agent_config.py @@ -56,7 +56,8 @@ def test_inspect_default_parses_into_the_runtime_selection(): config.sandbox_permission is not None ) # the service boundary survives the parse assert config.harness == "pi_core" - assert config.sandbox == "local" + # uri is omitted from the default config, so it parses to None (the env-var fallback path). + assert config.uri is None assert config.permission_policy == "auto" diff --git a/services/oss/tests/pytest/unit/agent/test_select_backend.py b/services/oss/tests/pytest/unit/agent/test_select_backend.py index e81e9ce550..aac3fd0156 100644 --- a/services/oss/tests/pytest/unit/agent/test_select_backend.py +++ b/services/oss/tests/pytest/unit/agent/test_select_backend.py @@ -1,4 +1,12 @@ -"""``select_backend``: the service always uses the sandbox-agent runner backend.""" +"""``select_backend``: the service routes a run to the sandbox-agent runner. + +Routing precedence is the agent config's ``uri`` (allowlist-gated) -> ``AGENTA_AGENT_RUNNER_URL`` +-> the local runner CLI. A caller-supplied ``uri`` is honored only when its origin is on +``AGENTA_AGENT_RUNNER_URI_ALLOWLIST`` (default empty = feature off, every override rejected). A +disallowed ``uri`` fails loud (no silent fallback). The per-run sandbox selector is gone: the +sidecar is configured local-or-Daytona by its own env, so the backend always carries the constant +``local`` sandbox default. +""" from __future__ import annotations @@ -13,6 +21,7 @@ ) from oss.src.agent.app import select_backend +from oss.src.agent.config import UnsupportedRunnerUriError @pytest.fixture @@ -25,38 +34,37 @@ def runner_wrapper(tmp_path: Path) -> Path: @pytest.fixture(autouse=True) def _clean_env(monkeypatch, runner_wrapper: Path): - # Start every case from a known-empty deployment environment. + # Start every case from a known-empty deployment environment (no env URL, empty allowlist). monkeypatch.delenv("AGENTA_AGENT_RUNNER_URL", raising=False) + monkeypatch.delenv("AGENTA_AGENT_RUNNER_URI_ALLOWLIST", raising=False) monkeypatch.setenv("AGENTA_AGENT_RUNNER_DIR", str(runner_wrapper)) -def _sel(harness="pi_core", sandbox="local"): - return AgentConfig(harness=harness, sandbox=sandbox) +def _sel(harness="pi_core", uri=None): + return AgentConfig(harness=harness, uri=uri) @pytest.mark.parametrize("harness", ["pi_core", "pi_agenta", "claude"]) def test_all_harnesses_use_sandbox_agent_backend(harness): - assert isinstance(select_backend(_sel(harness, "local")), SandboxAgentBackend) - + assert isinstance(select_backend(_sel(harness)), SandboxAgentBackend) -def test_non_local_sandbox_is_threaded_through(): - backend = select_backend(_sel("pi_core", "daytona")) - assert isinstance(backend, SandboxAgentBackend) - assert backend._sandbox == "daytona" +def test_backend_always_carries_the_constant_local_sandbox(): + # No per-run sandbox selector anymore; the wire sandbox is a constant default. + assert select_backend(_sel("pi_core"))._sandbox == "local" def test_runner_url_selects_http_transport(monkeypatch): monkeypatch.setenv("AGENTA_AGENT_RUNNER_URL", "http://sandbox-agent:8765") - backend = select_backend(_sel("pi_core", "local")) + backend = select_backend(_sel("pi_core")) assert backend._url == "http://sandbox-agent:8765" def test_no_runner_url_uses_subprocess_transport(): - # Unset URL means the backend will spawn the runner CLI from a local checkout. - assert select_backend(_sel("pi_core", "local"))._url is None + # Unset URL and unset uri means the backend will spawn the runner CLI from a local checkout. + assert select_backend(_sel("pi_core"))._url is None def test_no_runner_url_requires_runner_assets(monkeypatch, tmp_path: Path): @@ -65,4 +73,69 @@ def test_no_runner_url_requires_runner_assets(monkeypatch, tmp_path: Path): monkeypatch.setenv("AGENTA_AGENT_RUNNER_DIR", str(missing_wrapper)) with pytest.raises(AgentRunnerConfigurationError, match="src/cli.ts"): - select_backend(_sel("pi_core", "local")) + select_backend(_sel("pi_core")) + + +def test_allowlisted_uri_routes_to_the_override(monkeypatch): + monkeypatch.setenv( + "AGENTA_AGENT_RUNNER_URI_ALLOWLIST", "http://trusted-sidecar:8765" + ) + + backend = select_backend(_sel("pi_core", uri="http://trusted-sidecar:8765/run")) + + assert backend._url == "http://trusted-sidecar:8765/run" + + +def test_allowlisted_uri_beats_the_env_var(monkeypatch): + monkeypatch.setenv("AGENTA_AGENT_RUNNER_URL", "http://env-sidecar:8765") + monkeypatch.setenv( + "AGENTA_AGENT_RUNNER_URI_ALLOWLIST", "http://trusted-sidecar:8765" + ) + + backend = select_backend(_sel("pi_core", uri="http://trusted-sidecar:8765")) + + assert backend._url == "http://trusted-sidecar:8765" + + +def test_uri_unset_falls_back_to_env_var(monkeypatch): + monkeypatch.setenv("AGENTA_AGENT_RUNNER_URL", "http://env-sidecar:8765") + + backend = select_backend(_sel("pi_core", uri=None)) + + assert backend._url == "http://env-sidecar:8765" + + +def test_disallowed_uri_raises_no_silent_fallback(monkeypatch): + # Env var set AND a uri given, but the uri is not allowlisted -> raise, do NOT fall back. + monkeypatch.setenv("AGENTA_AGENT_RUNNER_URL", "http://env-sidecar:8765") + monkeypatch.setenv( + "AGENTA_AGENT_RUNNER_URI_ALLOWLIST", "http://trusted-sidecar:8765" + ) + + with pytest.raises(UnsupportedRunnerUriError): + select_backend(_sel("pi_core", uri="http://evil.com:8765")) + + +def test_uri_with_empty_allowlist_is_rejected_feature_off(): + # Default: empty allowlist -> every override rejected (feature ships off). + with pytest.raises(UnsupportedRunnerUriError): + select_backend(_sel("pi_core", uri="http://trusted-sidecar:8765")) + + +def test_origin_match_not_substring(monkeypatch): + # A trusted substring smuggled in the query/path must not pass an origin check. + monkeypatch.setenv( + "AGENTA_AGENT_RUNNER_URI_ALLOWLIST", "http://trusted-sidecar:8765" + ) + + with pytest.raises(UnsupportedRunnerUriError): + select_backend( + _sel("pi_core", uri="http://evil.com/?x=http://trusted-sidecar:8765") + ) + + +def test_non_http_scheme_rejected(monkeypatch): + monkeypatch.setenv("AGENTA_AGENT_RUNNER_URI_ALLOWLIST", "file:///etc/passwd") + + with pytest.raises(UnsupportedRunnerUriError): + select_backend(_sel("pi_core", uri="file:///etc/passwd")) diff --git a/web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx b/web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx index baae5d33f6..d5e5cb2279 100644 --- a/web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx +++ b/web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx @@ -6,7 +6,9 @@ * It reuses the existing controls rather than inventing new ones: the model selector * (GroupedChoiceControl), the tool picker (ToolSelectorPopover + ToolItemControl), the MCP * server editor (McpServerItemControl), the skill editor (SkillConfigControl), enum selects - * (harness, sandbox, permission policy), and a textarea (agents_md). The field shape is the + * (harness, permission policy), and a textarea (agents_md). The sidecar `uri` is an + * operator/testing routing override (allowlist-gated server-side), not a normal authoring + * field, so the playground does not surface a control for it. The field shape is the * `agent_config` catalog type generated * from the SDK model (AgentConfigSchema in agenta.sdk.utils.types); the agent service ships a * thin `x-ag-type-ref` the playground resolves and reads back (services/oss/src/agent). @@ -669,15 +671,6 @@ export function AgentConfigControl({ disabled={disabled} /> - setField("sandbox", v)} - withTooltip={withTooltip} - disabled={disabled} - /> - { } /** - * Default the agent run-selection fields (`harness`/`sandbox`) onto the AGENT CONFIG - * (`parameters.agent`), not as top-level params siblings. They are part of one `AgentConfig` - * now, so they belong inside the `agent` block. A value the resolved config already carries - * always wins; the schema nests the config under `agent`, but a flat config (no `agent` key) - * is still defaulted at the top level so a non-schema config keeps working. + * Default the agent run-selection field `harness` onto the AGENT CONFIG (`parameters.agent`), + * not as a top-level params sibling. It is part of one `AgentConfig` now, so it belongs inside + * the `agent` block. A value the resolved config already carries always wins; the schema nests + * the config under `agent`, but a flat config (no `agent` key) is still defaulted at the top + * level so a non-schema config keeps working. The sidecar `uri` is left unset (the server's + * env-var routing fallback); it is an operator override, not a per-run default. */ // Legacy pre-migration run-selection keys that now live inside `agent`. Stripped from the // top level when `agent` is present so we never emit both wire shapes for one config. @@ -192,10 +193,10 @@ const withAgentRunDefaults = (config: Record): Record)}, + agent: {harness: "pi_core", ...(agent as Record)}, } } - return {harness: "pi_core", sandbox: "local", ...config} + return {harness: "pi_core", ...config} } const withQuery = (url: string, params: Record): string => { @@ -241,8 +242,8 @@ export async function buildAgentRequest( | Record | null | undefined - // `harness`/`sandbox` are run-selection fields on the AGENT CONFIG (`parameters.agent`), not - // top-level params siblings. Default them inside the `agent` block, never overriding values + // `harness` is a run-selection field on the AGENT CONFIG (`parameters.agent`), not a + // top-level params sibling. Default it inside the `agent` block, never overriding a value // the resolved config already carries. const parameters = pruneBlankEntries(withAgentRunDefaults(config ?? {})) as Record< string, diff --git a/web/packages/agenta-playground/src/state/execution/selectors.ts b/web/packages/agenta-playground/src/state/execution/selectors.ts index 2eec491f25..a57f06d6d6 100644 --- a/web/packages/agenta-playground/src/state/execution/selectors.ts +++ b/web/packages/agenta-playground/src/state/execution/selectors.ts @@ -1280,18 +1280,18 @@ export const isAgentModeAtomFamily = atomFamily((entityId: string) => if (get(workflowMolecule.selectors.workflowType(entityId)) === "agent") return true // Schema-marker detection — matches the left-panel agent config control, // so the generations panel agrees with the rendered config schema even - // before WP-6 sets is_agent. (The harness/sandbox heuristic below missed - // because those values live nested inside the agent_config block, not at + // before WP-6 sets is_agent. (The harness heuristic below missed + // because that value lives nested inside the agent_config block, not at // the top level of `configuration`.) if (schemaMarksAgentConfig(get(workflowMolecule.selectors.parametersSchema(entityId)))) return true - // Legacy heuristic — a stored config carrying top-level harness/sandbox. + // Legacy heuristic — a stored config carrying top-level harness. // Kept as a last resort; delete once WP-6 is the sole source of truth. const config = get(workflowMolecule.selectors.configuration(entityId)) as | Record | null | undefined - return Boolean(config?.harness || config?.sandbox) + return Boolean(config?.harness) }), ) diff --git a/web/packages/agenta-playground/tests/unit/agentMode.test.ts b/web/packages/agenta-playground/tests/unit/agentMode.test.ts index 9e4322e25f..b370bf31c2 100644 --- a/web/packages/agenta-playground/tests/unit/agentMode.test.ts +++ b/web/packages/agenta-playground/tests/unit/agentMode.test.ts @@ -9,11 +9,11 @@ * - per-entity: two entities in the same store resolve independently (mixed * comparison grids must not misroute) * - schema-marker detection: the config schema carrying an `agent_config` - * block (`x-ag-type-ref`/`x-ag-type`) detects as agent even when the harness/ - * sandbox values live nested (not at the top level of `configuration`) — this + * block (`x-ag-type-ref`/`x-ag-type`) detects as agent even when the harness + * value lives nested (not at the top level of `configuration`) — this * is the signal the left-panel AgentConfigControl already dispatches on * - heuristic fallback: until WP-6, a stored config carrying top-level - * `harness`/`sandbox` detects as agent even when workflowType hasn't resolved + * `harness` detects as agent even when workflowType hasn't resolved * to "agent" yet * * The workflow molecule is mocked with writable atoms for `workflowType`, @@ -114,12 +114,12 @@ describe("isAgentModeAtomFamily", () => { expect(store.get(isAgentModeAtomFamily("b"))).toBe(false) }) - it("schema marker: an agent_config property detects as agent even when harness/sandbox are nested", () => { + it("schema marker: an agent_config property detects as agent even when harness is nested", () => { // Real-world shape: backend hasn't set is_agent (rides as `custom`), and - // harness/sandbox live INSIDE the agent_config block — so the top-level + // harness lives INSIDE the agent_config block — so the top-level // configuration heuristic misses. The schema marker is what saves it. setType(store, "ag", "custom") - setConfig(store, "ag", {agent_config: {harness: "pi_core", sandbox: "local"}}) + setConfig(store, "ag", {agent_config: {harness: "pi_core"}}) setSchema(store, "ag", { type: "object", properties: { @@ -144,13 +144,13 @@ describe("isAgentModeAtomFamily", () => { expect(store.get(isAgentModeAtomFamily("noag"))).toBe(false) }) - it("heuristic: a config carrying harness/sandbox detects as agent", () => { + it("heuristic: a config carrying harness detects as agent", () => { setType(store, "h", "custom") // not yet flagged agent - setConfig(store, "h", {harness: "pi_core", sandbox: "local"}) + setConfig(store, "h", {harness: "pi_core"}) expect(store.get(isAgentModeAtomFamily("h"))).toBe(true) }) - it("heuristic does not false-positive without harness/sandbox", () => { + it("heuristic does not false-positive without harness", () => { setType(store, "plain", "completion") setConfig(store, "plain", {temperature: 0.7}) expect(store.get(isAgentModeAtomFamily("plain"))).toBe(false) diff --git a/web/packages/agenta-playground/tests/unit/agentRequest.test.ts b/web/packages/agenta-playground/tests/unit/agentRequest.test.ts index e787018651..78139eb2fc 100644 --- a/web/packages/agenta-playground/tests/unit/agentRequest.test.ts +++ b/web/packages/agenta-playground/tests/unit/agentRequest.test.ts @@ -156,13 +156,21 @@ describe("buildAgentRequest", () => { }) }) - it("defaults harness/sandbox INSIDE the agent block when the config nests under `agent`", async () => { + it("defaults harness INSIDE the agent block; passes an explicit uri through", async () => { // The schema places the run-selection fields on the agent config (`parameters.agent`), - // not as top-level params siblings. Defaults land there; an explicit value wins. - seed(store, "e", {config: {agent: {model: "gpt-5.5", sandbox: "daytona"}}}) + // not as top-level params siblings. `harness` defaults there; the sidecar `uri` is not + // defaulted (unset = the server's env-var routing fallback) but an explicit one passes + // through unchanged. + seed(store, "e", {config: {agent: {model: "gpt-5.5", uri: "http://sidecar:8765"}}}) const req = await buildAgentRequest("e", [], {sessionId: "s1", store}) const agent = (req!.requestBody.data as any).parameters.agent - expect(agent).toMatchObject({model: "gpt-5.5", harness: "pi_core", sandbox: "daytona"}) + expect(agent).toMatchObject({ + model: "gpt-5.5", + harness: "pi_core", + uri: "http://sidecar:8765", + }) + // no sandbox field defaulted (the per-run sandbox selector is gone) + expect(agent.sandbox).toBeUndefined() // not duplicated at the top level expect((req!.requestBody.data as any).parameters.harness).toBeUndefined() })