8 changes: 6 additions & 2 deletions docs/design/agent-workflows/documentation/ground-truth.md
@@ -63,8 +63,12 @@ this page and the referenced code as the source of truth.
- Warm daemon sessions, ACP `session/load`, and session fork are not wired.
- `AgentaHarness` ships placeholder Agenta preamble, persona, and skill set. It does run on
sandbox-agent local and Daytona, verified by the QA matrix (`projects/qa/findings.md`, F-002).
- The agent is not registered as a first-class built-in workflow type. The builtin interface
exists in the SDK, but the handler is still bound directly (`services/oss/src/agent/app.py:138`).
- The live agent handler is bound to the builtin URI `agenta:builtin:agent:v0`.
`create_agent_app()` (`services/oss/src/agent/app.py`) registers the instrumented `_agent` and the
service interface under that URI, so `retrieve_handler` / `retrieve_interface` return the live
handler and the same schemas `/inspect` advertises. The interface override is process-local to the
agent service. In the agent_config interface, each harness option carries a versioned slug and a
display name (`HARNESS_IDENTITIES`); the stored/wire harness value stays the bare string.
- Per-request model override is not honored on the Pi ACP path. pi-acp accepts only its
default model and silently falls back (`projects/qa/findings.md`, F-007).
- Remote (`http`) MCP servers are skipped by the runner path. Local stdio MCP is the path
2 changes: 1 addition & 1 deletion docs/design/agent-workflows/documentation/protocol.md
@@ -108,7 +108,7 @@ Request fields include:

| Field | Meaning |
| --- | --- |
| `harness` | Harness id: `pi_core`, `pi_agenta`, or `claude`. `pi_core` and `pi_agenta` both drive the `pi` ACP agent; `pi_agenta` is Pi with Agenta's forced skills, prompt, and policy. `claude` drives the `claude` ACP agent. |
| `harness` | Harness id: one of the bare strings `pi_core`, `pi_agenta`, or `claude`. `pi_core` and `pi_agenta` both drive the `pi` ACP agent; `pi_agenta` is Pi with Agenta's forced skills, prompt, and policy. `claude` drives the `claude` ACP agent. The agent_config *interface* attaches a versioned slug and a display name to each value (see [Agent config schema](../interfaces/public-edge/agent-config-schema.md)), but the wire value and the runner selector stay the bare string. |
| `sandbox` | Sandbox id, usually `local` or `daytona`. |
| `sessionId` | External conversation id. The runtime is cold and receives history in `messages`. |
| `agentsMd` | Instructions that become `AGENTS.md`. |
6 changes: 3 additions & 3 deletions docs/design/agent-workflows/interfaces/README.md
@@ -42,17 +42,17 @@ page. `Status` is read from each page's prose: **stable** (wired and unlikely to
| Interface | Blast radius | Owner file(s) | Status | Tests |
|---|---|---|---|---|
| [`/invoke`](public-edge/workflow-invoke.md) | public | `decorators/routing.py`, `models/workflows.py`, `agent/app.py` | stable | `unit/agent/`, `utils/test_messages_endpoint.py` |
| [`/inspect`](public-edge/workflow-inspect.md) | public | `agent/schemas.py`, `models/workflows.py`, `decorators/routing.py` | stable | `unit/agents/test_dtos_agent_config.py` |
| [`/inspect`](public-edge/workflow-inspect.md) | public | `agent/schemas.py`, `agent/app.py` (builtin-URI binding), `models/workflows.py`, `decorators/routing.py` | stable | `unit/agents/test_dtos_agent_config.py`, `unit/agent/test_builtin_uri_binding.py` |
| [`/messages`](public-edge/agent-messages.md) | public | `adapters/vercel/{routing,messages,stream}.py`, `agentRequest.ts` | evolving (create-or-resume not observable until storage lands) | `utils/test_messages_endpoint.py`, `unit/agents/test_ui_messages.py` |
| [Agent config schema](public-edge/agent-config-schema.md) | public | `agent/schemas.py`, `sdk/utils/types.py`, `agents/dtos.py` | stable | `unit/agents/test_dtos_agent_config.py` |
| [Agent config schema](public-edge/agent-config-schema.md) | public | `agent/schemas.py`, `sdk/utils/types.py`, `agents/dtos.py` (`HARNESS_IDENTITIES`) | stable | `unit/agents/test_dtos_agent_config.py`, `unit/agents/test_harness_identity.py` |
| [`/run`](cross-service/service-to-agent-runner.md) | cross-service (the spine) | `protocol.ts`, `utils/wire.py`, `utils/ts_runner.py`, `server.ts`/`cli.ts` | stable (pinned by golden) | `unit/agents/test_wire_contract.py` + `golden/`, `services/agent/tests/unit/wire-contract.test.ts` |
| [Runner to harness](cross-service/runner-to-harness.md) | cross-service (ACP) | `engines/sandbox_agent.ts` + `sandbox_agent/{run-plan,capabilities,permissions}.ts` | evolving | `services/agent/tests/unit/sandbox-agent-*.test.ts` |
| [Runner to MCP server](cross-service/runner-to-mcp-server.md) | cross-service | `agents/mcp/`, `engines/sandbox_agent/mcp.ts`, `tools/{mcp-bridge,mcp-server,relay}.ts` | evolving (stdio wired; remote deferred) | `services/agent/tests/unit/mcp-servers.test.ts` |
| [Runner to tool callback](cross-service/runner-to-tool-callback.md) | cross-service | `tools/{callback,dispatch}.ts`, `apis/fastapi/tools/router.py`, `agent/tools/resolver.py` | stable | `services/agent/tests/unit/{code-tool,extension-tools}.test.ts` |
| [Service and runner trace export](cross-service/service-and-runner-trace-export.md) | cross-service | `agent/tracing.py`, `tracing/otel.ts`, `extensions/agenta.ts` | stable | `services/agent/tests/unit/` |
| [Service to vault and tool providers](cross-service/service-to-vault-and-tool-providers.md) | cross-service (external) | `agent/app.py`, `platform/{resolve,connections}.py`, `agents/capabilities.py`, `tools/router.py` | stable | `unit/agents/connections/`, `unit/agents/platform/`, `unit/agents/tools/` |
| [Agent service handler](in-service/agent-service-handler.md) | in-service | `services/oss/src/agent/app.py` | stable | `services/oss/tests/pytest/unit/agent/` |
| [Neutral runtime DTOs](in-service/neutral-runtime-dtos.md) | in-service | `agents/dtos.py` | stable | `unit/agents/test_dtos_*.py` |
| [Neutral runtime DTOs](in-service/neutral-runtime-dtos.md) | in-service | `agents/dtos.py` | stable | `unit/agents/test_dtos_*.py`, `test_harness_identity.py` |
| [Runtime ports](in-service/runtime-ports.md) | in-service | `agents/interfaces.py` | evolving (`LocalBackend` stub) | `unit/agents/test_environment_lifecycle.py`, `test_harness_adapters.py` |
| [Backend adapter](in-service/backend-adapter.md) | in-service | `agents/adapters/sandbox_agent.py` | stable | `unit/agents/test_runner_adapter_config.py`, `test_environment_lifecycle.py` |
| [Harness adapters](in-service/harness-adapters.md) | in-service | `agents/adapters/harnesses.py`, `agents/dtos.py` | stable | `unit/agents/test_harness_adapters.py`, `test_dtos_harness_configs.py` |
@@ -33,6 +33,23 @@ The handler (`_agent` in `app.py`) takes the workflow envelope's pieces:
`{"role": "assistant", "content": result.output}`.
9. Record usage.

## App build: binding the builtin URI

`create_agent_app()` binds the handler to the canonical builtin URI `agenta:builtin:agent:v0`
instead of letting it fall to an auto `user:custom:...` URI, so the handler and the interface
`/inspect` advertises share one identity. The order avoids two traps:

1. **Instrument before registering.** Call `register_handler(auto_instrument(_agent), uri=...)`,
not `register_handler` on the raw `_agent`. `ag.workflow` only instruments inside its own
`_register_handler`, which it skips once a handler already exists in the registry, so the
service must register the already-instrumented handler itself.
2. **Override the interface.** `register_interface(...)` replaces the SDK's minimal seed for the
URI with the service interface (`AGENT_SCHEMAS`), so `retrieve_interface(uri)` returns what
`/inspect` advertises. This override is process-local to the agent service; the API catalog
still builds from the SDK defaults in its own process.

Then `ag.workflow(uri="agenta:builtin:agent:v0", schemas=AGENT_SCHEMAS, meta=...)(_agent)` resolves
the instrumented handler and merges the registered interface (the passed `schemas`/`meta` win).

## Owned by

- `services/oss/src/agent/app.py`
@@ -22,7 +22,12 @@ All in `dtos.py`. The ones that carry the most weight:
by harness name. Built by `from_params(...)`. The editable schema is
[Agent config schema](../public-edge/agent-config-schema.md).
- **`RunSelection`**: `harness` (default `pi_core`), `sandbox` (default `local`),
`permission_policy` (`auto` | `deny`).
`permission_policy` (`auto` | `deny`). The `harness` value is the bare `HarnessType` string.
- **`HarnessType` and `HARNESS_IDENTITIES`**: the closed harness enum plus the single source for
each harness's interface identity: a versioned slug (`agenta:harness:<value>:v0`, following the
repo's slug grammar) and a display name. The agent_config schema builds its harness `oneOf` from
`HARNESS_IDENTITIES`; the stored/wire value stays the bare enum string, so only the interface
gains the slug and name. See [Agent config schema](../public-edge/agent-config-schema.md).
- **`SessionConfig`**: everything one run needs, assembled by the handler: the agent config,
secrets, resolved connection, permission policy, trace, session id, and the resolved tool
and MCP inputs.
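
As a rough illustration of the `RunSelection` entry above, the sketch below reads the three run-selection values from a `parameters` mapping and normalises `harness` to the bare `HarnessType` string. It is not the code in `dtos.py`: the field types, the validation behaviour, the `from_params` signature, and the `auto` default for `permission_policy` (taken from the agent config schema table) are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping, Optional


class HarnessType(str, Enum):
    """Closed harness enum; member values mirror the bare wire strings."""
    PI_CORE = "pi_core"
    PI_AGENTA = "pi_agenta"
    CLAUDE = "claude"


_PERMISSION_POLICIES = ("auto", "deny")


@dataclass(frozen=True)
class RunSelection:
    harness: str = "pi_core"
    sandbox: str = "local"
    permission_policy: str = "auto"  # assumed default

    @classmethod
    def from_params(cls, params: Optional[Mapping[str, Any]] = None) -> "RunSelection":
        params = params or {}
        # HarnessType(...) raises ValueError for values outside the enum.
        harness = HarnessType(params.get("harness") or "pi_core").value
        policy = params.get("permission_policy") or "auto"
        if policy not in _PERMISSION_POLICIES:
            raise ValueError(f"unknown permission_policy: {policy!r}")
        return cls(
            harness=harness,
            sandbox=params.get("sandbox") or "local",
            permission_policy=policy,
        )
```

Storing `.value` rather than the enum member keeps the selection as the bare string, matching the note that the stored/wire value does not carry the slug.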
@@ -23,7 +23,7 @@ The fields and the full schema follow.
| `model` | string (`grouped_choice`) | `"gpt-5.5"` | Model the agent runs on. A plain id (`"gpt-5.5"`) or a structured `{provider, connection}` ref. See [Model connection resolution](../in-service/model-connection-resolution.md). |
| `tools` | `ToolConfig[]` | `[]` | Runnable tools: `builtin`, `gateway`, `code`, or `client`. See [Tool models and resolution](../in-service/tool-models-and-resolution.md). |
| `mcp_servers` | `MCPServerConfig[]` | `[]` | Declared MCP servers; secret env resolved from the vault at run time. See [MCP models and resolution](../in-service/mcp-models-and-resolution.md). |
| `harness` | `"pi_core" \| "claude" \| "pi_agenta"` | `"pi_core"` | The coding agent to drive. `pi_core` and `pi_agenta` both drive the `pi` ACP agent; `pi_agenta` adds Agenta's forced skills, prompt, and policy. |
| `harness` | `"pi_core" \| "claude" \| "pi_agenta"` (see "Harness as a slug + display name" below) | `"pi_core"` | The coding agent to drive. `pi_core` and `pi_agenta` both drive the `pi` ACP agent; `pi_agenta` adds Agenta's forced skills, prompt, and policy. |
| `sandbox` | `"local" \| "daytona"` | `"local"` | Where it runs. |
| `permission_policy` | `"auto" \| "deny"` | `"auto"` | How a gating harness (Claude Code) handles tool-use prompts in a headless run. |
| `sandbox_permission` | `SandboxPermission \| null` | `null` (form pre-fills one) | The declared network and filesystem boundary. See [Sandbox permission](../in-service/sandbox-permission.md). |
@@ -33,6 +33,35 @@ Note that `harness`, `sandbox`, and `permission_policy` are the run selection. T
reads them from the same `parameters` object via `RunSelection.from_params(...)`, not just
from `AgentConfig`.

### Harness as a slug + display name

The `harness` field's JSON Schema carries both a flat `enum` of the bare values (kept for
backward compatibility with any consumer that reads `schema.enum`) and a `oneOf` of per-option
entries. Each entry pairs a versioned **slug** identity with a **display name**, and all entries
are built from one SDK source (`HARNESS_IDENTITIES` in `sdks/python/agenta/sdk/agents/dtos.py`).
The slug follows the repo's `agenta:<namespace>:<name>:v<N>` grammar (mirroring
`agenta:builtin:agent:v0`) with the namespace `harness`:

```jsonc
"harness": {
"type": "string",
"default": "pi_core",
"enum": ["pi_core", "pi_agenta", "claude"],
"oneOf": [
{ "const": "pi_core", "title": "Pi", "x-ag-harness-slug": "agenta:harness:pi_core:v0" },
{ "const": "pi_agenta", "title": "Pi (Agenta)", "x-ag-harness-slug": "agenta:harness:pi_agenta:v0" },
{ "const": "claude", "title": "Claude Code", "x-ag-harness-slug": "agenta:harness:claude:v0" }
]
}
```

The **stored/wire value stays the bare string** (`const`): the runner reads it as the runtime
selector and the frontend keys connection gating off it, so the `/run` wire is unchanged. The
playground `EnumSelectControl` reads the `oneOf` `title` for the dropdown label and writes the
bare `const` back. The slug is the harness contract's versioned identity in the interface only;
versioning the contract (`/run` `version`, the `/health` skew read) is deferred (see the
[contract-versioning project](../../projects/contract-versioning/README.md)).
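
A minimal sketch of how such a schema fragment could be derived from a single identity table is shown below. The real shape of `HARNESS_IDENTITIES` is not given here, so the mapping of bare value to display name, and the helpers `harness_slug`, `build_harness_schema`, and `display_name`, are hypothetical names used only for illustration.

```python
from typing import Any, Dict

# Assumed shape: bare harness value -> display name.
HARNESS_IDENTITIES: Dict[str, str] = {
    "pi_core": "Pi",
    "pi_agenta": "Pi (Agenta)",
    "claude": "Claude Code",
}


def harness_slug(value: str, version: int = 0) -> str:
    """Build the agenta:<namespace>:<name>:v<N> slug for a harness value."""
    return f"agenta:harness:{value}:v{version}"


def build_harness_schema(default: str = "pi_core") -> Dict[str, Any]:
    """Emit the flat enum and the per-option oneOf from one source."""
    return {
        "type": "string",
        "default": default,
        "enum": list(HARNESS_IDENTITIES),
        "oneOf": [
            {"const": value, "title": title, "x-ag-harness-slug": harness_slug(value)}
            for value, title in HARNESS_IDENTITIES.items()
        ],
    }


def display_name(schema: Dict[str, Any], value: str) -> str:
    """Look up the label for a bare value, as a dropdown control might."""
    for option in schema["oneOf"]:
        if option["const"] == value:
            return option["title"]
    return value  # fall back to the bare value for unknown options


schema = build_harness_schema()
```

Because both `enum` and `oneOf` come from the same table, adding a harness in one place keeps the two views in step, and a consumer that writes back `const` never sees the slug.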

## The default config

`/inspect` ships this as the value the form starts from. It is the canonical example of
@@ -45,6 +45,10 @@ the fields.
## Owned by

- `services/oss/src/agent/schemas.py`: builds the input, parameter, and output schemas.
- `services/oss/src/agent/app.py`: `create_agent_app()` binds both the live `_agent` handler and
the service interface to the builtin URI `agenta:builtin:agent:v0` (via `register_handler` /
`register_interface`), so `retrieve_handler` / `retrieve_interface` return the live handler and
the same schemas `/inspect` advertises. The handler and the interface share one identity.
- `sdks/python/agenta/sdk/models/workflows.py`: the inspect response model.
- `sdks/python/agenta/sdk/decorators/routing.py`: the generic inspect route.

@@ -53,6 +57,11 @@ the fields.
- **Catalog type markers.** `agent_config` and `messages` bind the schema to a playground
control. Renaming a marker without updating the catalog breaks the form silently.
- **The config default.** `/inspect` ships the default agent config the form starts from.
Keep it in sync with what the runtime actually accepts.
Keep it in sync with what the runtime actually accepts. The SDK builtin config registry entry
(`CONFIGURATION_REGISTRY` for `agent:v0`) uses the same `build_agent_v0_default()` builder, so a
URI-dispatched run with no parameters gets the same default.
- **Harness capability metadata.** The form filters connections from this block. If it drifts
from the server-side table, the form offers choices the run will reject.
- **The builtin URI binding.** The live handler and interface are registered under
`agenta:builtin:agent:v0` at app build time. The interface override is process-local (the agent
service process), so the API process's catalog still builds from the SDK defaults.