[docs] Add agent workflow interface inventory #4821

Merged
mmabrouk merged 5 commits into big-agents from docs/agent-workflow-interface-inventory
Jun 25, 2026

Conversation

@mmabrouk mmabrouk commented Jun 24, 2026

Member

Context

Reviewers of agent workflow changes currently have to jump between the protocol, architecture, runner, and adapter docs to work out which interface a change touches. We need a lightweight map that names those boundaries and points reviewers to the owning files.

Changes

Adds docs/design/agent-workflows/interfaces/ as a review-oriented inventory, organized by blast radius:

  • public edge interfaces for browser and workflow-client contracts
  • cross-service interfaces for service, sidecar, harness, tools, MCP, vault, and trace boundaries
  • in-service interfaces for SDK ports, DTOs, adapters, config, permissions, and model resolution

Each category has its own folder, with one file per interface or closely related interface group.

Scope / risk

Docs-only change. No code, no tests, no migrations. The only thing that could regress is a broken cross-reference, and those were checked during authoring.

Tests / notes

Verified the new links and scanned the new docs for banned docs-style terms and em dashes.

How to QA

Prerequisites: No stack needed. A browser or a Markdown reader is enough.

Steps:

  1. Open docs/design/agent-workflows/interfaces/README.md and verify the three category sections (Public edge, Cross-service, In-service) each link to a populated subfolder.
  2. Click through three or four of the linked files and confirm the interface name, owning file path, and description make sense for a reviewer who doesn't know the PR.
  3. Check that no file contains a broken relative link (GitHub renders these as 404 text in the preview).

Expected result: Every linked file opens and describes a real boundary in the codebase. No 404 links.

Automated tests: None. This is docs-only; there are no automated tests to run.

Edge cases: Confirm the agent-load-session.md file exists (it is referenced from public-edge/README.md); it will be deleted in the follow-on chore PR (#4828).

vercel Bot commented Jun 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agenta-documentation | Ready | Preview, Comment | Jun 24, 2026 8:19pm |

coderabbitai Bot commented Jun 24, 2026

Walkthrough

Adds an agent-workflows interface inventory, expands the documentation layout into public-edge, cross-service, and in-service contract pages, and updates existing design docs to reference the inventory plus revised /run, skills, resolution, permission, and adapter descriptions.

Changes

Agent workflow interface inventory

| Layer | File(s) | Summary |
|---|---|---|
| Inventory index and layout updates | docs/design/agent-workflows/README.md, docs/design/agent-workflows/documentation/ports-and-adapters.md, docs/design/agent-workflows/interfaces/README.md, docs/design/agent-workflows/interfaces/cross-service/README.md, docs/design/agent-workflows/interfaces/in-service/README.md, docs/design/agent-workflows/interfaces/public-edge/README.md | The agent-workflows overview pages now describe a five-layer layout and point to the new interface inventory sections. |
| Cross-service contract docs | docs/design/agent-workflows/documentation/protocol.md, docs/design/agent-workflows/interfaces/cross-service/runner-to-harness.md, docs/design/agent-workflows/interfaces/cross-service/runner-to-mcp-server.md, docs/design/agent-workflows/interfaces/cross-service/runner-to-tool-callback.md, docs/design/agent-workflows/interfaces/cross-service/service-and-runner-trace-export.md, docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md, docs/design/agent-workflows/interfaces/cross-service/service-to-vault-and-tool-providers.md | The /run protocol doc and cross-service interface pages define runner, MCP, tool callback, trace export, and vault/provider resolution contracts. |
| Public-edge API contracts | docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md, docs/design/agent-workflows/interfaces/public-edge/agent-load-session.md, docs/design/agent-workflows/interfaces/public-edge/agent-messages.md, docs/design/agent-workflows/interfaces/public-edge/workflow-inspect.md, docs/design/agent-workflows/interfaces/public-edge/workflow-invoke.md | The public-edge pages define the agent config schema, load-session, messages, inspect, and invoke request and response contracts. |
| Runtime DTOs and ports | docs/design/agent-workflows/interfaces/in-service/runtime-ports.md, docs/design/agent-workflows/interfaces/in-service/neutral-runtime-dtos.md, docs/design/agent-workflows/interfaces/in-service/runner-engine-internals.md | The runtime ports, neutral DTOs, and runner-engine internals define the shared in-process types, message normalization, and HTTP or CLI engine behavior. |
| Connection, tool, and MCP resolution | docs/design/agent-workflows/interfaces/in-service/model-connection-resolution.md, docs/design/agent-workflows/interfaces/in-service/mcp-models-and-resolution.md, docs/design/agent-workflows/interfaces/in-service/tool-models-and-resolution.md | The model connection, MCP, and tool resolution pages define how config inputs become resolved connections, MCP servers, and tool specs. |
| Permission and sandbox policy | docs/design/agent-workflows/interfaces/in-service/permission-responder.md, docs/design/agent-workflows/interfaces/in-service/sandbox-permission.md | The permission responder and sandbox permission pages define approval precedence, cross-turn matching, and network or filesystem enforcement rules. |
| Service handler and adapters | docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md, docs/design/agent-workflows/interfaces/in-service/backend-adapter.md, docs/design/agent-workflows/interfaces/in-service/browser-protocol-adapter.md, docs/design/agent-workflows/interfaces/in-service/harness-adapters.md | The service handler and adapter pages define the agent service flow and the backend, browser, and harness adapters used to execute it. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

  • Agenta-AI/agenta#4779: Updates the same agent-workflows /run protocol documentation and field-level contract narrative.
  • Agenta-AI/agenta#4814: Implements the inline skill package shape referenced by the updated skills documentation.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly matches the main change: adding an agent workflow interface inventory in docs. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The description matches the docs-only interface inventory changes and the new public-edge, cross-service, and in-service folders. |

Comment thread docs/design/agent-workflows/projects/qa/scripts/run_matrix.py Dismissed
Comment thread docs/design/agent-workflows/projects/qa/scripts/run_matrix.py Dismissed
| `tools` | `ToolConfig[]` | `[]` | Runnable tools: `builtin`, `gateway`, `code`, or `client`. See [Tool models and resolution](../in-service/tool-models-and-resolution.md). |
| `mcp_servers` | `MCPServerConfig[]` | `[]` | Declared MCP servers; secret env resolved from the vault at run time. See [MCP models and resolution](../in-service/mcp-models-and-resolution.md). |
| `harness` | `"pi" \| "claude" \| "agenta"` | `"pi"` | The coding agent to drive. |
| `sandbox` | `"local" \| "daytona"` | `"local"` | Where it runs. |

Member Author

Instead, we should have an optional uri that points to the sidecar and provides its address (the sandbox should probably use this uri to determine where to route the request). If the uri is not set, we use the environment variables.

Member Author

Noted as a design follow-up (added to docs/design/agent-workflows/scratch/implementation-queue.md), not yet scheduled. The idea: an optional uri in the agent config that points at the sidecar and provides its address — the sandbox uses it to route the request, and we fall back to the env vars when it's unset. Routes near the sidecar-deployment / sidecar-trust work. Not editing the schema page now to avoid colliding with the running contract-versioning work on this surface.

Member Author

Implemented: uri replaces sandbox (allowlist-gated routing) — PR #4836.
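
The routing rule discussed in this thread (an optional `uri` in the agent config, with a fallback to environment variables when it is unset) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from PR #4836: the function name `resolve_sidecar_address`, the environment variable `AGENT_SIDECAR_URL`, and the host-allowlist shape are all hypothetical.

```python
import os
from typing import Mapping, Optional, Sequence
from urllib.parse import urlparse

# Hypothetical environment variable name; the real one is not given in this thread.
SIDECAR_ENV_VAR = "AGENT_SIDECAR_URL"


def resolve_sidecar_address(
    config: Mapping[str, object],
    allowed_hosts: Sequence[str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the sidecar address: config `uri` first, environment as fallback."""
    env = os.environ if env is None else env
    uri = config.get("uri")
    if uri is None:
        # No uri in the config: fall back to the environment variable.
        return env[SIDECAR_ENV_VAR]
    host = urlparse(str(uri)).hostname
    if host not in allowed_hosts:
        # Allowlist gate (assumed shape): refuse to route to unknown hosts.
        raise ValueError(f"sidecar host {host!r} is not in the allowlist")
    return str(uri)
```

Keeping the fallback inside one function means a config without `uri` behaves exactly as it did before the field existed.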

| `mcp_servers` | `MCPServerConfig[]` | `[]` | Declared MCP servers; secret env resolved from the vault at run time. See [MCP models and resolution](../in-service/mcp-models-and-resolution.md). |
| `harness` | `"pi" \| "claude" \| "agenta"` | `"pi"` | The coding agent to drive. |
| `sandbox` | `"local" \| "daytona"` | `"local"` | Where it runs. |
| `permission_policy` | `"auto" \| "deny"` | `"auto"` | How a gating harness (Claude Code) handles tool-use prompts in a headless run. |

Member Author

This is wrong. We were supposed to have a general harness_kwargs where the user can set {"claude": {whatever we want to add to claude}} if this is harness-specific.
On the other hand, we had permissions that determined how the sidecar dealt with certain actions; permission_policy was supposed to be that. harness_kwargs is where the user can set permissions or anything else for the different harnesses, passed by key.

Member Author

Routed, and thanks for the correction. The intent: a general per-harness harness_kwargs bag ({"claude": {…}}) where the author sets per-harness options including permissions, while permission_policy is the sidecar action-permission (how the sidecar handles a tool action), not a harness option. That's a design direction overlapping the in-flight contract-versioning / capability-config surface, so it's routed there; the inventory's current permission_policy/Claude-options wording will be corrected once that settles. Not editing the agent-config-schema page now (running work owns it).

Member Author

Implemented: harness_kwargs per-harness bag (was harness_options); permission_policy stays the sidecar action-permission, on AgentConfig — PR #4840.
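
To make the split in this thread concrete, here is a small sketch of a config that keeps `permission_policy` (the sidecar action-permission) separate from a per-harness `harness_kwargs` bag keyed by harness name. The class name `AgentConfigSketch` and the helper `kwargs_for_active_harness` are illustrative assumptions; the real `AgentConfig` from PR #4840 may differ in shape and validation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AgentConfigSketch:
    """Illustrative subset of the agent config; not the real AgentConfig."""

    harness: str = "pi"
    # Sidecar action-permission: how the sidecar handles a tool action.
    permission_policy: str = "auto"
    # Per-harness option bag, keyed by harness name, e.g. {"claude": {...}}.
    harness_kwargs: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def kwargs_for_active_harness(self) -> Dict[str, Any]:
        """Return a copy of the options addressed to the selected harness."""
        return dict(self.harness_kwargs.get(self.harness, {}))
```

Options written for a harness that is not selected are simply ignored in this sketch, so one config can carry settings for several harnesses at once.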

| Field | Type | Default | Meaning |
|---|---|---|---|
| `agents_md` | string (textarea) | hello-world prompt | The agent's system prompt, its AGENTS.md. |
| `model` | string (`grouped_choice`) | `"gpt-5.5"` | Model the agent runs on. A plain id (`"gpt-5.5"`) or a structured `{provider, connection}` ref. See [Model connection resolution](../in-service/model-connection-resolution.md). |

Member Author

This is wrong: we never provide a string here, it is always structured. We should use the ModelRef, which would include the provider, model, connection, etc.

Member Author

Routed. model as a structured ModelRef (provider, model, connection, …) is exactly the direction of the in-flight provider-model-auth / agent-model-picker work (ModelRef already landed via PR #4815; the picker is docs/design/agent-workflows/projects/agent-model-picker/). The inventory still describes model as a bare string for back-compat coercion; that wording will be corrected once that surface settles. Not editing the agent-config-schema page now to avoid colliding with the running contract-versioning work on the same surface.

Member Author

Implemented: model is always a structured ModelRef now (free-text path dropped) — PR #4839.
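
A structured-only `model` field, as described in this thread, might look roughly like the sketch below. The names `ModelRefSketch` and `parse_model` are hypothetical, the field list is limited to the three fields named in the thread (provider, model, connection), and treating `connection` as optional is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ModelRefSketch:
    """Illustrative stand-in for ModelRef; field names follow the thread."""

    provider: str
    model: str
    connection: Optional[str] = None  # assumed optional in this sketch


def parse_model(raw: Any) -> ModelRefSketch:
    """Accept only the structured form; a bare string id is rejected."""
    if not isinstance(raw, Mapping):
        raise TypeError("model must be a structured ref, not a plain string")
    return ModelRefSketch(
        provider=raw["provider"],
        model=raw["model"],
        connection=raw.get("connection"),
    )
```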

|---|---|---|---|
| `agents_md` | string (textarea) | hello-world prompt | The agent's system prompt, its AGENTS.md. |
| `model` | string (`grouped_choice`) | `"gpt-5.5"` | Model the agent runs on. A plain id (`"gpt-5.5"`) or a structured `{provider, connection}` ref. See [Model connection resolution](../in-service/model-connection-resolution.md). |
| `tools` | `ToolConfig[]` | `[]` | Runnable tools: `builtin`, `gateway`, `code`, or `client`. See [Tool models and resolution](../in-service/tool-models-and-resolution.md). |

Member Author

We should also allow an EmbedRef here, like skills. That would allow creating tools as workflows and embedding them.

Member Author

Noted as a design follow-up (added to the implementation queue), not yet scheduled. The idea: allow an EmbedRef in tools (the way skills already accepts SkillConfig | EmbedRef), so a tool can be authored as a workflow and embedded/inlined by the backend before the runner sees it. Overlaps the tool-definition + skills-embed surface. Parking it in the queue for triage rather than editing the inventory now.

Member Author

Designed: embedref tools-as-workflows — PR #4837 (runnable→callback, non-runnable→client; simplified).

mmabrouk added 5 commits June 24, 2026 21:29
Rewrite sandbox-permission.md with the code-verified enforcement matrix:
network egress IS enforced on Daytona (provider.ts daytonaNetworkFields ->
networkBlockAll/networkAllowList), is NOT enforceable on local (run-plan.ts
rejects restricted+strict), and the filesystem boundary is enforced nowhere;
code/gateway tools and the stdio MCP bridge run on the runner host so they
bypass the sandbox even on Daytona. Drops the stale 'declared, not enforced'
framing.

Align runtime-ports.md and harness-adapters.md to the strong-page template
(cross-link, tests-that-move-with-it). Correct the Claude skills claim: Claude
carries inline skills on the wire and the runner materializes them under
.claude/skills (the old wire_skills()->{} override is gone, pinned by
test_claude_carries_skills_for_project_local_materialization).

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

…apters.md, fix two doc inaccuracies

The inventory re-documented material already in documentation/protocol.md and
documentation/ports-and-adapters.md with no links. Add reciprocal cross-links
framed by ownership: the inventory owns the 'what crosses / what can break /
what to check' review lens; the documentation/ docs own the field-by-field
narrative. Links added on service-to-agent-runner, agent-messages,
workflow-invoke, browser-protocol-adapter, backend-adapter, neutral-runtime-dtos
(and protocol.md / ports-and-adapters.md back).

Fix two clear, code-verified inaccuracies:
- protocol.md said skills are 'bundled directory names to force-load (agenta,
  Pi only)'. They are resolved inline packages (full SKILL.md content, @ag.embed
  inlined server-side) declared in the agent config; all three harnesses wire
  them (Claude inherits the base wire_skills; the runner materializes them under
  .claude/skills). Verified against dtos.py and harness adapter tests.
- ports-and-adapters.md cited a stale line number (app.py:49 is now an import,
  not the SandboxAgentBackend build); drop the brittle :49 suffix.
- Replace the stale 'declared, not enforced' sandbox-permission framing in
  agent-config-schema.md and service-to-agent-runner.md with the real matrix
  (network enforced on Daytona) plus a link to the matrix.

The tool gateway, Composio, connections, and MCP are described as EXISTING and
UNCHANGED by this doc PR.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

One row per interface with Interface | Blast radius | Owner file(s) | Status |
Tests, so a reviewer can find the boundary a diff touches before opening a page.
Status is derived from each page's prose (stable / evolving / declared-not-wired).
The /load-session shell endpoint is intentionally omitted because a sibling
change is removing it.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
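
The enforcement matrix summarized in the first commit message (network egress enforced on Daytona, not enforceable on local, filesystem boundary enforced nowhere) can be written down as a small lookup plus a plan check. This is a sketch for reviewers, not the logic in `provider.ts` or `run-plan.ts`: the table name `ENFORCEMENT`, the function `check_plan`, and the `"open"`/`"restricted"` and `strict` parameter names are assumptions.

```python
from typing import Dict, Tuple

# (sandbox, boundary) -> whether the boundary is actually enforced,
# as stated in the commit message.
ENFORCEMENT: Dict[Tuple[str, str], bool] = {
    ("daytona", "network"): True,
    ("local", "network"): False,
    ("daytona", "filesystem"): False,
    ("local", "filesystem"): False,
}


def check_plan(sandbox: str, network: str, strict: bool) -> None:
    """Reject a plan that asks for a restriction the sandbox cannot enforce.

    `network` is "open" or "restricted"; `strict` means the caller requires
    real enforcement. Both names are illustrative assumptions.
    """
    wants_restriction = network == "restricted"
    if wants_restriction and strict and not ENFORCEMENT[(sandbox, "network")]:
        raise ValueError(f"{sandbox} sandbox cannot enforce restricted network egress")
```

Note that the sketch says nothing about code/gateway tools or the stdio MCP bridge, which the commit message says run on the runner host and bypass the sandbox even on Daytona.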

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (2)
docs/design/agent-workflows/interfaces/in-service/model-connection-resolution.md (1)

40-42: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Link the capability matrix instead of repeating counts.

This paragraph is likely to drift as harness support changes. Point readers to the canonical capability table rather than hard-coding provider totals here.

♻️ Proposed wording tweak
- Each harness publishes what it can reach: Pi reaches eight direct providers; Claude reaches Anthropic across direct, custom, bedrock, and vertex. Unnamed default connections degrade tolerantly to an empty env rather than failing the run.
+ Each harness publishes its reachable providers in the capability matrix; link that matrix here instead of repeating the counts. Unnamed default connections degrade tolerantly to an empty env rather than failing the run.
docs/design/agent-workflows/interfaces/README.md (1)

47-47: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Mark the agent-config row as evolving.

The PR notes say agent-config-schema.md is still absorbing contract-versioning follow-ups, so stable overstates how settled this surface is. evolving would better match the inventory's own status definitions.


ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 9cbcbfd and 90930b9.

📒 Files selected for processing (30)
  • docs/design/agent-workflows/README.md
  • docs/design/agent-workflows/documentation/ports-and-adapters.md
  • docs/design/agent-workflows/documentation/protocol.md
  • docs/design/agent-workflows/interfaces/README.md
  • docs/design/agent-workflows/interfaces/cross-service/README.md
  • docs/design/agent-workflows/interfaces/cross-service/runner-to-harness.md
  • docs/design/agent-workflows/interfaces/cross-service/runner-to-mcp-server.md
  • docs/design/agent-workflows/interfaces/cross-service/runner-to-tool-callback.md
  • docs/design/agent-workflows/interfaces/cross-service/service-and-runner-trace-export.md
  • docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md
  • docs/design/agent-workflows/interfaces/cross-service/service-to-vault-and-tool-providers.md
  • docs/design/agent-workflows/interfaces/in-service/README.md
  • docs/design/agent-workflows/interfaces/in-service/agent-service-handler.md
  • docs/design/agent-workflows/interfaces/in-service/backend-adapter.md
  • docs/design/agent-workflows/interfaces/in-service/browser-protocol-adapter.md
  • docs/design/agent-workflows/interfaces/in-service/harness-adapters.md
  • docs/design/agent-workflows/interfaces/in-service/mcp-models-and-resolution.md
  • docs/design/agent-workflows/interfaces/in-service/model-connection-resolution.md
  • docs/design/agent-workflows/interfaces/in-service/neutral-runtime-dtos.md
  • docs/design/agent-workflows/interfaces/in-service/permission-responder.md
  • docs/design/agent-workflows/interfaces/in-service/runner-engine-internals.md
  • docs/design/agent-workflows/interfaces/in-service/runtime-ports.md
  • docs/design/agent-workflows/interfaces/in-service/sandbox-permission.md
  • docs/design/agent-workflows/interfaces/in-service/tool-models-and-resolution.md
  • docs/design/agent-workflows/interfaces/public-edge/README.md
  • docs/design/agent-workflows/interfaces/public-edge/agent-config-schema.md
  • docs/design/agent-workflows/interfaces/public-edge/agent-load-session.md
  • docs/design/agent-workflows/interfaces/public-edge/agent-messages.md
  • docs/design/agent-workflows/interfaces/public-edge/workflow-inspect.md
  • docs/design/agent-workflows/interfaces/public-edge/workflow-invoke.md

Comment on lines +14 to +20
"data": {
"id": "call_zEoV...", // the LLM tool_call_id, echoed back for correlation
"type": "function",
"function": {
"name": "tools.composio.github.create_issue.my_conn", // resolved tool slug
"arguments": { "title": "..." } // sent as an object, not a JSON string
}

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Don’t narrow the callback arguments contract.

ToolCallFunction.arguments can still be a JSON string at the boundary, so the example should not claim object-only input.

Suggested wording
-      "arguments": { "title": "..." }      // sent as an object, not a JSON string
+      "arguments": { "title": "..." }      // JSON string or object; router normalizes it
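
The "JSON string or object" contract in this comment implies a normalization step on the receiving side. A minimal sketch of such a step is below; the function name `normalize_arguments` is hypothetical, and treating a missing or empty value as an empty object is an assumption rather than documented router behavior.

```python
import json
from typing import Any, Dict, Union


def normalize_arguments(arguments: Union[str, Dict[str, Any], None]) -> Dict[str, Any]:
    """Return tool-call arguments as a dict, whichever shape arrived."""
    if arguments is None or arguments == "":
        # Assumption: no arguments is treated as an empty object.
        return {}
    if isinstance(arguments, str):
        parsed = json.loads(arguments)
        if not isinstance(parsed, dict):
            raise ValueError("tool arguments must decode to a JSON object")
        return parsed
    # Already an object: copy so callers cannot mutate the original.
    return dict(arguments)
```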

Comment on lines +68 to +70
// tracing (see service-and-runner-trace-export.md)
"trace": { "traceparent": "...", "endpoint": "...", "authorization": "...", "captureContent": true }
}

🗄️ Data Integrity & Integration | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

git ls-files 'docs/design/agent-workflows/interfaces/cross-service/*' | sed 's#^`#FILE` #'
printf '\n--- baggage search ---\n'
rg -n --no-heading '\bbaggage\b|traceparent|captureContent|authorization' docs/design/agent-workflows/interfaces/cross-service
printf '\n--- trace-related file contents (selected) ---\n'
for f in \
  docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md \
  docs/design/agent-workflows/interfaces/cross-service/service-and-runner-trace-export.md
do
  echo "FILE: $f"
  wc -l "$f"
done

Repository: Agenta-AI/agenta

Length of output: 2976


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '48,78p' docs/design/agent-workflows/interfaces/cross-service/service-to-agent-runner.md
printf '\n---\n'
sed -n '1,28p' docs/design/agent-workflows/interfaces/cross-service/service-and-runner-trace-export.md
printf '\n---\n'
sed -n '28,60p' docs/design/agent-workflows/interfaces/cross-service/service-and-runner-trace-export.md

Repository: Agenta-AI/agenta

Length of output: 4460


Keep the /run trace block aligned. service-and-runner-trace-export.md includes baggage in the request contract, but this schema omits it. Add it here or remove it from the other doc so the wire shape stays consistent.

Comment on lines +42 to +45
A `Message` is `{role, content}`, where `content` is a plain string or a list of
`ContentBlock`. A bare string normalizes to one `text` block; a list of all-text blocks
collapses back to a string. The Python model is snake_case, the wire (`to_wire`) is
camelCase, and `from_raw` accepts either, so inbound coercion is forgiving.

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Don’t say all-text lists collapse back to a string.

dtos.py preserves list-shaped content; to_wire() emits a list and from_raw() only coerces each element. If that normalization is intended, the implementation needs to change too; otherwise trim this sentence.

♻️ Proposed doc fix
- A bare string normalizes to one text block; a list of all-text blocks collapses back to a string.
+ A bare string normalizes to one text block; a list stays a list of `ContentBlock`s.
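
The behavior the reviewer describes (a bare string becomes one `text` block, a list stays a list with each element coerced) can be sketched as below. This is not the code in `dtos.py`: the function name `normalize_content` and the `{"type": "text", "text": ...}` block shape are assumptions made for illustration.

```python
from typing import Any, Dict, List, Union

ContentBlock = Dict[str, Any]


def normalize_content(content: Union[str, List[Any]]) -> List[ContentBlock]:
    """Coerce message content to a list of blocks; lists stay lists."""
    if isinstance(content, str):
        # A bare string becomes exactly one text block.
        return [{"type": "text", "text": content}]
    blocks: List[ContentBlock] = []
    for item in content:
        # Coerce each element on its own; never collapse the list to a string.
        if isinstance(item, str):
            blocks.append({"type": "text", "text": item})
        else:
            blocks.append(dict(item))
    return blocks
```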

Comment on lines +25 to +28
**Decision extraction.** `extractApprovalDecisions(request)` scans the message history for
`tool_result` blocks whose output is `{approved: boolean}`, and indexes each by both its
`toolCallId` and its `toolName`. The tool name is the fallback because a cold replay can mint
a fresh permission id each turn, so the stable anchor is the name.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Clarify the cross-turn fallback order.

The implementation prefers toolCallId first and only falls back to toolName when the id is missing, so calling the name “the stable anchor” is backwards and may mislead future changes.

🛠️ Proposed wording fix
- The tool name is the fallback because a cold replay can mint a fresh permission id each turn, so the stable anchor is the name.
+ `toolCallId` is the primary match; `toolName` is only the fallback when the id is missing or rewritten on replay.
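
The match order described in this comment (id first, name only as a fallback) is easy to get backwards, so here is a rough sketch of it. The real `extractApprovalDecisions` lives in the runner; this version is a simplified stand-in with assumed names (`extract_approval_decisions`, `lookup_decision`) and an assumed block shape.

```python
from typing import Any, Dict, List, Optional

Decisions = Dict[str, Dict[str, bool]]


def extract_approval_decisions(messages: List[Dict[str, Any]]) -> Decisions:
    """Index {approved: bool} tool results by call id and by tool name."""
    by_id: Dict[str, bool] = {}
    by_name: Dict[str, bool] = {}
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no tool results
        for block in content:
            if not isinstance(block, dict) or block.get("type") != "tool_result":
                continue
            output = block.get("output")
            if not isinstance(output, dict) or not isinstance(output.get("approved"), bool):
                continue
            if block.get("toolCallId"):
                by_id[block["toolCallId"]] = output["approved"]
            if block.get("toolName"):
                by_name[block["toolName"]] = output["approved"]
    return {"by_id": by_id, "by_name": by_name}


def lookup_decision(
    decisions: Decisions, tool_call_id: Optional[str], tool_name: str
) -> Optional[bool]:
    """Prefer the id match; use the tool name only when the id has no match."""
    if tool_call_id and tool_call_id in decisions["by_id"]:
        return decisions["by_id"][tool_call_id]
    return decisions["by_name"].get(tool_name)
```

In this sketch a replayed turn that minted a fresh id still finds the earlier decision through the name index, while a turn with a matching id never consults the name.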

Comment on lines +33 to +36
The route negotiates transport from the `Accept` header. `text/event-stream` returns a
Vercel UI Message Stream framed as SSE. Anything else returns a `WorkflowBatchResponse`
with one assistant message in `data.outputs`. The browser always sends
`Accept: text/event-stream`.

🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the browser/chat clients always send Accept: text/event-stream.
rg -nP 'Accept.*text/event-stream|text/event-stream|application/x-ndjson|application/jsonl' \
  web/packages/agenta-playground/src/state/execution/agentRequest.ts \
  web/oss/src/components/AgentChatSlice/AgentChatPanel.tsx \
  sdks/python/agenta/sdk/decorators/routing.py -C 3

Repository: Agenta-AI/agenta

Length of output: 7088


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== agentRequest.ts context =="
sed -n '220,250p' web/packages/agenta-playground/src/state/execution/agentRequest.ts

echo
echo "== executionHeadersAtom definition and uses =="
rg -n "executionHeadersAtom|headersFactory|Accept:" web/packages/agenta-playground/src -C 2

Repository: Agenta-AI/agenta

Length of output: 7219


Default the Accept header before merging overrides

`web/packages/agenta-playground/src/state/execution/agentRequest.ts` sets `Accept: "text/event-stream"` by default, but `headersFactory` is spread afterward and can override it. Either move the default after the merge or soften the docs from “always sends” to “sends by default.”
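The two orderings differ only in where the default sits relative to the spread. The sketch below is not the code in `agentRequest.ts`; `HeaderMap`, `buildHeadersOverridable`, and `buildHeadersPinned` are hypothetical names used to show the effect of object spread order.

```typescript
type HeaderMap = Record<string, string>;

// Default first, overrides second: a caller-supplied Accept replaces the default.
function buildHeadersOverridable(overrides: HeaderMap): HeaderMap {
  return { Accept: "text/event-stream", ...overrides };
}

// Overrides first, default second: Accept is always text/event-stream.
function buildHeadersPinned(overrides: HeaderMap): HeaderMap {
  return { ...overrides, Accept: "text/event-stream" };
}
```

Object keys are case-sensitive, so a pinned `Accept` would not replace an override spelled `accept`; a real fix would probably need to normalize header names before merging.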

"data": {
"revision": {
"data": {
"uri": "agenta:builtin:agent:v0",

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Don't pin the builtin URI yet.

`services/oss/src/agent/app.py` still says the handler is registered directly and gets an auto `user:custom:...` URI today; `agenta:builtin:agent:v0` is still the target state. Showing the builtin value here makes the inspect contract look live before that binding lands.

Comment on lines +17 to +19
- the turn history from `data.inputs.messages` (or `data.messages`),
- the agent config and run selection from `data.parameters`,
- the trace and reference context from the envelope itself.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Resolve the message-history source ordering.

This page says history comes from `data.inputs.messages` here, but later says `data.messages` is primary and `inputs.messages` is the fallback. That leaves the `/invoke` contract ambiguous, and it also conflicts with `docs/design/agent-workflows/documentation/protocol.md`. Pick one ordering and keep both docs aligned.

Also applies to: 38-40
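One way to remove the ambiguity is to state the precedence as data and resolve against it in a single place. The sketch below is illustrative only: `InvokeData`, `HistorySource`, and `resolveHistory` are hypothetical names, and the order passed in the usage line is an assumption, not the decided contract.

```typescript
// Hypothetical view of the /invoke payload's data field.
interface InvokeData {
  messages?: unknown[];
  inputs?: { messages?: unknown[] };
}

type HistorySource = "data.messages" | "data.inputs.messages";

interface ResolvedHistory {
  source: HistorySource;
  messages: unknown[];
}

// Walk the sources in the given order and return the first one that is an array.
// An empty array counts as present, so it stops the search.
function resolveHistory(
  data: InvokeData,
  order: readonly HistorySource[],
): ResolvedHistory | undefined {
  for (const source of order) {
    const candidate =
      source === "data.messages" ? data.messages : data.inputs?.messages;
    if (Array.isArray(candidate)) return { source, messages: candidate };
  }
  return undefined;
}

// Assumed order for illustration; whichever order is chosen should be stated once.
const HISTORY_ORDER: readonly HistorySource[] = ["data.messages", "data.inputs.messages"];
```

Whichever order is picked, both docs can then cite the same constant instead of restating the rule in prose.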

mmabrouk added a commit that referenced this pull request Jun 24, 2026
Spin off a design workspace for remote/HTTP MCP support, requested in PR #4821
review (comments 3470094826 + 3469961290). The SDK already models
transport:http + url + named secrets and serializes them to the /run wire; the
deferral is a single skip in the runner's toAcpMcpServers. Documents the
secret-to-header injection path (reusing the existing named-secret resolution,
no new vault route), why HTTP is plausibly simpler than stdio, and how it fits
the MCP resolver and wire.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
mmabrouk added a commit that referenced this pull request Jun 24, 2026
Design workspace for an optional `uri` on the agent run config that names the
sidecar (agent runner) address. When set, the service routes `/run` there; when
unset, it falls back to AGENTA_AGENT_RUNNER_URL / the local runner. Spun out of
PR #4821 review comment 3469613625. Design only, no code.

Key decisions: `uri` is a RunSelection field (where a run goes, like `sandbox`),
not the neutral AgentConfig, and not a `/run` wire field (consumed service-side
in select_backend; golden fixtures untouched). A caller-supplied address is
gated by a server-side allowlist, default-off, because the service ships
resolved secrets + bearer tokens to whatever URL it picks.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
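The routing rule in this commit message can be sketched as follows. This is not the service's `select_backend`; `selectRunnerUrl`, its parameters, the exact-match allowlist, and the choice to throw on a rejected address are all assumptions made for illustration.

```typescript
// Hypothetical helper: pick the runner address for a /run call.
// requestedUri: the optional `uri` from the run config.
// defaultUrl:   the fallback (for example the value of AGENTA_AGENT_RUNNER_URL).
// allowlist:    server-side list of permitted addresses; empty means default-off.
function selectRunnerUrl(
  requestedUri: string | undefined,
  defaultUrl: string,
  allowlist: readonly string[] = [],
): string {
  // No caller-supplied address: use the configured or local runner.
  if (requestedUri === undefined) return defaultUrl;
  // A caller-supplied address must be explicitly allowed, because the service
  // would ship resolved secrets and bearer tokens to it.
  if (!allowlist.includes(requestedUri)) {
    throw new Error(`runner uri not allowed: ${requestedUri}`);
  }
  return requestedUri;
}
```

With an empty allowlist every caller-supplied `uri` is rejected, which matches the default-off behavior described above.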
mmabrouk added a commit that referenced this pull request Jun 25, 2026
…ess_kwargs

Move harness/sandbox/permission_policy from the separate RunSelection DTO into
AgentConfig (one agent definition, under data.parameters.agent). Retire RunSelection.
Rename the per-harness options bag harness_options -> harness_kwargs. The /run wire is
unchanged (golden fixtures untouched). PR #4821 review comments 2/7/8.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
@mmabrouk mmabrouk marked this pull request as ready for review June 25, 2026 10:53
@mmabrouk mmabrouk merged commit 80460f1 into big-agents Jun 25, 2026
6 checks passed
@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. documentation Improvements or additions to documentation labels Jun 25, 2026
mmabrouk added a commit that referenced this pull request Jun 26, 2026
Add a plan-feature workspace under docs/design/agent-workflows/projects/embedref-tools/
for allowing an @ag.embed reference in the agent config tools field, the way skills
already supports it. Researches the generic embed resolver, the tool taxonomy, and the
_agenta.* platform catalog; proposes a callback-executor workflow tool variant plus an
embed-as-content stepping stone. Design only, spun from PR #4821 review comment 3469653315.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
