diff --git a/docs/design/agent-workflows/projects/embedref-tools/README.md b/docs/design/agent-workflows/projects/embedref-tools/README.md new file mode 100644 index 0000000000..e53e6d33ec --- /dev/null +++ b/docs/design/agent-workflows/projects/embedref-tools/README.md @@ -0,0 +1,62 @@ +# EmbedRef tools (tools-as-workflows) + +Index for the design workspace that lets the agent config `tools` field point at a **workflow**, +the same way `skills` already does. A tool is just a workflow — any workflow (agent, completion, +channel, chain) can be used as a tool. + +The author picks one of **two syntaxes**, and the syntax decides the behavior: + +- **`@ag.reference`** (new) — keep the reference in the config; the workflow stays a *reference* + because you want to **call** it. At tool-resolution time it becomes a server-side `callback` + call spec (the service runs the workflow revision, like a gateway tool). +- **`@ag.embed`** (existing) — resolve the reference **to its value** and inline it. For a tool + this inlines a concrete `client` tool config; at tool-resolution time it becomes a `client` + spec (fulfilled in the browser). + +The generic resolver does not know about tools. It only knows the two syntaxes (inline-the-value +vs leave-the-reference). The tool-specific logic — turn a kept reference into a callback spec, +turn an embedded value into a client spec — lives in `resolve_tools`. + +Spun out of PR #4821 review comment +[3469653315](https://github.com/Agenta-AI/agenta/pull/4821#discussion_r3469653315) on +`interfaces/public-edge/agent-config-schema.md`: *"we should also allow here embedref like +skills. these would allow creating tools as workflows and embedding them."* + +This is a **design-only** workspace. No code is changed by this PR. It is POC / +pre-production: no back-compat is required. + +## Files + +- [context.md](context.md) — why this exists, goals, non-goals, the reviewer's ask, and the + two syntaxes (embed vs reference) and what each does. 
+- [research.md](research.md) — how `skills` embedding works today (the generic `@ag.embed` + resolver, the `ResolverMiddleware`), the tool taxonomy (type/executor model), and the exact + seams to mirror, with file paths. The load-bearing finding: the resolver is already generic; + it walks `tools[]` and handles embeds, and a second `@ag.reference` syntax stays just as + generic (leave-the-reference). +- [plan.md](plan.md) — the design: the two-syntax model, the schema arms, the `resolve_tools` + branch (kept reference → callback spec / embedded value → client spec), the server-side + execute endpoint, the wire, tests, and rollout. Explicitly drops the old Option A/B split, the + `workflow` tool variant, and platform-tools-as-workflows. +- [status.md](status.md) — current state, the settled design, and the remaining open + questions. + +## One-paragraph answer to the reviewer + +Yes, and the referencing half reuses the generic resolver. There are **two syntaxes** an author +can put inside `tools[i]`: `@ag.embed` (existing — the resolver inlines the referenced value) +and `@ag.reference` (new — the resolver leaves the reference in place). The `ResolverMiddleware` +and the API resolver stay generic: they only know "inline this value" vs "leave this reference," +nothing about tools. The tool-specific logic lives in **`resolve_tools`**, which runs after the +config is parsed: a **kept `@ag.reference`** becomes a `callback` call spec (a `CallbackToolSpec` +whose `call_ref` encodes the workflow identity; the service runs the referenced workflow revision +server-side, like a gateway tool — no new runner `kind`); an **`@ag.embed`** value that resolved +to a concrete `client` tool config becomes a `client` spec (fulfilled in the browser). The +author's syntax choice (reference vs embed) maps to runnable-vs-not: you *reference* a runnable +workflow because you want to call it; you *embed* a non-runnable (client) tool because it is a +value. 
The real new code is: (1) two embed/reference arms on the strict `AgentConfigSchema.tools` +(mirroring `_SkillEmbedRefSchema`); (2) a `resolve_tools` branch that builds a callback spec from +a kept reference; (3) a server-side execute endpoint that invokes a referenced workflow revision. +There is no `workflow` tool variant (a tool is just a workflow; any type qualifies), and platform +tools stay in the existing tools endpoints, not the workflow catalog. See +[the design](status.md#design) for the details. diff --git a/docs/design/agent-workflows/projects/embedref-tools/context.md b/docs/design/agent-workflows/projects/embedref-tools/context.md new file mode 100644 index 0000000000..a646190e96 --- /dev/null +++ b/docs/design/agent-workflows/projects/embedref-tools/context.md @@ -0,0 +1,97 @@ +# Context + +## Why this exists + +The agent config has two list fields that an author commits: `tools` and `skills`. They are +not symmetric today. + +- **`skills`** accepts `(SkillConfig | EmbedRef)[]`. An author can write a skill inline as a + `SkillConfig`, OR drop an `@ag.embed` reference to a workflow and the backend inlines that + workflow's content into a concrete `SkillConfig` before the runner sees it. The default + config ships exactly such an embed (the `_agenta.agenta-getting-started` platform skill). + A skill is always passive content, so embedding (inline the value) is the only mode it needs. +- **`tools`** accepts only the four concrete variants `ToolConfig = builtin | gateway | code + | client`. There is no embed/reference arm, so a tool cannot be authored as a workflow and + reused by pointing at it. + +PR #4821 review comment +[3469653315](https://github.com/Agenta-AI/agenta/pull/4821#discussion_r3469653315) asks to +close that gap: + +> we should also allow here embedref like skills. these would allow creating tools as +> workflows and embedding them. 
+ +This unlocks **tools-as-workflows**: a tool is just a workflow (with its own versioning, +history, and editing surface), referenced from any agent config. The agent author does not +re-declare the tool's body; they point at it. **Any** workflow qualifies — agent, completion, +channel, chain — there is no special "tool workflow" type. + +## Goals + +- Make `tools` accept a workflow via the **same two syntaxes** skills can use plus one more: + `@ag.embed` (inline the value) and a new `@ag.reference` (keep the reference). Mirror the + `skills` schema shape for the embed arm and add a reference arm. +- Define the model: **the author's syntax decides the behavior.** `@ag.reference` → a kept + reference → a server-side `callback` call spec (the service runs the referenced workflow + revision, like gateway). `@ag.embed` → an inlined value → a `client` spec. +- Keep the **generic resolver tool-agnostic.** It only does "inline the value" (embed) vs + "leave the reference" (reference). It learns nothing about tools. +- Put the **tool-specific logic in `resolve_tools`**: a kept reference becomes a callback spec, + an embedded value becomes a client spec. +- Keep the runner free of a new `kind`: a reference rides as a `callback` spec, an embed as a + `client` spec. +- Keep secrets and connection auth server-side for the reference (callback) case, the same + safety property gateway tools have. + +## Non-goals + +- Back-compat. This is POC / pre-production; we may change the union and the wire freely. +- A `workflow` tool variant. A referenced workflow is just a workflow; no new tool type in the + discriminated union. +- **Platform tools as workflows.** Platform tools belong in the **existing tools endpoints** + (the same place gateway tools are added), not in the workflow catalog. The `_agenta.*` + tool-workflow / catalog-validation direction is dropped from this design. +- The `is_tool` flag. 
It is a later, FE-only display hint so referenced workflows surface in + the tool picker; it is noted, not designed here. +- Building the workflow-authoring UI for tools. This design assumes a workflow revision exists; + producing it is a separate surface. +- Changing the generic resolver's contract. It already inlines `@ag.embed` and walks `tools[]`; + the only addition is teaching it to **leave** an `@ag.reference` in place (a "leave it" + branch, not tool-aware logic). It does not gain any tool knowledge. +- A new vault or connection concept. Referenced (callback) workflow tools reuse the existing + named-secret and connection resolution. +- MCP. `mcp_servers` is a sibling field with its own deferral; out of scope here. + +## The reviewer's ask, restated + +Add an embed/reference arm to `tools` so a tool can be created as a workflow and pointed at. Two +mechanisms must meet: the **pointing** mechanism (the resolver, generic, now with two syntaxes) +and the **tool-ness** mechanism (in `resolve_tools`: the pointed-at workflow has to end up as a +tool the agent can call or a client tool the browser fulfills). + +## The two syntaxes: embed vs reference + +A skill is always passive content (markdown + files; the model reads it, nothing executes), so +it only ever needs **embedding** — inline the value. A tool is not uniform: it can be a runnable +workflow you want to **call**, or a non-runnable client tool that is just a **value**. So tools +need both syntaxes, and **the syntax the author writes decides the behavior**: + +- **`@ag.reference`** (new) — the resolver **leaves the reference in the config**. You reference + a workflow *because you want to call it* (a completion, an agent, a channel, a chain — + anything the platform can run). 
`resolve_tools` turns the kept reference into a + `CallbackToolSpec`: when the model calls the tool, the call routes server-side and Agenta + **invokes the workflow revision**, exactly like a gateway tool — the sidecar/runner relays the + call back, the service runs it, the result returns to the model. Execution and any + connections/secrets stay server-side. This rides on the existing `callback` executor — **no + new runner `kind`**. + +- **`@ag.embed`** (existing) — the resolver **inlines the value**. You embed when the referenced + thing is a non-runnable client tool: there is nothing to call server-side, so the resolver + resolves the reference into its value (**a concrete `client` tool config**). `resolve_tools` + sees that concrete config and produces a `client` spec, and at run time the model's call is + fulfilled client-side next turn — the existing `client` path. + +So **the syntax decides the behavior**, and the choice is made by the author at config time, not +inferred server-side. The decision boundary is clean: the **generic resolver** does inline-vs-leave; +**`resolve_tools`** does the tool-specific mapping (kept reference → callback spec; embedded value +→ client spec). See [plan.md](plan.md) for the concrete shape. diff --git a/docs/design/agent-workflows/projects/embedref-tools/plan.md b/docs/design/agent-workflows/projects/embedref-tools/plan.md new file mode 100644 index 0000000000..e68475fcb2 --- /dev/null +++ b/docs/design/agent-workflows/projects/embedref-tools/plan.md @@ -0,0 +1,187 @@ +# Plan + +Let the agent config `tools` field point at a **workflow** via one of two syntaxes — embed or +reference — so any workflow can be used as a tool. POC / pre-production: no back-compat. + +## The model: two syntaxes, the syntax decides the behavior + +The old plan had two competing options (embed-as-content vs reference-as-tool) and a special +`workflow` tool variant. That was over-built. 
An earlier revision then said "the resolver inlines +everything and the runnable/not decision is made server-side in a resolve step." Per the author's +review ([3473648119](https://github.com/Agenta-AI/agenta/pull/4837#discussion_r3473648119)) that +is also not the right shape: don't infer the behavior server-side by inspecting the target. Use +**two syntaxes** and let the author's choice decide. + +**A tool is just a workflow.** You point `tools[i]` at a workflow with one of two markers: + +- **`@ag.reference`** (new) — keep the reference. You reference a workflow *because you want to + call it* (a completion, an agent, a channel, a chain — anything the platform can run). The + generic resolver **leaves the reference in the config** (it does not inline it). `resolve_tools` + later turns the kept reference into a `CallbackToolSpec`: the model's call routes server-side + and Agenta **runs the workflow revision**, exactly like a gateway tool. The sidecar relays the + call back; the service invokes; the result returns to the model. Secrets and connections the + workflow needs stay server-side. +- **`@ag.embed`** (existing) — inline the value. You embed when the referenced thing is a + non-runnable client tool: there is nothing to call server-side. The generic resolver + **resolves the reference into its value** — a concrete `client` tool config (name, description, + input schema). `resolve_tools` sees that concrete config and produces a `client` spec; at run + time the model's call is fulfilled client-side next turn, the existing `client` path. + +So **the syntax decides the behavior**: `@ag.reference` → server-side callback execute; +`@ag.embed` → inline-to-value + the existing client handling. The author makes this choice at +config-authoring time. It maps to runnable-vs-not (reference a runnable workflow you want to call; +embed a non-runnable client tool that is a value), but the design does **not** inspect the target +to decide — the marker is authoritative. 
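The "marker is authoritative" rule can be illustrated with a minimal Python sketch. Only the two marker strings come from the design; the function name `tool_behavior`, the returned labels, and the example slugs are hypothetical and exist purely for illustration.

```python
from typing import Any, Dict

EMBED_MARKER = "@ag.embed"          # existing marker: inline the value
REFERENCE_MARKER = "@ag.reference"  # proposed marker: keep the reference


def tool_behavior(entry: Dict[str, Any]) -> str:
    """Classify a tools[i] entry by the marker the author wrote.

    The target workflow is never inspected; the marker alone decides.
    """
    if REFERENCE_MARKER in entry:
        return "callback"  # server-side execute of the workflow revision
    if EMBED_MARKER in entry:
        return "client"    # inlined to a concrete client tool config
    return "concrete"      # an ordinary builtin/gateway/code/client config


# Hypothetical entries; the slugs are made up for this sketch.
reference_tool = {REFERENCE_MARKER: {"@ag.references": {"workflow": {"slug": "summarize"}}}}
embed_tool = {EMBED_MARKER: {"@ag.references": {"workflow": {"slug": "pick-color"}}}}
```

The point of the sketch is what it leaves out: there is no lookup of the referenced workflow and no "is it runnable?" check, which matches the decision not to infer behavior server-side.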
+ +### The decision boundary: generic resolver vs `resolve_tools` + +The clean separation the author asked for: + +- The **generic resolver** (SDK `ResolverMiddleware` + the API embed resolver) knows only two + operations and **nothing about tools**: inline the value (`@ag.embed`) or leave the reference + (`@ag.reference`). It is the same recursive walker that already handles skills; it gains one + "leave it" branch for the reference syntax. +- **`resolve_tools`** (where tool configs are already partitioned by type) owns all + tool-specific logic: a kept `@ag.reference` becomes a `CallbackToolSpec` + the shared + `ToolCallback`; an `@ag.embed`-resolved concrete `client` config becomes a `client` spec. + +This keeps embedding/referencing reusable for any field (skills, tools, future fields) while the +"these are tools" knowledge stays in one place. + +### Any workflow qualifies — there is no "tool workflow" type + +An invocable tool *is* a workflow. There is no need for a workflow specially marked as a tool. +Any workflow type — agent, completion, channel, chain — can be referenced as a tool. We do +**not** add a `WorkflowToolConfig` variant. + +Later (note only, out of scope here): add an `is_tool` flag on a workflow purely so the +frontend can list it in the tool picker. It is a display hint; it changes no runtime behavior. + +### One unifying rule for the sidecar + +Point at everything as a tool. **In the sidecar, if the entry is a kept reference, run it (the +callback executes the referenced workflow); if it is an embedded value, it is a concrete `client` +tool config and the model is fulfilled the client way.** That single rule covers both cases +without branching the wire by tool kind: + +- `@ag.reference` → the callback executes the workflow and returns the result; +- `@ag.embed` → the inlined `client` config rides as a `client` spec and is fulfilled in the + browser. 
+ +## What each syntax produces + +The resolver is **already generic** and already walks `tools[]` (see [research.md](research.md)) +— this is the one genuinely-useful research finding and it still holds. Embed resolution runs in +the SDK `ResolverMiddleware`, which today inlines every `@ag.embed`. The one resolver addition is +a "leave it" branch so an `@ag.reference` passes through untouched. `resolve_tools` then maps each +form: + +- **`@ag.reference`** → a kept reference. The config carries a workflow reference (slug, optional + version) plus the model-facing surface (name, description, input schema). `resolve_tools` turns + it into a `CallbackToolSpec` whose `call_ref` encodes the workflow identity, plus the shared + `ToolCallback` pointing at a server-side execute target. The runner needs **no new `kind`** — + `callback` already dispatches everywhere (direct, Daytona relay, Pi native, the Claude + `agenta-tools` bridge). +- **`@ag.embed`** → an inlined value. The resolver resolves the reference into a concrete + `client` tool config (name, description, input schema) *before* `resolve_tools` runs, so + `resolve_tools` sees a plain `ClientToolConfig` and produces a `client` spec. At run time it is + the existing `client` path: the runner returns a `client` spec, the browser fulfills it next + turn. No callback, no server-side execute. + +Where the tool-specific decision is made: in **`resolve_tools`**, which already partitions tool +configs by type. The kept `@ag.reference` is the only new arm it has to recognize; the embed case +arrives as an already-concrete `client` config. 
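As a rough sketch of the mapping described above, the following Python shows how a kept reference could become a callback spec and an embedded client config a client spec. The dataclasses are stand-ins, not the SDK models, and their field names are assumptions. Placing the model-facing surface (`name`, `description`, `input_schema`) as sibling keys of the marker is also an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class CallbackToolSpec:
    """Stand-in for the SDK model; field names are illustrative."""
    name: str
    description: str
    input_schema: Dict[str, Any]
    call_ref: str
    kind: str = "callback"


@dataclass
class ClientToolSpec:
    """Stand-in for the SDK model; field names are illustrative."""
    name: str
    description: str
    input_schema: Dict[str, Any]
    kind: str = "client"


def build_call_ref(slug: str, version: Optional[str] = None) -> str:
    """Encode the workflow identity as workflow.{slug}[.{version}]."""
    return f"workflow.{slug}.{version}" if version else f"workflow.{slug}"


def map_tool_entry(entry: Dict[str, Any]) -> Union[CallbackToolSpec, ClientToolSpec]:
    """Map one parsed tools[i] entry to its resolved spec."""
    surface = {key: entry[key] for key in ("name", "description", "input_schema")}
    if "@ag.reference" in entry:
        # Kept reference: carry only the identity; nothing is fetched here.
        workflow = entry["@ag.reference"]["@ag.references"]["workflow"]
        call_ref = build_call_ref(workflow["slug"], workflow.get("version"))
        return CallbackToolSpec(call_ref=call_ref, **surface)
    if entry.get("type") == "client":
        # Embedded value: already a concrete client config by this point.
        return ClientToolSpec(**surface)
    raise ValueError("only the reference and client arms are sketched here")


referenced = map_tool_entry({
    "@ag.reference": {"@ag.references": {"workflow": {"slug": "summarize", "version": "3"}}},
    "name": "summarize",
    "description": "Summarize a document",
    "input_schema": {"type": "object"},
})
```

The shared `ToolCallback` is omitted on purpose: per the design it is assembled once per resolved set, not per tool.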
+
+## Resolution path, end to end
+
+```text
+author commits agent config; tools[i] is @ag.embed OR @ag.reference
+        |
+SDK ResolverMiddleware: _has_embed_markers(parameters) true (walks lists)
+        |  POST {api}/workflows/revisions/resolve
+API generic resolver:
+        |-- @ag.embed      -> inline the referenced value into tools[i]
+        |                     (a concrete `client` tool config)
+        '-- @ag.reference  -> LEAVE the reference in tools[i] (do not inline)
+        v
+_agent: AgentConfig.from_params(...) parses tools[i]
+        |  (a kept @ag.reference is a reference arm; an embedded value is a ClientToolConfig)
+        |
+resolve_tools(agent_config.tools): tool-specific mapping
+        |-- kept @ag.reference  -> CallbackToolSpec(call_ref="workflow.{slug}")
+        |                          + the shared ToolCallback to the execute target
+        '-- embedded client cfg -> ClientToolSpec
+        v
+/run wire: customTools[i] = {kind:"callback", callRef:"workflow.{slug}", ...} OR {kind:"client", ...}
+        |
+runner dispatch:
+        |  callback -> model calls -> POST /tools/call -> API invokes the workflow revision
+        |  client   -> returned to the browser, fulfilled next turn
+        v
+result -> back to the model
+```
+
+## The seams
+
+| Seam | File | Change |
+| --- | --- | --- |
+| Strict schema arms | `sdks/python/agenta/sdk/utils/types.py` | add an embed arm (mirror `_SkillEmbedRefSchema`) **and** a reference arm to `AgentConfigSchema.tools` so both an `@ag.embed` and an `@ag.reference` tool validate in the playground |
+| Generic resolver "leave it" branch | SDK `ResolverMiddleware` + `api/oss/src/core/embeds/utils.py` | teach the generic resolver to **leave** an `@ag.reference` in place (inline only `@ag.embed`). Still tool-agnostic: no tool knowledge added |
+| `resolve_tools` reference arm | `sdks/python/agenta/sdk/agents/tools/resolver.py` + a platform resolver in `.../platform/` | partition out a kept `@ag.reference`; resolve it to a `CallbackToolSpec` + the shared `ToolCallback`, mirroring the gateway path.
The embed case arrives as a plain `ClientToolConfig` and needs no new arm | +| Server-side execute | `api/oss/src/apis/fastapi/tools/router.py` (+ core) | a `/tools/call`-style target that parses the `workflow.*` `call_ref`, invokes the referenced workflow revision with the model's arguments, and returns the result envelope | +| Wire | `services/agent/src/protocol.ts`, `sdks/python/agenta/sdk/agents/utils/wire.py`, golden fixtures | **no new field** — a reference rides as a `callback` spec, an embed as a `client` spec; only `call_ref` content is new | + +`call_ref` grammar for a referenced workflow: an opaque slug, e.g. `workflow.{slug}` or +`workflow.{slug}.{version}`. Distinct from the Composio 5-segment grammar +(`tools.{provider}.{integration}.{action}.{connection}`). The runner treats `call_ref` as +opaque; only the server-side parser must agree. `ResolvedToolSet` keeps its single shared +`tool_callback`; if both gateway and workflow tools are present, one endpoint routes by +`call_ref` prefix (`tools.*` vs `workflow.*`) — the smaller change, keeps the wire stable. + +## Out of scope (explicitly dropped from the old plan) + +- **No `workflow` tool variant.** A referenced workflow is just a workflow; no + `WorkflowToolConfig` in the discriminated union, no `"workflow"` `type` allowlist entry. +- **No platform tools as workflows.** Platform tools belong in the **existing tools + endpoints** (the same place gateway tools are added), not in the workflow catalog. Drop the + `_agenta.*` tool-workflow / `_validate_catalog` generalization direction entirely. (PR #4837 + review [3470356903](https://github.com/Agenta-AI/agenta/pull/4837#discussion_r3470356903).) +- **No Option A / Option B split.** There is one path; the only branch is the author's syntax + (`@ag.embed` vs `@ag.reference`). +- **`is_tool` flag** is a later, FE-only display hint — not built here. 
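The `call_ref` prefix routing described under "The seams" (`tools.*` vs `workflow.*`) might look roughly like the sketch below. The function and type names are hypothetical, not existing API code. The parser also assumes that a workflow slug contains no dots. If slugs may contain dots (as `_agenta.*` slugs do), the optional version segment becomes ambiguous and a real parser would need an explicit rule for it.

```python
from typing import NamedTuple, Optional


class WorkflowRef(NamedTuple):
    """Parsed identity of a referenced workflow (illustrative type)."""
    slug: str
    version: Optional[str]


def route_call_ref(call_ref: str) -> str:
    """Pick the server-side handler from the call_ref prefix."""
    prefix = call_ref.split(".", 1)[0]
    if prefix == "tools":
        return "gateway"
    if prefix == "workflow":
        return "workflow"
    raise ValueError(f"unrecognized call_ref prefix: {call_ref!r}")


def parse_workflow_call_ref(call_ref: str) -> WorkflowRef:
    """Parse workflow.{slug} or workflow.{slug}.{version}.

    Assumes dot-free slugs; see the caveat in the text above.
    """
    parts = call_ref.split(".")
    if parts[0] != "workflow" or len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"malformed workflow call_ref: {call_ref!r}")
    version = parts[2] if len(parts) == 3 else None
    return WorkflowRef(slug=parts[1], version=version)
```

Because the runner treats `call_ref` as opaque, only this server-side parser and the `resolve_tools` code that builds the string have to agree on the grammar.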
+ +## Test plan + +- **SDK unit:** both `tools` arms validate (mirror the skills schema test) — a `@ag.embed` tool + and a `@ag.reference` tool; a kept `@ag.reference` resolves to the expected `CallbackToolSpec` + + `ToolCallback`; an `@ag.embed`-resolved concrete `client` config produces a `client` spec. +- **Schema:** `AgentConfigSchema` JSON Schema emits both the embed and reference `oneOf` arms in + `tools`; `CATALOG_TYPES["agent_config"]` still dereferences. +- **Generic resolver:** an `@ag.embed` in `tools[i]` is inlined to its value; an `@ag.reference` + in `tools[i]` is **left in place** (not inlined); cycle/depth guards hold for both. +- **`resolve_tools`:** a kept reference becomes a callback-bound `CallbackToolSpec`; an embedded + `client` config becomes a `client` spec. +- **Wire / golden:** a golden `/run` fixture with a referenced workflow tool (a `callback` spec) + and one with an embedded client tool (a `client` spec); `protocol.ts` Zod accepts both. +- **Execute endpoint:** a `/tools/call` with a `workflow.*` `call_ref` invokes the revision and + returns the result; the workflow's secrets/connections stay server-side. +- **Live matrix (agent-workflows-qa):** force a referenced workflow tool with an unguessable + token across pi_core / claude on local + Daytona + SDK; a pass proves it ran server-side and + the result reached the model. Pin a green cell with agent-replay-test. + +## Rollout + +POC, no flag needed for the schema arms (additive; the resolver already handles the embed arm +and gains a small "leave it" branch for the reference arm). The execute endpoint is new server +surface. Keep docs in sync in the same implementation PR (`documentation/tools.md`, +`interfaces/public-edge/agent-config-schema.md`, the interface inventory). + +## Build order (when implemented) + +1. Schema arms — `tools` accepts a `@ag.embed` tool and a `@ag.reference` tool; both validate in + the playground. +2. 
Generic resolver — add the "leave it" branch so `@ag.reference` passes through uninlined while + `@ag.embed` keeps inlining to its value. +3. `resolve_tools` — a kept `@ag.reference` → `CallbackToolSpec` + the execute endpoint; the + embedded `client` config already lands as a `client` spec. +4. (Later, FE) `is_tool` flag so referenced workflows surface in the tool picker. diff --git a/docs/design/agent-workflows/projects/embedref-tools/research.md b/docs/design/agent-workflows/projects/embedref-tools/research.md new file mode 100644 index 0000000000..d89acd1e6d --- /dev/null +++ b/docs/design/agent-workflows/projects/embedref-tools/research.md @@ -0,0 +1,212 @@ +# Research + +How `skills` embedding works today, the tool taxonomy, and the exact seams to mirror for +`tools` — under the **two-syntax** model (embed vs reference). Everything below is grounded in +the current code; file paths are absolute-from-repo. + +## Part 1 — How `@ag.embed` embedding works (the skills case), and what `@ag.reference` adds + +### There is no `EmbedRef` model — an embed is a structural marker + +An embed is a plain dict whose marker key is `@ag.embed`, recognized by a recursive walker. +There is no dedicated Pydantic class for it on the runtime path. + +- SDK marker: `sdks/python/agenta/sdk/middlewares/running/resolver.py` — + `_AG_EMBED_MARKER = "@ag.embed"`. +- API markers: `api/oss/src/core/embeds/utils.py` — + `AG_EMBED_KEY = "@ag.embed"`, `AG_REFERENCES_KEY = "@ag.references"`, + `AG_SELECTOR_KEY = "@ag.selector"`. + +**Confirmed: there is no reference-only marker today.** `@ag.references` and `@ag.selector` are +strictly **sub-keys inside an `@ag.embed` block** — they are not standalone top-level markers, +and the embed resolver always *inlines* the resolved value. The two-syntax model needs a **new +top-level marker** (e.g. `@ag.reference`, singular) that the same recursive walker recognizes but +treats as "leave in place" instead of "inline." 
It reuses the same inner `@ag.references` /
+`@ag.selector` shape to name the target; only the inline-vs-leave behavior differs.
+
+The canonical object-embed shape (the form `skills` uses):
+
+```jsonc
+{
+  "@ag.embed": {
+    "@ag.references": { "workflow": { "slug": "_agenta.agenta-getting-started" } },
+    "@ag.selector": { "path": "parameters.skill" }
+  }
+}
+```
+
+`@ag.references` is `Dict[str, Reference]` keyed by entity type (`workflow`,
+`workflow_revision`, ...). The inner `Reference` / `Selector` DTOs are in
+`sdks/python/agenta/sdk/models/shared.py` (`Reference(id, slug, version)`,
+`Selector(key, path)`). A bare `workflow` key is an **artifact-level** lookup (latest
+revision); the comment in the default-config builder is load-bearing: referencing the
+artifact (`workflow.slug`) resolves to the latest revision, while a bare *revision* slug with
+no version returns 500.
+
+### The resolver is generic and runs BEFORE the agent handler
+
+Two layers:
+
+1. **SDK middleware**: `sdks/python/agenta/sdk/middlewares/running/resolver.py`.
+   `ResolverMiddleware.__call__` checks `_has_embed_markers(parameters)` (recursive: descends
+   dicts, **lists**, and strings) and, if any embed is present and the `resolve` flag is on,
+   POSTs `parameters` to `{api}/workflows/revisions/resolve` and replaces them with the
+   resolved result. Its own comment says: *"The embed resolver walks arrays, so an
+   `@ag.embed` inside `parameters.skills[i]` resolves on either path."* The same is true of
+   `parameters.tools[i]`. Under the two-syntax model the marker check also recognizes
+   `@ag.reference`, but the resolve pass **leaves that node untouched** (inline-vs-leave is the
+   only difference); the walk and the list-descent are unchanged.
+
+2. **API generic resolver**: `api/oss/src/core/embeds/utils.py`, `resolve_embeds(...)`.
+ It deep-copies the config and loops up to `max_depth`, each pass calling + `find_object_embeds(...)` (a recursive walker that records an `ObjectEmbed{location, + references, selector}` for every dict carrying `@ag.embed`, and **recurses into list items + and dict values otherwise**). For each embed it: resolves the references via a callback, + applies the `@ag.selector` `path` to the resolved revision's `data` + (`_extract_with_sdk_resolver`, using the SDK `resolve_any`), and `set_path(...)` + substitutes the extracted value back at the embed's location. Cycle / depth / count + guards exist (`CircularEmbedError`, `MaxDepthExceededError`, `MaxEmbedsExceededError`). + +The resolver callback routes a `workflow` reference to +`workflows_service.fetch_workflow_revision(...)` (`api/oss/src/core/embeds/service.py` wires +`EmbedsService` to the same catalog-aware `WorkflowsService`). + +**Ordering in the agent run path** (`services/oss/src/agent/app.py`, `_agent`): + +1. Resolution — done by the SDK middleware against `parameters`, before `_agent` is even + called. `@ag.embed` nodes are inlined to their value; `@ag.reference` nodes are **left in + place**. +2. `agent_config = AgentConfig.from_params(params, ...)` — parses the config. An inlined + `@ag.embed` is now a concrete tool config; a kept `@ag.reference` parses as the reference arm. +3. `resolved_tools = await resolve_tools(agent_config.tools)` — sees concrete tool configs **and** + any kept `@ag.reference` arms. + +**Implication:** an `@ag.embed` in `tools[i]` is inlined at step 1 with only a tiny resolver +addition (the "leave it" branch for the sibling `@ag.reference` marker — embed inlining itself is +unchanged). A kept `@ag.reference` survives to step 3, where `resolve_tools` does the +tool-specific mapping. The work is two schema arms (step 2) plus the `resolve_tools` reference arm +(step 3); the generic resolver gains only the "leave it" branch. 
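To illustrate the inline-vs-leave behavior of the generic walker, here is a minimal single-pass sketch. It is not the code in `resolve_embeds`: the name `resolve_node` and the `fetch_value` callback are hypothetical, and the multi-pass loop with its cycle, depth, and count guards is deliberately left out.

```python
import copy
from typing import Any, Callable, Dict

EMBED_MARKER = "@ag.embed"
REFERENCE_MARKER = "@ag.reference"


def resolve_node(node: Any, fetch_value: Callable[[Dict[str, Any]], Any]) -> Any:
    """Walk dicts and lists; inline embeds, leave references untouched.

    `fetch_value` stands in for "resolve the references and apply the
    selector"; it receives the body of the @ag.embed block.
    """
    if isinstance(node, dict):
        if EMBED_MARKER in node:
            # Inline: replace the whole node with the resolved value.
            return copy.deepcopy(fetch_value(node[EMBED_MARKER]))
        if REFERENCE_MARKER in node:
            # Leave it: the node passes through unchanged, with no tool knowledge.
            return node
        return {key: resolve_node(value, fetch_value) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_node(item, fetch_value) for item in node]
    return node


def fake_fetch(embed_body: Dict[str, Any]) -> Dict[str, Any]:
    """Pretend lookup that always yields a concrete client tool config."""
    return {"type": "client", "name": "pick_color"}


parameters = {
    "tools": [
        {EMBED_MARKER: {"@ag.references": {"workflow": {"slug": "pick-color"}}}},
        {REFERENCE_MARKER: {"@ag.references": {"workflow": {"slug": "summarize"}}}},
    ]
}
resolved = resolve_node(parameters, fake_fetch)
```

After this pass `resolved["tools"][0]` is a plain client config and `resolved["tools"][1]` still carries the `@ag.reference` marker, which is the form `resolve_tools` expects at step 3.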
+ +### The `_agenta.*` platform catalog short-circuit (background only) + +`api/oss/src/core/workflows/platform_catalog.py` defines `PlatformWorkflowCatalog`, a +code-defined, read-only set of platform workflows keyed by a reserved `_agenta.*` slug. +`WorkflowsService.fetch_workflow_revision` calls `_resolve_platform_revision` *first*; a +reserved slug never falls through to Postgres. This is how the default skill embed resolves +(`_agenta.agenta-getting-started`). + +**Not relevant to this design's scope.** Per the PR #4837 review, platform *tools* do **not** +go in this catalog — they belong in the existing tools endpoints (like gateway). So this design +does **not** touch `_validate_catalog` (which today validates catalog payloads as `SkillConfig`) +and does not ship `_agenta.*` tool workflows. User-authored workflows referenced as tools live +in the DB and never hit this validation. + +### Where the union lives (skills, the template to copy) + +- Runtime `AgentConfig.skills`: `sdks/python/agenta/sdk/agents/dtos.py` — + `skills: List[SkillConfig]` (NOT a union; embeds are already resolved by the time it + parses). A `@field_validator("skills", mode="before")` coerces. +- Strict `AgentConfigSchema.skills`: `sdks/python/agenta/sdk/utils/types.py` — + `List[Union["SkillConfigSchema", "_SkillEmbedRefSchema"]]`. The embed arm is + `_SkillEmbedRefSchema` with `embed: Dict[str, Any] = Field(alias="@ag.embed")` and + `extra="forbid"`. This is the exact arm to mirror for the tools **embed** arm. Tools add one + more arm — a `_ToolReferenceSchema` with `reference: Dict[str, Any] = Field(alias="@ag.reference")` + — for the kept-reference syntax. Skills do not need it (a skill is always a value). +- Default config: `build_agent_v0_default(...)` in + `sdks/python/agenta/sdk/utils/types.py` ships the skill `@ag.embed` block. + +## Part 2 — The tool taxonomy (what each syntax must become) + +### Two lives, three axes + +`documentation/tools.md` is the canonical reference. 
A tool has a **declared config**
+(`AgentConfig.tools`, portable, no secrets) and a **resolved spec** (the `/run` wire, secrets
+injected, endpoints filled). Three orthogonal axes: **executor** (`type` at config time,
+`kind` at runtime), **`needs_approval`**, **`render`**.
+
+Declared `type` -> resolved `kind`:
+
+| Declared `type` | Resolved form | Resolved `kind` | Who executes / where |
+| --- | --- | --- | --- |
+| `builtin` | a bare name | (none) | the harness, natively |
+| `gateway` | `CallbackToolSpec` + `call_ref` | `callback` | the Agenta service, via `POST /tools/call` |
+| `code` | `CodeToolSpec` + `env` | `code` | the runner, local subprocess |
+| `client` | `ClientToolSpec` | `client` | the browser, next turn |
+
+Models: `sdks/python/agenta/sdk/agents/tools/models.py`
+(`ToolConfigBase`, the four `*ToolConfig`, the `ToolConfig = Annotated[Union[...],
+Field(discriminator="type")]`, and the resolved `CallbackToolSpec` / `CodeToolSpec` /
+`ClientToolSpec` discriminated by `kind`; `ResolvedToolSet{builtin_names, tool_specs,
+tool_callback}`). The TS twin is `ResolvedToolSpec` in `services/agent/src/protocol.ts`.
+
+### How resolution + dispatch work
+
+- SDK `ToolResolver.resolve` (`sdks/python/agenta/sdk/agents/tools/resolver.py`) partitions
+  configs by `isinstance`, resolves code secrets via a `ToolSecretProvider`, resolves gateway
+  configs via a `GatewayToolResolver` (which returns the `CallbackToolSpec` list **and** the
+  single shared `ToolCallback`), and returns a `ResolvedToolSet`.
+- Platform composition `resolve_tools` (`sdks/python/agenta/sdk/agents/platform/resolve.py`)
+  wires the Agenta adapters (`AgentaNamedSecretProvider`, `AgentaGatewayToolResolver`).
+- The gateway adapter (`sdks/python/agenta/sdk/agents/platform/gateway.py`) POSTs to + `POST /tools/resolve`, gets a `call_ref` slug + `tools.{provider}.{integration}.{action}.{connection}`, wraps each in a `CallbackToolSpec`, + and assembles one `ToolCallback(endpoint="{api}/tools/call", authorization=...)`. +- Runner dispatch `runResolvedTool` (`services/agent/src/tools/dispatch.ts`) branches on + `kind`: `code` runs locally; `client` throws (browser-fulfilled); **`callback` (default) + POSTs back to `/tools/call`** (directly, or via the Daytona file relay). Absent `kind` + defaults to `callback`. + +### Why `callback` for `@ag.reference` and `client` for `@ag.embed` + +The branch is the **author's syntax** (see [plan.md](plan.md)), and the taxonomy already has a +home for each: + +- An **`@ag.reference`** workflow tool is **server-executed**: calling it means invoking another + Agenta workflow revision, which lives behind the API and may itself use connections and + secrets. That is exactly the gateway tool's safety shape — the harness decides *which* tool and + *with what arguments*, the service runs it, and no credential reaches the sandbox. So + `resolve_tools` maps it to a `CallbackToolSpec`. The runner needs **no new `kind`** — `callback` + already dispatches to `callAgentaTool`, works under the Daytona file relay, and is delivered to + both Pi (native) and Claude (the `agenta-tools` MCP bridge). The only difference from a gateway + tool is the `call_ref` grammar and the execute target: instead of a Composio action, the + service invokes a workflow revision. **Crucially, the reference is *not* inlined before + `resolve_tools` runs** — that is the whole point of the second syntax: the generic resolver + leaves it, so `resolve_tools` sees the kept reference (slug + version + the model-facing + surface) and builds the callback spec from it. 
The callback path never needs the *resolved + workflow artifact* at config time; it carries only the identity (`call_ref`) and resolves the + revision lazily, server-side, when the model actually calls the tool. +- An **`@ag.embed`** (client) workflow tool fits the existing **`client`** executor: the generic + resolver inlines the reference into a concrete `client` tool config *before* `resolve_tools` + runs, so `resolve_tools` sees a plain `ClientToolConfig` and the runner returns a `client` spec + for the browser to fulfill next turn (`models.py:206` — `kind: "client"`). No callback, no + server-side execute. + +This is what the earlier "keep the reference but it's already inlined" tension was about: with a +single `@ag.embed` syntax, a tool could not both stay a reference for callback resolution *and* be +inlined before `resolve_tools`. The two-syntax model removes the contradiction — embed inlines, +reference is kept — so each path sees exactly the form it needs. + +## Part 3 — The seams to touch (summary) + +| Seam | File | Change | +| --- | --- | --- | +| Strict schema arms | `sdks/python/agenta/sdk/utils/types.py` | add `_ToolEmbedRefSchema` (alias `@ag.embed`) **and** `_ToolReferenceSchema` (alias `@ag.reference`); make `AgentConfigSchema.tools` a `Union[ToolConfig-twin, _ToolEmbedRefSchema, _ToolReferenceSchema]` (the embed arm mirrors skills; the reference arm is new) | +| Generic resolver "leave it" branch | SDK `ResolverMiddleware` + `api/oss/src/core/embeds/utils.py` | recognize the new `@ag.reference` marker and **leave it in place** (inline only `@ag.embed`). Tool-agnostic — no tool knowledge added | +| `resolve_tools` reference arm | `sdks/python/agenta/sdk/agents/tools/resolver.py` + a platform resolver in `.../platform/` | partition out a kept `@ag.reference`; resolve it to a `CallbackToolSpec` + a `ToolCallback` to the new execute endpoint (mirror gateway). 
The embed case arrives as a plain `ClientToolConfig` — no new arm | +| Server-side execute | `api/oss/src/apis/fastapi/tools/router.py` (+ core) | a `/tools/call`-style target that invokes the referenced workflow revision and returns the result | +| Wire | `services/agent/src/protocol.ts`, `sdks/python/agenta/sdk/agents/utils/wire.py`, golden fixtures | **no new field** — a reference rides as a `callback` spec, an embed as a `client` spec; only the `call_ref` content is new | +| Docs | `documentation/tools.md`, `interfaces/public-edge/agent-config-schema.md`, the interface inventory | document both syntaxes + the syntax-decides-behavior model | + +No `WorkflowToolConfig` variant, no `compat.py` `"workflow"` allowlist entry, no +platform-catalog change — all dropped per the PR #4837 review. + +## Open research questions (carried into the plan) + +1. **The `@ag.reference` marker shape** — confirm it reuses the inner `@ag.references` / + `@ag.selector` block (same target-naming as `@ag.embed`) and differs only in the "leave it" + behavior; confirm the singular `@ag.reference` name. +2. **What does invoking the workflow mean** — call `/workflows/.../invoke` with the model's + arguments as inputs, and map the workflow output back as the tool result? What is the + input/output contract between a tool call and a workflow invoke? +3. **The `call_ref` grammar** for a runnable workflow tool (today's 5-segment gateway grammar + is Composio-specific and parsed in both `compat.py` and the API router). diff --git a/docs/design/agent-workflows/projects/embedref-tools/status.md b/docs/design/agent-workflows/projects/embedref-tools/status.md new file mode 100644 index 0000000000..024e56f8f6 --- /dev/null +++ b/docs/design/agent-workflows/projects/embedref-tools/status.md @@ -0,0 +1,137 @@ +# Status + +This is the source of truth for the project's progress, decisions, and open questions. + +## Current state + +- **Phase:** IMPLEMENTED (the lgtm'd two-syntax design #4837). 
Spun from PR #4821 review comment + [3469653315](https://github.com/Agenta-AI/agenta/pull/4821#discussion_r3469653315). +- **Docs:** README, context, research, plan, status written, then revised twice on PR #4837. + Iteration 2 simplified to one path branching on runnable-vs-not (dropped Option A/B, the + `workflow` tool variant, platform-tools-as-workflows). **Iteration 3** (per the author's + comment [3473648119](https://github.com/Agenta-AI/agenta/pull/4837#discussion_r3473648119)) + replaces the "infer runnable/not server-side in a resolve step" mechanism with **two syntaxes**: + `@ag.embed` (inline the value) and a new `@ag.reference` (keep the reference). The author's + syntax choice decides the behavior; the generic resolver stays tool-agnostic; the tool-specific + logic lives in `resolve_tools`. +- **Built (this slice):** + - SDK marker + config: `AG_REFERENCE_MARKER` and `ReferenceToolConfig` + (`type: "reference"`, `slug`/`version`/`name`/`description`/`input_schema`, `.call_ref` + `workflow.{slug}[.{version}]`) in `tools/models.py`; `compat.py` coerces the kept + `@ag.reference` marker into it. + - `resolve_tools` mapping: a new `WorkflowToolResolver` port + `AgentaWorkflowToolResolver` + platform adapter (`platform/workflow.py`) build a `CallbackToolSpec` + the shared + `ToolCallback`; `ToolResolver` partitions reference configs and reconciles the single + callback with gateway. The generic resolver stays tool-agnostic. + - Generic resolver "leave it" guard: `AG_REFERENCE_KEY` in `api/oss/src/core/embeds/utils.py` + — all three finders treat a kept `@ag.reference` node as opaque (the SDK `_has_embed_markers` + already ignores it, so an embed-free reference simply passes through). + - Strict schema arms: `_ToolEmbedRefSchema` + `_ToolReferenceSchema` on + `AgentConfigSchema.tools` (a union) in `utils/types.py`. 
+ - Server-side execute: `/tools/call` routes a `workflow.*` call_ref to `_call_workflow_tool` + (`api/oss/src/apis/fastapi/tools/router.py`), which invokes the workflow revision via + `WorkflowsService.invoke_workflow` (wired into `ToolsRouter` in `entrypoints/routers.py`). + - Wire: UNCHANGED. A reference rides as a `callback` spec, an embed as a `client` spec; only the + `call_ref` content (`workflow.*`) is new. Golden fixtures untouched. + - Tests: SDK (parsing/models/resolver/platform/catalog) + API (embeds leave-it + router + execute branch). Live end-to-end DEFERRED to the dedicated embedref live QA (after the gate). +- **Next:** CTO (JP) review of the PR; live end-to-end QA. + +## Design + +**A tool is just a workflow, pointed at via one of two syntaxes.** `tools[i]` carries either an +`@ag.embed` (inline the value) or an `@ag.reference` (keep the reference). Any workflow type +qualifies — agent, completion, channel, chain. There is **no `workflow` tool variant** and **no +"tool workflow" type**. **The author's syntax decides the behavior** (the decision is *not* +inferred server-side by inspecting the target): + +- **`@ag.reference`** (new — for a runnable workflow you want to *call*). The generic resolver + **leaves the reference in the config**. `resolve_tools` turns it into the existing **`callback`** + executor — a `CallbackToolSpec` whose `call_ref` encodes the workflow identity, plus the shared + `ToolCallback` to a server-side execute endpoint. The model's call routes back, the service + invokes the workflow revision, the result returns. Connections/secrets stay server-side, exactly + like a gateway tool. **No new runner `kind`.** +- **`@ag.embed`** (existing — for a non-runnable client tool that is a *value*). The generic + resolver **resolves the reference into its value** — a concrete `client` tool config. 
By the + time `resolve_tools` runs it is a plain `ClientToolConfig` and rides the existing `client` path + (fulfilled in the browser next turn). + +**The decision boundary:** the generic resolver (`ResolverMiddleware` + the API embed resolver) +knows only inline-the-value (`@ag.embed`) vs leave-the-reference (`@ag.reference`) and **nothing +about tools**; **`resolve_tools`** owns all tool-specific mapping (kept reference → callback spec; +embedded value → client spec). + +Unifying rule for the sidecar: point at everything as a tool; a kept reference is run (the +callback executes the workflow), an embedded value is a concrete `client` tool config fulfilled in +the browser. + +**Why `callback` for the reference case:** a referenced workflow tool is server-executed and may +use connections/secrets — exactly the gateway tool's safety shape. Resolving to a +`CallbackToolSpec` keeps every credential server-side and reuses the runner's existing callback +delivery (direct, Daytona relay, Pi native, Claude `agenta-tools` bridge). + +**Explicitly dropped across iterations** (per the author's PR #4837 reviews): + +- iteration 2: the Option A / Option B framing; the `WorkflowToolConfig` variant / the + `"workflow"` `type` allowlist entry (a tool is just a workflow); **platform tools as workflows** + (they go in the existing tools endpoints, like gateway, not the workflow catalog, so the + `_validate_catalog` generalization is gone); +- iteration 3: inferring runnable/not server-side in a resolve step — replaced by the + author-chosen syntax (`@ag.embed` vs `@ag.reference`). + +## Settled by research + +- The resolver is **generic and already walks `tools[]`** — embedding needs no resolver change, + and the new `@ag.reference` syntax adds only a small "leave it" branch (no tool knowledge). + (`ResolverMiddleware` + `api/oss/src/core/embeds/utils.py`.) This is the load-bearing finding + and it survives the reframe. 
+- There is **no reference-only marker today** — `@ag.references` / `@ag.selector` are sub-keys + inside an `@ag.embed`. The two-syntax model adds a new top-level `@ag.reference` marker. +- An `@ag.embed` resolves **before** `AgentConfig.from_params` and `resolve_tools` (so by tool + resolution it is concrete); an `@ag.reference` is **deliberately kept** so `resolve_tools` sees + it and builds the callback spec. +- The skills schema arm (`_SkillEmbedRefSchema`) is the template for the tools embed arm; the + reference arm (`_ToolReferenceSchema`) is new and tools-only. + +## Settled by the author (was open, now closed) + +- **Where the tool-specific decision lives** — in **`resolve_tools`**, not a server-side resolve + step that inspects the target. The generic resolver only does inline-vs-leave; the + reference-vs-embed choice is the author's, encoded in the syntax. (Closes the old "where the + runnable/not decision lives" question; resolves CodeRabbit's "intro reads settled but status + treats it as open" flag.) + +## Open questions for the user + +1. **Tool-call to workflow-invoke contract** — how do the model's tool arguments map to the + workflow's invoke inputs, and how does the workflow output map back to the tool result? + Free-form passthrough, or a declared input/output schema? +2. **`call_ref` grammar for referenced workflow tools** — `workflow.{slug}` / + `workflow.{slug}.{version}`? Today's gateway grammar + (`tools.{provider}.{integration}.{action}.{connection}`) is Composio-specific and parsed in + two places; a workflow tool needs its own opaque slug. +3. **The `@ag.reference` marker name/shape** — confirm the singular `@ag.reference` top-level + marker reusing the inner `@ag.references` / `@ag.selector` block (same target-naming as + `@ag.embed`, differing only in leave-vs-inline). +4. **Single shared callback endpoint vs per-spec callbacks** — `ResolvedToolSet` holds one + `tool_callback`. 
With both gateway and workflow tools present, route one endpoint by + `call_ref` prefix (smaller change, recommended) or grow the wire to per-spec callbacks? +5. **`is_tool` FE flag** — confirm it is deferred (later, display-only so referenced workflows + surface in the tool picker) and not part of this slice. +6. **Approval / render axes** — a referenced or embedded tool can carry `needs_approval` and + `render` like any tool; confirm no special handling is wanted (default: they compose as usual). + +## Risks / watch-fors + +- **One callback channel.** The single `tool_callback` is a real constraint if mixing tool + types; the prefix-routing answer (Q4) avoids a wire change. +- **Reference the artifact** (`workflow.slug`), not a bare revision slug with no version + (returns 500) — same gotcha skills have. +- **Two models, one contract.** The strict `AgentConfigSchema` and the permissive runtime + `AgentConfig` must move together (and a golden fixture), per agent-config-schema.md's + "watch for when changing." +- **New marker, two resolvers.** The `@ag.reference` marker must be recognized in **both** the + SDK `ResolverMiddleware` and the API embed resolver, and both must agree to *leave it* (not + inline). A miss in either inlines a reference and breaks the callback path. +- **Keep docs in sync** in the implementation PR: `documentation/tools.md`, + `interfaces/public-edge/agent-config-schema.md`, and the interface inventory.
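The prefix routing named under "One callback channel" (and in open question 4) could look roughly like the sketch below. It is illustrative only: `GatewayRef`, `WorkflowRef`, and `parse_call_ref` are hypothetical names, not the actual `/tools/call` router code. It also assumes that a workflow slug contains no dots, which this design does not confirm; a real grammar would need an escaping rule or a different separator if slugs can contain them.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class GatewayRef:
    """Hypothetical parsed form of tools.{provider}.{integration}.{action}.{connection}."""
    provider: str
    integration: str
    action: str
    connection: str


@dataclass(frozen=True)
class WorkflowRef:
    """Hypothetical parsed form of workflow.{slug}[.{version}]."""
    slug: str
    version: Optional[str] = None


def parse_call_ref(call_ref: str) -> Union[GatewayRef, WorkflowRef]:
    """Classify a call_ref by its first segment and parse the remainder.

    Assumption (not confirmed by the design): no segment contains a dot.
    """
    prefix, _, rest = call_ref.partition(".")
    parts = rest.split(".")

    if prefix == "tools":
        # Gateway grammar: exactly four non-empty segments after the prefix.
        if len(parts) != 4 or not all(parts):
            raise ValueError(f"malformed gateway call_ref: {call_ref!r}")
        return GatewayRef(*parts)

    if prefix == "workflow":
        # Workflow grammar: a slug, optionally followed by a version.
        if len(parts) not in (1, 2) or not all(parts):
            raise ValueError(f"malformed workflow call_ref: {call_ref!r}")
        version = parts[1] if len(parts) == 2 else None
        return WorkflowRef(slug=parts[0], version=version)

    raise ValueError(f"unknown call_ref prefix: {call_ref!r}")
```

With a single shared `tool_callback`, one endpoint can branch on the returned type: a `GatewayRef` goes down the existing gateway path, a `WorkflowRef` goes to the workflow invoke path, and anything else is rejected before any execution happens.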