docs(agent): frontend round-trip design (client tools, commit refresh, connections) by mmabrouk · Pull Request #4920 · Agenta-AI/agenta

mmabrouk · 2026-06-28T19:02:16Z

Context

This is the Part 2 design for the agent frontend round-trip, written for Arda to review the
approach before any frontend work starts. It is docs only.

An agent sometimes needs the human in the middle of a run. It rewrites its own config and wants
approval first. Or it needs a GitHub connection that does not exist yet, and the user must finish an
OAuth flow before the agent can continue. Both cases pause the run, show something in the playground,
wait for the user, and resume the agent with the result.

The finding that drives the design: the HITL approval flow we already ship is one half of this
round-trip, and the Vercel AI SDK ships the matching half. The runner's rule "forbid client tools"
becomes "emit the call and park." No new transport.

What the design proposes

One generic client-tool round-trip. A tool with no server execute is fulfilled by the browser.
The runner streams the call, parks the run, and waits. The playground renders a widget, the user
acts, and the result returns on the next turn as a tool_result keyed to the same call. This is the
permission flow generalized: the outbound side carries any tool call, the inbound side carries any
structured output. The park, the cold-replay, and the resume stay as they are.
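
A minimal sketch of the park-and-resume keying described above. The names `ToolCall`, `ToolResult`, `parkOnClientTool`, and `canResume` are hypothetical and exist only for illustration; they are not the runner's or the Vercel AI SDK's actual types. The sketch shows only the rule that a tool with no server execute parks the run, and that a result resumes it only when keyed to the same call.

```typescript
// Hypothetical shapes for illustration only; not the runner's real types.
interface ToolCall {
  toolCallId: string;
  toolName: string;
  input: unknown;
}

interface ToolResult {
  type: "tool_result";
  toolCallId: string; // must match the parked call
  output: unknown; // any structured output produced in the browser
}

type RunState =
  | { status: "running" }
  | { status: "parked"; call: ToolCall };

// A tool with no server-side execute is fulfilled by the browser:
// the runner emits the call and parks instead of rejecting it.
function parkOnClientTool(call: ToolCall, hasServerExecute: boolean): RunState {
  return hasServerExecute ? { status: "running" } : { status: "parked", call };
}

// The run resumes only when the incoming result is keyed to the parked call.
function canResume(state: RunState, result: ToolResult): boolean {
  return state.status === "parked" && state.call.toolCallId === result.toolCallId;
}
```

Under this sketch the approval flow and the connection flow differ only in `toolName` and in the shape of `output`; the keying rule is the same for both.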

request_connection as a non-runnable reference tool. The earlier draft modeled it as a
platform op. That was wrong. A platform op always binds to an HTTP method and path the runner calls,
and this tool has neither, because the browser does the work. The corrected model is a hard-coded
workflow the build kit embeds with @ag.embed, the same way it embeds the authoring skill. The embed
resolver inlines that embed into a browser-fulfilled client tool. So the round-trip needs no new
tool shape and no new flag.

The abandon flow resolved. An unsettled tool part hangs the resume, so the widget settles the
parked call on every terminal path, including abandon. It detects a closed popup and settles a
cancel result, with a timeout backstop if no terminal signal arrives within a bound. The chip then
reads "Connection not completed" with a Retry action, and the agent gets a definite failure result
so it can re-ask or move on. It never waits forever.
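
The "settle on every terminal path" rule can be sketched as a pure decision function plus a settle-once guard. Everything here is an assumption made for illustration: the `ConnectOutcome` shape, the function names, and the idea of polling are not taken from the design doc, which only requires that the parked call always receives a definite result.

```typescript
// Hypothetical outcome shape; the design only requires a definite result.
type ConnectOutcome =
  | { status: "connected" }
  | { status: "cancelled"; reason: "popup_closed" | "timeout" };

// Decide whether the connect flow has reached a terminal state.
// Returns null while the flow is still in progress.
function terminalOutcome(
  popupClosed: boolean,
  callbackReceived: boolean,
  elapsedMs: number,
  timeoutMs: number,
): ConnectOutcome | null {
  if (callbackReceived) return { status: "connected" };
  if (popupClosed) return { status: "cancelled", reason: "popup_closed" };
  if (elapsedMs >= timeoutMs) return { status: "cancelled", reason: "timeout" };
  return null;
}

// Settle the parked call at most once, whichever terminal path fires first.
// Returns true if this call settled it, false if it was already settled.
function createSettler(onSettle: (outcome: ConnectOutcome) => void) {
  let settled = false;
  return (outcome: ConnectOutcome): boolean => {
    if (settled) return false;
    settled = true;
    onSettle(outcome);
    return true;
  };
}
```

In a widget, a polling timer could feed `popup.closed` and the elapsed time into `terminalOutcome` and pass any non-null outcome to the settler. The settle-once guard means a late timeout cannot overwrite an earlier cancel or success.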

Build-kit alignment with #4917. request_connection joins the agent-template overlay as a
reference-tool entry, beside the authoring skill's @ag.embed, and is never committed. The doc now
uses the finalized container path additional_context.playground_build_kit.agent_template_overlay.
There is no run flag and no service-side injection.

Two review fixes. The cold-replay anchor is renamed from approvalKey to the neutral
parkedCallKey, since a client tool is not an approval gate. The connect widget must validate
event.origin against the Agenta API origin before it trusts the OAuth callback postMessage,
stated as a hard requirement, with a state binding as defense in depth.
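
A hedged sketch of the origin check with a state binding. The function name, the `state` field in the message payload, and the example origin are assumptions for illustration; only the requirement to compare `event.origin` against the Agenta API origin comes from the design. The event is typed structurally so the check can run outside a browser.

```typescript
// Structural subset of a browser MessageEvent (origin and data only).
interface CallbackMessage {
  origin: string;
  data: unknown;
}

// Trust the OAuth callback only if it comes from the expected API origin
// AND echoes the state value this widget generated for the flow.
function isTrustedOAuthCallback(
  event: CallbackMessage,
  apiOrigin: string, // assumption: the configured Agenta API origin
  expectedState: string, // assumption: a per-flow random value held by the widget
): boolean {
  // Hard requirement: exact origin match, no prefix or substring checks.
  if (event.origin !== apiOrigin) return false;

  // Defense in depth: the payload must carry the matching state.
  const data = event.data;
  if (typeof data !== "object" || data === null) return false;
  const state = (data as Record<string, unknown>)["state"];
  return typeof state === "string" && state === expectedState;
}
```

The exact string comparison matters: a `startsWith` or `includes` check would accept a look-alike origin such as one that merely begins with the API host name.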

Scope / risk

Docs only. No code changes. The design reuses the existing HITL park-and-resume machinery rather
than adding a transport, so the runtime risk lands at implementation time, not here. Out of scope and
recorded in the doc: the per-tool config-diff widget, runner auto-interception of a missing
connection, and the skill that teaches the discover-then-connect loop (owned by agent-skills). The
build-kit overlay itself is owned by default-agent-config (#4917); this doc references it and changes
neither.

How to QA

Read docs/design/agent-workflows/projects/agent-fe-roundtrip/design.md and status.md. Confirm the
approach reads correctly, in particular the request_connection reframe (the load-bearing claim that
an @ag.embed of a workflow tool already inlines to a browser-fulfilled client tool), the abandon
resolution, and the two review fixes. Inline focus-pointers mark the sections worth the most
attention.

…, connections) Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

vercel · 2026-06-28T19:02:21Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 28, 2026 9:01pm

coderabbitai · 2026-06-28T19:02:22Z

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main design change: a frontend round-trip for client tools, commit refresh, and connection handling.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The description matches the docs-only round-trip design changes and explains the two documented use cases.


mmabrouk · 2026-06-28T19:06:05Z

🤖 The AI agent says:

What this covers. One generic client-tool round-trip: the run pauses, the playground shows the user something, the user acts, and the run resumes with the result. It applies twice: the agent changing its own config, and the agent requesting a connection that does not exist. It widens the human-in-the-loop approval flow we already ship rather than inventing a transport.

Where to look. The locked decisions D1 to D5, Part 1b (the client-tool mechanism grounded in the Vercel AI SDK), and the owner split.

Feedback needed (two passes). First, please confirm the overall approach. Then this goes to Arda for an approach review, since most of the work and the central new surface are his. The items that most need eyes:

The client-tool registration and dispatcher: one execution-location flag flowing catalog to spec to wire to playground, with dispatch by render.kind then tool name.
The resume envelope and the generalized resume predicate: a settled client-tool output resumes the run, keyed to the parked call, carrying a reference and never the secret.
The result display choice: an inline status chip (U1, lean) versus a chat message (U2).
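
The dispatch precedence in the first item (render.kind first, then tool name) could look like the sketch below. The class, the widget identifiers, and the generic fallback id are hypothetical; the playground's real registry may differ.

```typescript
// Hypothetical view of a client tool part as seen by the playground.
interface ClientToolPart {
  toolName: string;
  render?: { kind: string };
}

// Hypothetical id of the fallback widget shown for unregistered tools.
const GENERIC_WIDGET = "generic-client-tool";

class ClientToolRegistry {
  private readonly byKind = new Map<string, string>();
  private readonly byName = new Map<string, string>();

  registerKind(kind: string, widget: string): void {
    this.byKind.set(kind, widget);
  }

  registerName(toolName: string, widget: string): void {
    this.byName.set(toolName, widget);
  }

  // Precedence: render.kind, then tool name, then the generic widget,
  // so an unknown tool still renders something instead of hanging.
  resolve(part: ClientToolPart): string {
    const kindMatch = part.render ? this.byKind.get(part.render.kind) : undefined;
    return kindMatch ?? this.byName.get(part.toolName) ?? GENERIC_WIDGET;
  }
}
```

Putting render.kind first lets several tools share one widget, while the name lookup covers tools that need a bespoke one.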

…kit aligned) Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

mmabrouk · 2026-06-28T19:43:38Z

🤖 The AI agent says: I rewrote this design from the first version, which was rejected for quality. The doc now turns on one idea: the human-in-the-loop round-trip we already ship is a client-tool round-trip. The runner's "forbid client tools" becomes "emit the call and park," and that single primitive carries both jobs: the agent committing its own config, and the agent requesting a connection. I re-grounded every claim against code and folded in two corrections (the platform op model has no executor field today, and a client tool needs no new Vercel frame). This PR had no prior review comments, so nothing was reopened.

Three items most need eyes:

Client-tool registration and dispatch. A client tool is identified two ways: implicitly by Vercel (providerExecuted falsy, no server output), and explicitly through a playground registry keyed by render.kind then name. An unknown client tool is an error surface with a generic widget, never a silent hang. Is one explicit registry the right contract, and is render.kind then name the right dispatch precedence?
The reference-only resume envelope. The result returns as a tool_result keyed to the parked call, carrying { connected, integration, slug } and never the secret. The runner re-resolves the credential from the project vault on resume. Worth checking whether the reference shape and the re-resolve path are right, and whether failure and cancel both settle the call.
The result display choice (U1 vs U2). U1 is an inline status chip in the tool-activity row, consistent with the approval UI and low noise. U2 is a plain chat message, more legible but noisier. The doc leans U1. This is Arda's call and worth a sanity check.
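
To make the reference-only envelope concrete, here is a sketch that builds the tool_result from whatever the connect widget received, copying only the three reference fields. The function name, the `raw` input, and the `toolCallId` field name are assumptions; only the `{ connected, integration, slug }` shape and the "never the secret" rule come from the description above.

```typescript
// The reference the agent receives; deliberately has no credential fields.
interface ConnectionReference {
  connected: boolean;
  integration: string;
  slug: string;
}

interface ResumeEnvelope {
  type: "tool_result";
  toolCallId: string; // assumption: the id of the parked call
  output: ConnectionReference;
}

// Build the envelope by allow-listing fields rather than spreading `raw`,
// so tokens or other secrets in the callback payload cannot leak through.
function toResumeEnvelope(
  parkedCallId: string,
  raw: Record<string, unknown>,
): ResumeEnvelope {
  return {
    type: "tool_result",
    toolCallId: parkedCallId,
    output: {
      connected: raw["connected"] === true,
      integration: String(raw["integration"] ?? ""),
      slug: String(raw["slug"] ?? ""),
    },
  };
}
```

A failed or cancelled flow would go through the same function with `connected` absent or false, so the parked call is settled on those paths too.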

Ready for Arda's approach review after Mahmoud's pass. Most of this is frontend, and the central new surface is Arda's.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

docs/design/agent-workflows/projects/agent-fe-roundtrip/design.md (1)

349-349: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Capitalize "GitHub".

Per standard product spelling and the static analysis hint, use "GitHub" not "github".

Source: Linters/SAST tools

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c4fb48c6-52ac-4d0e-8eb5-c81ebfd96ed8

📥 Commits

Reviewing files that changed from the base of the PR and between ebc4ec1 and 6045bba.

📒 Files selected for processing (2)

docs/design/agent-workflows/projects/agent-fe-roundtrip/design.md
docs/design/agent-workflows/projects/agent-fe-roundtrip/status.md

mmabrouk · 2026-06-28T20:24:01Z

+unchanged from the catalog to the wire to the playground. We do not scatter it across
+feature-named fields.
+
+Layer A, the platform catalog (us). `request_connection` is a hard-coded platform op beside


This doesn't look correct. request_connection is a client-side tool, so it is not a platform op: a platform op always maps to an API call. It is instead a platform tool, in the same way that our hard-coded platform skills are really hard-coded workflows. The way to represent it, probably in the inspect file, is as an embed in the playground build kit.

This is very important here, and you need to read the other PRs we're working on and coordinate with the orchestrator to make sure that this part is clear both to you and to the other PR.

🤖 The AI agent says: Adopted. request_connection is now a non-runnable reference tool: a hard-coded workflow embedded via @ag.embed, the same way a platform skill is embedded, which the embed resolver inlines into a browser-fulfilled client tool. It is not a platform op, because it has no method or path for the runner to call. In #4917 it appears in the agent-template overlay as a reference-tool entry, never committed. See the sections "request_connection: a non-runnable reference tool" and "Build-kit alignment."

mmabrouk · 2026-06-28T20:24:57Z

+
+The wire envelope is identical either way. This is pure presentation, and it is Arda's call.
+
+- Option U1, an inline status chip (lean). The connect interaction renders as a compact


Sounds good.

🤖 The AI agent says: Acknowledged, thank you. No change needed here.

…UX, security) Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

mmabrouk · 2026-06-28T21:03:37Z

+On every path the agent receives a definite result, so it can re-ask or move on rather than wait
+forever. Application 2 covers the connect-specific UX of the incomplete state.
+
+## `request_connection`: a non-runnable reference tool


🤖 The AI agent says: Focus here. The load-bearing reframe. The key claim to confirm: an @ag.embed of a workflow tool already inlines to a browser-fulfilled client tool, so this needs no new tool shape and no new flag.

mmabrouk · 2026-06-28T21:28:03Z

+
+## `request_connection`: a non-runnable reference tool
+
+This is the corrected model, and it follows ownership rather than transport.


I don't think this document needs all of these details. Again, please don't mention past iterations of the document within the document itself. We just need to say that the tool is represented as an embed and therefore follows the normal flow.

mmabrouk

lgtm

docs(agent): frontend round-trip design (client tools, commit refresh…

1b02515

…, connections) Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

mmabrouk added the needs-review Agent updated; awaiting Mahmoud's review label Jun 28, 2026

docs(agent): rewrite frontend round-trip design (plan-feature, build-…

6045bba

…kit aligned) Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

vercel Bot deployed to Preview June 28, 2026 19:43 View deployment

mmabrouk requested a review from ardaerzin June 28, 2026 20:19

coderabbitai Bot reviewed Jun 28, 2026
