feat(agent): client-tool round-trip backend (#4920) by mmabrouk · Pull Request #4925 · Agenta-AI/agenta

mmabrouk · 2026-06-28T22:18:44Z

Context

Some tools the agent uses require user interaction before they can complete. The OAuth connection flow is the main case: the agent asks the user to connect an integration (GitHub, Slack, etc.), the user finishes sign-in in the browser, and the run must resume with the result. The existing runner handled only approved/denied permission decisions and had no path for structured output from a client-side tool call.

This PR wires the backend side of that round-trip. The frontend slice (message-part dispatcher, connect widget, resume predicate, result UX, config refresh on commit) is a separate deliverable, coordinated with @ardaerzin.

Design: docs/design/agent-workflows/projects/agent-fe-roundtrip/design.md

Changes

TS runner (services/agent/):

responder.ts — Responder gains onClientTool(req). ApprovalDecisions now holds { output: unknown } in addition to permission decisions. parkedCallKey replaces approvalKey everywhere (the old name implied approval-only semantics).
permissions.ts — Before dispatching a tool, checks spec.kind === "client". If so, emits interaction_request(kind: "client_tool") and parks via responder.onClientTool. The tool call stays parked until the frontend sends a result.
dispatch.ts — Client tools no longer throw. They return a {type: "client_tool_pending"} signal; the permission responder owns the park.

SDK (sdks/python/):

adapters/vercel/messages.py — TOOL_OUTPUT_AVAILABLE_TYPES is now a set containing both "tool-output-available" and "TOOL_OUTPUT_AVAILABLE". The previous check was case-exact and dropped uppercase variants from some clients.
agents/platform/workflow.py — Registers __ag__request_connection as a hard-coded client tool that resolves to ClientToolSpec(kind="client", render={"kind": "connect"}). Agents that include this workflow as a tool get the connect affordance without any backend call.

API:

core/workflows/static_catalog.py — Registers __ag__request_connection under URI client:tool:request_connection:v0 alongside the getting-started skill.
apis/fastapi/tools/router.py and workflows/router.py — Both commit_revision paths now emit a one-way data-committed-revision event {variantId, revisionId, version} after a successful commit. The frontend uses this to refresh the build-kit state without a separate poll.

Scope / risk

This is the backend slice only. The interaction loop is incomplete until the frontend dispatches client_tool interaction events and sends results back. Until that lands, a kind: "client" tool call will emit the event and park but never resume. The agent will appear to hang at that point.

The _emit_data_event helper in both API routers resolves request.state.emit with a fallback to emit_event. If neither attribute exists (e.g., outside a streaming context), it exits silently. Reviewers should confirm the emit attribute is reliably present on request.state in the tool-call and workflow-commit code paths, or that silent failure is the right behavior.

The parkedCallKey rename is purely cosmetic. It does not change behavior. The test files were updated to match.

Tests

Vitest: responder.test.ts, sandbox-agent-orchestration.test.ts, sandbox-agent-permissions.test.ts updated for parkedCallKey rename and expanded Responder interface.
Codex-reported: 44 TS tests across 3 files, all passing. ruff format+check clean on API + SDK. tsc --noEmit clean.

How to QA

Prerequisites: local dev stack with the agent sidecar running. The frontend slice is not yet merged, so the round-trip cannot be exercised end-to-end. This PR can be verified at the unit-test level only.

Automated tests:

cd services/agent && npx vitest run tests/unit/responder.test.ts tests/unit/sandbox-agent-permissions.test.ts
cd api && uv run python -m pytest oss/tests/pytest/ -k "commit_revision" -v

What to verify: Run the TS tests and confirm the parkedCallKey rename did not break any assertion. Confirm the API router tests still pass after the _emit_data_event addition.

Edge cases: The _emit_data_event call in the tools router reads outputs as a dict and exits if variantId/revisionId/version are absent. Confirm a commit_revision call that returns a non-dict output does not raise.

https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

vercel · 2026-06-28T22:18:50Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 29, 2026 12:08pm

coderabbitai · 2026-06-28T22:18:51Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 03fe9b8e-5ee4-432b-abfb-5699fafb4d63

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.74% which is insufficient. The required threshold is 60.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title concisely captures the main backend change: client-tool round-trip support in the agent stack.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The description clearly matches the backend client-tool round-trip, workflow, SDK, and event-emission changes in the diff.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/client-tool-roundtrip-4920

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

mmabrouk · 2026-06-28T22:19:05Z

@ardaerzin — this is your counterpart for the FE round-trip design.

The backend is done (see PR description). Your frontend slice needs:

Message-part dispatcher — when the runner emits an interaction_request with kind: "client_tool", surface it to the frontend message stream so the browser can handle it.
Connect widget — render a UI element for render: {kind: "connect"} tool requests (the request_connection tool).
Resume predicate — isClientToolResult(part) to detect when the user has fulfilled a client tool call and the runner should resume.
Result UX — display the fulfilled tool output inline in the chat.
Config/build-kit refresh — listen for data-committed-revision events (backend now emits them on every successful commit_revision) to refresh the active variant/revision without a full reload.

The design file has the full detail: docs/design/agent-workflows/projects/agent-fe-roundtrip/design.md (section "Work split -> Arda (FE)").

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (1)

services/agent/tests/unit/sandbox-agent-permissions.test.ts (1)

36-38: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add direct coverage for the client-tool branch.

These stubs keep the expanded Responder interface satisfied, but the new attachPermissionResponder path should also assert a request with toolCall.spec.kind === "client" emits kind: "client_tool", parks without replying, and maps deny/output outcomes through clientToolReply().

Also applies to: 108-110, 147-149

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8737fc66-7334-4508-82bf-da22b81679d9

📥 Commits

Reviewing files that changed from the base of the PR and between ebc4ec1 and be5bd45.

📒 Files selected for processing (11)

api/oss/src/apis/fastapi/tools/router.py
api/oss/src/apis/fastapi/workflows/router.py
api/oss/src/core/workflows/static_catalog.py
sdks/python/agenta/sdk/agents/adapters/vercel/messages.py
sdks/python/agenta/sdk/agents/platform/workflow.py
services/agent/src/engines/sandbox_agent/permissions.ts
services/agent/src/responder.ts
services/agent/src/tools/dispatch.ts
services/agent/tests/unit/responder.test.ts
services/agent/tests/unit/sandbox-agent-orchestration.test.ts
services/agent/tests/unit/sandbox-agent-permissions.test.ts

coderabbitai · 2026-06-28T23:52:18Z

+async def _emit_data_event(
+    *,
+    request: Request,
+    name: str,
+    data: dict,
+) -> None:
+    emit = getattr(request.state, "emit", None) or getattr(
+        request.state, "emit_event", None
+    )
+    if not callable(emit):
+        return
+
+    result = emit({"type": "data", "name": name, "data": data})
+    if isawaitable(result):
+        await result


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Don’t let UI event delivery fail a successful tool call.

If the request-state emitter throws here, _call_workflow_tool() returns an error even though the workflow already ran and produced outputs. This helper should be best-effort for the same reason as the workflow router variant.

coderabbitai · 2026-06-28T23:52:18Z

+        await _emit_committed_revision_data_event(
+            request=request,
+            workflow_revision=workflow_revision,
+        )


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Emit the same event from the normal commit endpoint.

This only covers create_workflow_revision(). The regular commit path at Line 1500 still returns without a committed-revision event, so follow-up commits will not trigger the frontend revision refresh that this PR is adding.

Minimal follow-up

# In commit_workflow_revision(), after invalidate_cache(...) await _emit_committed_revision_data_event( request=request, workflow_revision=workflow_revision, )

coderabbitai · 2026-06-28T23:52:18Z

+async def _emit_data_event(
+    *,
+    request: Request,
+    name: str,
+    data: dict,
+) -> None:
+    emit = getattr(request.state, "emit", None) or getattr(
+        request.state, "emit_event", None
+    )
+    if not callable(emit):
+        return
+
+    result = emit({"type": "data", "name": name, "data": data})
+    if isawaitable(result):
+        await result


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Make data-event emission best-effort.

If request.state.emit fails here after the revision was already committed, the API turns a successful mutation into a 5xx and invites duplicate retries. Swallow/log emitter failures instead of letting them escape the request.

coderabbitai · 2026-06-28T23:52:18Z

    if (
        tool_name is None
        and ptype.startswith("tool-")
-        and ptype != TOOL_OUTPUT_AVAILABLE
+        and ptype not in TOOL_OUTPUT_AVAILABLE_TYPES
    ):
        tool_name = ptype[len("tool-") :]

    blocks: List[ContentBlock] = []
-    if ptype != TOOL_OUTPUT_AVAILABLE or "input" in part:
+    if ptype not in TOOL_OUTPUT_AVAILABLE_TYPES or "input" in part:


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don’t synthesize tool names for generic result parts.

Line 100 only exempts tool-output-available. tool-output-error and tool-output-denied still fall through the tool- prefix path, so parts emitted by sdks/python/agenta/sdk/agents/adapters/vercel/stream.py:137-157 without toolName get parsed as tool_name="output-error" / "output-denied" and also produce a bogus tool_call at Line 105. That breaks the documented neutral tool_result mapping for non-success tool results.

Suggested fix

-TOOL_OUTPUT_AVAILABLE_TYPES = {TOOL_OUTPUT_AVAILABLE, "TOOL_OUTPUT_AVAILABLE"} +TOOL_OUTPUT_AVAILABLE_TYPES = {TOOL_OUTPUT_AVAILABLE, "TOOL_OUTPUT_AVAILABLE"} +GENERIC_TOOL_RESULT_TYPES = { + *TOOL_OUTPUT_AVAILABLE_TYPES, + "tool-output-error", + "tool-output-denied", +} @@ - if ( - tool_name is None - and ptype.startswith("tool-") - and ptype not in TOOL_OUTPUT_AVAILABLE_TYPES - ): + if ( + tool_name is None + and ptype.startswith("tool-") + and ptype not in GENERIC_TOOL_RESULT_TYPES + ): tool_name = ptype[len("tool-") :] @@ - if ptype not in TOOL_OUTPUT_AVAILABLE_TYPES or "input" in part: + if ptype not in GENERIC_TOOL_RESULT_TYPES or "input" in part:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (

tool_name is None

and ptype.startswith("tool-")

and ptype != TOOL_OUTPUT_AVAILABLE

and ptype not in TOOL_OUTPUT_AVAILABLE_TYPES

):

tool_name = ptype[len("tool-") :]

blocks: List[ContentBlock] = []

if ptype != TOOL_OUTPUT_AVAILABLE or "input" in part:

if ptype not in TOOL_OUTPUT_AVAILABLE_TYPES or "input" in part:

if (

tool_name is None

and ptype.startswith("tool-")

and ptype not in GENERIC_TOOL_RESULT_TYPES

):

tool_name = ptype[len("tool-") :]

blocks: List[ContentBlock] = []

if ptype not in GENERIC_TOOL_RESULT_TYPES or "input" in part:

coderabbitai · 2026-06-28T23:52:18Z

+def _is_request_connection_workflow(tool_config: ReferenceToolConfig) -> bool:
+    workflow = getattr(tool_config, "workflow", None)
+    if getattr(workflow, "slug", None) == REQUEST_CONNECTION_WORKFLOW_SLUG:
+        return True
+
+    call_ref = tool_config.call_ref
+    return (
+        call_ref == f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}"
+        or call_ref.startswith(f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}.")
+    )


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Handle environment-scoped request-connection references too.

Line 116 is checking tool_config.workflow.slug, but ReferenceToolConfig exposes slug directly. As written, the special case only matches workflow.variant... call refs, so ref_by="environment" tools targeting __ag__request_connection resolve as ordinary callback tools and bypass the client-tool flow.

Suggested fix

def _is_request_connection_workflow(tool_config: ReferenceToolConfig) -> bool: - workflow = getattr(tool_config, "workflow", None) - if getattr(workflow, "slug", None) == REQUEST_CONNECTION_WORKFLOW_SLUG: + if tool_config.slug == REQUEST_CONNECTION_WORKFLOW_SLUG: return True call_ref = tool_config.call_ref return ( call_ref == f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}" or call_ref.startswith(f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}.") )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _is_request_connection_workflow(tool_config: ReferenceToolConfig) -> bool:

workflow = getattr(tool_config, "workflow", None)

if getattr(workflow, "slug", None) == REQUEST_CONNECTION_WORKFLOW_SLUG:

return True

call_ref = tool_config.call_ref

return (

call_ref == f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}"

or call_ref.startswith(f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}.")

)

def _is_request_connection_workflow(tool_config: ReferenceToolConfig) -> bool:

if tool_config.slug == REQUEST_CONNECTION_WORKFLOW_SLUG:

return True

call_ref = tool_config.call_ref

return (

call_ref == f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}"

or call_ref.startswith(f"workflow.variant.{REQUEST_CONNECTION_WORKFLOW_SLUG}.")

)

coderabbitai · 2026-06-28T23:52:18Z

+  private lookupClientTool(request: ClientToolRequest): { found: boolean; output?: unknown } {
+    for (const key of clientToolRequestKeys(request)) {
+      if (this.decisions.has(key)) {
+        const output = this.decisions.get(key);
+        if (!isPermissionDecision(output)) return { found: true, output };
+      }


🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

Disambiguate permission decisions from client-tool outputs.

ApprovalDecisions now stores arbitrary client-tool outputs, but parkedCallResultOf() coerces any { approved: boolean } output into "allow"/"deny", and lookupClientTool() then drops stored "allow"/"deny" values as permission decisions. A valid client-tool output like { approved: false } or "deny" will not resume correctly. Store tagged entries, e.g. permission vs client-tool result, instead of using raw sentinel values.

Also applies to: 345-367

mmabrouk · 2026-06-29T07:45:17Z

@coderabbitai review

coderabbitai · 2026-06-29T07:45:23Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

mmabrouk · 2026-06-29T07:45:41Z

🤖 The AI agent says:

What to review on this PR:

_emit_data_event silent-failure in the tools and workflows routers. The helper exits silently if request.state.emit is absent. Confirm emit is present on request.state in every code path that calls commit_revision (both the tools router and the workflows router), or document the expected silent-failure contract.
Park-and-resume gap. Until the frontend slice lands, a kind: "client" tool call will park and never resume. The agent will hang silently. The PR body calls this out, but reviewers should confirm there is a timeout or error path in the runner that eventually unparks (or errors) a stalled client-tool call rather than blocking the run indefinitely.
TOOL_OUTPUT_AVAILABLE_TYPES set (messages.py). The change from a string equality check to a set membership check is small but widens what qualifies as a tool-output message. Confirm no other message type name could accidentally match one of these two strings.
_is_request_connection_workflow in workflow.py. The function checks both workflow.slug attribute and two call_ref string patterns. Confirm the call_ref patterns are exhaustive and won't miss a variant-pinned reference.

@ardaerzin

RUNNER: add onClientTool to Responder; detect client specs in permissions.ts and emit interaction_request(kind=client_tool) with park; remove the throw in dispatch.ts (return pending signal instead); rename approvalKey -> parkedCallKey everywhere; generalize extractApprovalDecisions to handle both {approved} and structured output. SDK: widen tool-output-available parsing in messages.py; add __ag__request_connection static workflow resolving to ClientToolSpec (kind=client, render={kind:connect}) in platform/workflow.py. API: register __ag__request_connection in static_catalog; emit data-committed-revision event after successful commit_revision in both tools/router.py and workflows/router.py. All checks: ruff format+check (API+SDK), tsc --noEmit, vitest (44 tests pass). Frontend (message-part dispatcher, connect widget, resume predicate, result UX, config refresh) is NOT included — owned by @ardaerzin.

mmabrouk added the needs-review Agent updated; awaiting Mahmoud's review label Jun 28, 2026

vercel Bot deployed to Preview June 28, 2026 22:19 View deployment

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

mmabrouk force-pushed the feat/client-tool-roundtrip-4920 branch from be5bd45 to 76b89f4 Compare June 29, 2026 12:07

vercel Bot deployed to Preview June 29, 2026 12:08 View deployment

Uh oh!

Conversation

mmabrouk commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Context

Changes

Scope / risk

Tests

How to QA

Uh oh!

vercel Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

❌ Failed checks (1 warning)

Uh oh!

mmabrouk commented Jun 28, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

mmabrouk commented Jun 29, 2026

Uh oh!

coderabbitai Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mmabrouk commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mmabrouk commented Jun 28, 2026 •

edited

Loading

vercel Bot commented Jun 28, 2026 •

edited

Loading

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading