fix(agent): park HITL approvals + keep ACP alive + fail loud on code tools#4848

Merged
mmabrouk merged 2 commits into big-agents from feat/agent-hitl-park-codetool-failloud
Jun 25, 2026

Conversation

@mmabrouk

Member

Context

Human-in-the-loop tool approval was broken for Claude (F-024). The permission gate fired, but the playground never showed an Approve/Deny prompt, and when a human did eventually grant a parked approval, the resumed turn timed out.

Two root causes:

  1. The runner parked an ask permission gate by replying reject to the harness. For Claude that produced a failed tool call ("User refused permission") whose tool_result{isError} overwrote the approval prompt on the same tool-call id. The prompt the user needed to act on was clobbered by an error.
  2. A parked turn holds the ACP HTTP connection open while the human decides. The runner set no custom headersTimeout, so undici's ~5 minute default fired and killed the ACP stream with UND_ERR_HEADERS_TIMEOUT. The approve-then-resume turn then timed out too.

Changes

Park instead of reject (F-024). Add a third responder outcome, park. HITLResponder returns park on a human surface with no stored decision (it used to return deny), and attachPermissionResponder sends no respondPermission at all on park. The interaction_request stays the last word on the tool call. The turn ends with the tool pending, and the next turn's stored decision resolves it. This is runner-internal, so there is no wire change.

Before: park -> reply reject -> Claude emits a failed tool call -> tool_result{isError} overwrites the approval prompt.
After: park -> no reply -> the approval prompt stays on the tool call -> the next turn's decision resolves it.
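
A minimal sketch of the park branch, for illustration only. The `Session` interface, `handlePermission`, and the `allow` outcome name are hypothetical stand-ins (the real `attachPermissionResponder` and session types are not reproduced here), and emitting the prompt only on park is a simplification.

```typescript
// Hypothetical, simplified shapes; not the runner's actual types.
type ResponderOutcome = "allow" | "deny" | "park";

interface PermissionRequest {
  toolCallId: string;
  toolName: string;
}

interface Session {
  // Puts the Approve/Deny prompt on the tool call.
  emitInteractionRequest(request: PermissionRequest): void;
  // Replies to the harness. For Claude, a reject becomes a failed tool call.
  respondPermission(toolCallId: string, reply: "allow" | "reject"): void;
}

function handlePermission(
  session: Session,
  request: PermissionRequest,
  decide: (request: PermissionRequest) => ResponderOutcome,
): ResponderOutcome {
  const outcome = decide(request);
  if (outcome === "park") {
    // Park: show the prompt and send NO reply, so nothing overwrites it.
    session.emitInteractionRequest(request);
    return outcome;
  }
  // A stored decision exists (or there is no human surface): reply as before.
  session.respondPermission(request.toolCallId, outcome === "allow" ? "allow" : "reject");
  return outcome;
}
```

The point of the sketch is the early return: on park the harness never receives a reject, so no `tool_result{isError}` is produced for that tool-call id.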

Keep the ACP connection alive across a park. New acp-fetch.ts drives the ACP HTTP client through a long-timeout undici dispatcher (createAcpDispatcher / createAcpFetch), with headersTimeout / bodyTimeout disabled by default and overridable via SANDBOX_AGENT_ACP_*_TIMEOUT_MS. The local path passes createAcpFetch(); the Daytona cookie fetch layers on the same dispatcher. This is scoped to the ACP fetch, not the global dispatcher. undici is promoted to a direct dependency.
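
A rough sketch of how the timeout overrides could be resolved. `parseTimeoutMs` and `resolveAcpTimeouts` are hypothetical helper names, and the fallback-on-invalid-input behavior is an assumption; only the two environment variable names and the "disabled by default" behavior come from this PR. The environment is passed in as a plain object to keep the sketch self-contained.

```typescript
// Hypothetical helpers; acp-fetch.ts may parse these differently.
type Env = Record<string, string | undefined>;

interface AcpTimeouts {
  headersTimeout: number; // milliseconds; 0 means "disabled"
  bodyTimeout: number; // milliseconds; 0 means "disabled"
}

// Parse a millisecond value, falling back when the variable is unset or invalid.
function parseTimeoutMs(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value >= 0 ? Math.floor(value) : fallback;
}

// Both timeouts default to 0 (disabled) so a parked turn can wait on a human.
function resolveAcpTimeouts(env: Env): AcpTimeouts {
  return {
    headersTimeout: parseTimeoutMs(env["SANDBOX_AGENT_ACP_HEADERS_TIMEOUT_MS"], 0),
    bodyTimeout: parseTimeoutMs(env["SANDBOX_AGENT_ACP_BODY_TIMEOUT_MS"], 0),
  };
}
```

With undici, options like these would be passed to a dedicated `Agent`, and that agent handed to undici's `fetch` through its `dispatcher` option. In undici a value of 0 disables `headersTimeout` and `bodyTimeout`, and using a per-call dispatcher leaves the global dispatcher untouched.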

Fail loud on code tools (F-016). buildRunPlan now refuses a run carrying a kind: code tool up front with CODE_TOOL_UNSUPPORTED_MESSAGE (ok: false), the same way stdio MCP is gated. Code execution was removed for security (F-010); the per-call runCodeTool throw used to become a tool result the model could launder into a 200 "success". The per-call throw stays as a defense-in-depth backstop.
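
A simplified sketch of the up-front gate. `ToolSpec`, `PlanResult`, and `buildRunPlanSketch` are hypothetical shapes, and the message text is a placeholder; the real `buildRunPlan` does much more than this check.

```typescript
// Hypothetical, trimmed-down types for illustration.
interface ToolSpec {
  name: string;
  kind: string; // e.g. "code" or "callback"
}

type PlanResult =
  | { ok: true; tools: ToolSpec[] }
  | { ok: false; error: string };

// Placeholder wording; the actual constant's text is not shown in this PR description.
const CODE_TOOL_UNSUPPORTED_MESSAGE = "code tools are not supported";

function hasCodeTool(toolSpecs: ToolSpec[]): boolean {
  return toolSpecs.some((tool) => tool.kind === "code");
}

function buildRunPlanSketch(toolSpecs: ToolSpec[]): PlanResult {
  // Refuse the whole run before any setup work, instead of letting a
  // per-call throw turn into a tool result the model can paper over.
  if (hasCodeTool(toolSpecs)) {
    return { ok: false, error: CODE_TOOL_UNSUPPORTED_MESSAGE };
  }
  return { ok: true, tools: toolSpecs };
}
```

Failing at plan time means the caller sees `ok: false` for the run itself, so there is no model turn in which the error could be reported as a success.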

Hide the Permission policy field for Pi (Pi-1). Pi never gates tool use, so a permission policy is meaningless for it. AgentConfigControl hides the field for pi_core / pi_agenta. Only Claude honors it.
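
The visibility rule amounts to a small predicate. The sketch below uses hypothetical function names; only the harness identifiers `pi_core`, `pi_agenta`, and `claude` come from this PR.

```typescript
// Hypothetical helpers mirroring the rule described above.
function isPiHarness(harness: string): boolean {
  return harness === "pi_core" || harness === "pi_agenta";
}

// The Permission policy field is shown for every harness except the Pi ones.
function showsPermissionPolicy(harness: string): boolean {
  return !isPiHarness(harness);
}
```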

Scope / risk

Park is runner-internal, so there is no /run wire, protocol, SDK, or golden-fixture change for it. The ACP timeout change is scoped to the ACP fetch dispatcher, not the process-global undici dispatcher, so other HTTP traffic in the runner is unaffected. The code-tool gate mirrors the existing stdio-MCP gate; it only affects runs that carry a kind: code tool. The FE change only removes a control for Pi harnesses; Claude's config form is unchanged.

This PR is stacked on feat/agent-gateway-tool-mcp (the gateway-MCP PR). Review and merge that first.

Tests / notes

  • Runner park unit test plus the orchestration test updated to assert park / no-reply.
  • New permissions.ts park regression and new run-plan code-tool gate tests.
  • New sandbox-agent-acp-fetch test for the long-timeout dispatcher.
  • SDK egress test locking "park does not clobber".
  • 221 vitest + typecheck in services/agent, 348 SDK agent tests green. @agenta/entity-ui typecheck + eslint clean for the FE change.

How to QA

Prerequisites: the EE dev stack with the sandbox-agent sidecar, a project with an Anthropic key, and an agent config with harness: claude and a tool whose permission policy is set to ask/prompt.

Steps:

  1. In the playground, start a Claude agent chat that triggers the gated tool.
  2. Wait for the approval prompt to appear in the conversation.
  3. Leave it pending for more than a few minutes, then click Approve.

Expected result: the Approve/Deny prompt renders on the tool call (it is not replaced by a "User refused permission" error). After you approve, even minutes later, the run resumes and completes instead of timing out.

Automated tests:

cd services/agent && pnpm test -- --run responder sandbox-agent-permissions sandbox-agent-run-plan sandbox-agent-orchestration sandbox-agent-acp-fetch

Edge cases: open the agent config for a Pi harness (pi_core / pi_agenta) and confirm the Permission policy field is hidden, while Claude still shows it. Also confirm a run carrying a kind: code tool now fails up front with a clear message rather than reporting a fake success.

Refs #4845 (plan).

mmabrouk added 2 commits June 25, 2026 20:27
FIX 1 (HITL park, F-024): the runner parked an `ask` permission gate by replying
`reject` to the harness, which for Claude produced a failed tool call ("User
refused permission") whose tool_result{isError} overwrote the approval prompt on
the same tool-call id. Add a third responder outcome `park`: HITLResponder returns
it on a human surface with no stored decision (was `deny`), and
attachPermissionResponder sends NO respondPermission on park. The interaction_request
stays the last word on the tool call; the turn ends with the tool pending and the
next turn's stored decision resolves it. No wire change (park is runner-internal).
Pi-1: hide the Permission policy field for Pi in AgentConfigControl (Pi never gates).

FIX 2 (code-tool fail loud, F-016): buildRunPlan now refuses a run carrying a
`kind: code` tool up front with CODE_TOOL_UNSUPPORTED_MESSAGE (ok:false), the way
stdio MCP is gated. Code execution was removed for security (F-010); the per-call
runCodeTool throw became a tool result the model laundered into a 200 "success".
The per-call throw stays as a defense-in-depth backstop.

Tests: runner park unit + orchestration updated to assert park/no-reply; new
permissions.ts park regression; new run-plan code-tool gate tests; SDK egress test
locking "park does not clobber". 221 vitest + typecheck, 348 SDK agent tests green.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

A parked HITL turn holds the acp-http-client connection open while a human
approves a tool, but the runner set no custom headersTimeout, so undici's
default (~5 min) fired and killed the ACP stream with UND_ERR_HEADERS_TIMEOUT;
the approve->resume turn then timed out too.

Drive the ACP HTTP client through a long-timeout undici dispatcher (new
acp-fetch.ts: createAcpDispatcher/createAcpFetch, headersTimeout/bodyTimeout
disabled by default, overridable via SANDBOX_AGENT_ACP_*_TIMEOUT_MS). The local
path now passes createAcpFetch(); the Daytona cookie fetch layers on the same
dispatcher. Scoped to the ACP fetch, not the global dispatcher. undici promoted
to a direct dependency.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc
@vercel

vercel Bot commented Jun 25, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jun 25, 2026 6:33pm |

@dosubot dosubot Bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Jun 25, 2026
@coderabbitai

coderabbitai Bot commented Jun 25, 2026


Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added support for pausing permission flows when human approval is needed, avoiding premature deny/reject responses.
    • Improved long-running approval interactions so they stay connected instead of timing out.
  • Bug Fixes

    • Prevented tool approval requests from being overwritten by error/denied messages.
    • Rejected unsupported code-tool runs up front with a clear failure state.
  • UI

    • Hid the permission policy setting for Pi-based harness configurations where it doesn’t apply.

Walkthrough

The agent responder now parks permission requests instead of replying immediately in human-surface cases. ACP fetch handling uses timeout-aware undici dispatchers, code-tool runs are rejected in run planning, and the Pi harness hides the permission-policy control.

Changes

Agent permission and tool flow

| Layer / File(s) | Summary |
| --- | --- |
| Parked permission outcome: services/agent/src/responder.ts, services/agent/tests/unit/responder.test.ts | ResponderOutcome adds park, HITLResponder returns it for human-surface requests without stored decisions, and the responder unit test asserts the new outcome. |
| Permission request wiring: services/agent/src/engines/sandbox_agent/permissions.ts, services/agent/tests/unit/sandbox-agent-permissions.test.ts, services/agent/tests/unit/sandbox-agent-orchestration.test.ts, sdks/python/oss/tests/pytest/unit/agents/test_ui_messages.py | attachPermissionResponder skips harness replies for park, and the permission/orchestration/message-stream tests assert emitted interaction requests with no permission reply or tool-output error chunks. |
| ACP fetch dispatcher: services/agent/package.json, services/agent/src/engines/sandbox_agent/acp-fetch.ts, services/agent/tests/unit/sandbox-agent-acp-fetch.test.ts | ACP fetch helpers parse timeout environment variables, build an undici dispatcher and fetch wrapper, add the undici dependency, and the ACP fetch tests cover timeout defaults and overrides. |
| Sandbox fetch wiring: services/agent/src/engines/sandbox_agent.ts, services/agent/src/engines/sandbox_agent/daytona.ts, services/agent/tests/unit/sandbox-agent-daytona.test.ts | SandboxAgent and Daytona now inject ACP-aware fetch implementations, and the Daytona fetch test passes a custom inner fetch into the cookie wrapper. |
| Code-tool rejection: services/agent/src/engines/sandbox_agent/run-plan.ts, services/agent/src/tools/code.ts, services/agent/tests/unit/sandbox-agent-run-plan.test.ts | buildRunPlan rejects resolved code tools, the code-tool module comment describes that gate, and run-plan tests cover rejection and callback-tool acceptance. |
| Pi harness permission policy: web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx | AgentConfigControl derives a Pi-harness flag and uses it to hide the permission-policy control for Pi harnesses. |

Sequence Diagram(s)

Parked permission flow

sequenceDiagram
  participant HITLResponder
  participant attachPermissionResponder
  participant session
  attachPermissionResponder->>HITLResponder: onPermission(request)
  HITLResponder-->>attachPermissionResponder: "park"
  attachPermissionResponder->>session: emit interaction_request
  Note over attachPermissionResponder,session: session.respondPermission() is skipped

ACP fetch wiring

sequenceDiagram
  participant SandboxAgent
  participant createCookieFetch
  participant createAcpFetch
  participant undici.fetch
  SandboxAgent->>createCookieFetch: Daytona start()
  createCookieFetch->>createAcpFetch: default inner fetch()
  createCookieFetch->>undici.fetch: request with cookies
  SandboxAgent->>createAcpFetch: non-Daytona start()
  createAcpFetch->>undici.fetch: fetch(..., dispatcher)
  Note over createAcpFetch,undici.fetch: dispatcher uses ACP timeout settings

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Agenta-AI/agenta#4778: Also updates the responder and permission-handling seam with park-related behavior.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 45.45% which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main changes: HITL parking, ACP keep-alive, and code-tool rejection. |
| Description check | ✅ Passed | The description is directly related to the patch and accurately describes the runner, ACP, code-tool, and UI changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@dosubot dosubot Bot added the Backend label Jun 25, 2026
@mmabrouk

Member Author

@coderabbitai review

Review focus: this fixes the broken Claude HITL approval flow (F-024) plus two related runner hardening changes. Please confirm:

  1. Park correctness: the new park responder outcome sends NO respondPermission, so the interaction_request stays the last word on the tool call and is not overwritten by a tool_result{isError}. Check the next-turn resume path resolves the parked tool from the stored decision.
  2. The ACP keep-alive (acp-fetch.ts long-timeout undici dispatcher) is scoped to the ACP fetch only, not the process-global dispatcher, and the env overrides (SANDBOX_AGENT_ACP_*_TIMEOUT_MS) are wired on both the local and Daytona paths.
  3. The code-tool fail-loud gate (CODE_TOOL_UNSUPPORTED_MESSAGE in buildRunPlan) mirrors the stdio-MCP gate and the per-call throw remains as a backstop.
  4. The FE change (hide Permission policy for pi_core/pi_agenta) does not affect Claude's config form.

Stacked on #4847; its diff is isolated to these 17 files.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== PR metadata =="
gh pr view 4848 --json number,title,baseRefName,headRefName,files --jq '{number,title,baseRefName,headRefName,files:[.files[].path]}'

echo
echo "== Changed files =="
gh pr view 4848 --json files --jq '.files[].path'

echo
echo "== Search for relevant symbols =="
rg -n -C2 --glob '!**/node_modules/**' \
  -e '\bpark\b' \
  -e 'respondPermission' \
  -e 'interaction_request' \
  -e 'createAcpDispatcher' \
  -e 'createAcpFetch' \
  -e 'SANDBOX_AGENT_ACP_[A-Z_]+_TIMEOUT_MS' \
  -e 'CODE_TOOL_UNSUPPORTED_MESSAGE' \
  -e '\bbuildRunPlan\b' \
  -e '\brunCodeTool\b' \
  -e '\bAgentConfigControl\b' \
  -e 'pi_core' \
  -e 'pi_agenta' .

echo
echo "== Candidate files =="
fd -i 'acp-fetch\.ts$' .
fd -i 'permissions\.ts$' .
fd -i 'run-plan.*\.(ts|tsx)$' .
fd -i 'AgentConfigControl\.(ts|tsx)$' .
fd -i 'responder.*\.(ts|tsx)$' .
fd -i 'orchestration.*\.(ts|tsx)$' .

for f in \
  $(fd -i 'acp-fetch\.ts$' .) \
  $(fd -i 'permissions\.ts$' .) \
  $(fd -i 'run-plan.*\.(ts|tsx)$' .) \
  $(fd -i 'AgentConfigControl\.(ts|tsx)$' .) \
  $(fd -i 'responder.*\.(ts|tsx)$' .) \
  $(fd -i 'orchestration.*\.(ts|tsx)$' .)
do
  echo
  echo "== Outline: $f =="
  ast-grep outline "$f" --view expanded || true
done

echo
echo "== Targeted source excerpts =="
python - <<'PY'
from pathlib import Path
targets = {
    "services/agent": [
        "acp-fetch.ts",
        "permissions.ts",
        "run-plan.ts",
        "responder.ts",
        "orchestration.ts",
    ],
    "frontend": [
        "AgentConfigControl.tsx",
        "AgentConfigControl.ts",
    ],
}
for base, names in targets.items():
    for name in names:
        for p in Path(".").rglob(name):
            if base in str(p):
                print(f"\n===== {p} =====")
                try:
                    lines = p.read_text().splitlines()
                except Exception as e:
                    print(f"[read error] {e}")
                    continue
                for i, line in enumerate(lines, start=1):
                    if any(tok in line for tok in [
                        "park", "respondPermission", "interaction_request",
                        "createAcpDispatcher", "createAcpFetch",
                        "SANDBOX_AGENT_ACP_", "CODE_TOOL_UNSUPPORTED_MESSAGE",
                        "buildRunPlan", "runCodeTool", "AgentConfigControl",
                        "pi_core", "pi_agenta"
                    ]):
                        start = max(1, i-12)
                        end = min(len(lines), i+20)
                        print(f"\n--- lines {start}-{end} ---")
                        for n in range(start, end+1):
                            print(f"{n:4}: {lines[n-1]}")
PY

echo
echo "== Tests mentioning park / ACP / code-tool gate / Pi control =="
rg -n -C2 --glob '!**/node_modules/**' \
  -e 'park does not clobber' \
  -e 'headersTimeout' \
  -e 'bodyTimeout' \
  -e 'CODE_TOOL_UNSUPPORTED_MESSAGE' \
  -e 'pi_core' \
  -e 'pi_agenta' \
  -e 'respondPermission' \
  -e '\bpark\b' services packages apps . 2>/dev/null || true

Length of output: 241621


🏁 Script executed:

#!/bin/bash
set -euo pipefail

files=(
  "services/agent/src/engines/sandbox_agent/acp-fetch.ts"
  "services/agent/src/engines/sandbox_agent.ts"
  "services/agent/src/engines/sandbox_agent/daytona.ts"
  "services/agent/src/engines/sandbox_agent/permissions.ts"
  "services/agent/src/engines/sandbox_agent/run-plan.ts"
  "services/agent/src/tools/code.ts"
  "services/agent/src/responder.ts"
  "services/agent/tests/unit/sandbox-agent-orchestration.test.ts"
  "services/agent/tests/unit/sandbox-agent-run-plan.test.ts"
  "services/agent/tests/unit/responder.test.ts"
  "web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx"
)

for f in "${files[@]}"; do
  if [[ -f "$f" ]]; then
    echo
    echo "===== $f ====="
    nl -ba "$f" | sed -n '1,260p'
  else
    echo "MISSING: $f"
  fi
done

Length of output: 1260


🏁 Script executed:

#!/bin/bash
set -euo pipefail

show_matches() {
  local file="$1"
  shift
  echo
  echo "===== $file ====="
  for pat in "$@"; do
    echo "--- pattern: $pat ---"
    awk -v pat="$pat" '
      index($0, pat) {
        start = NR - 12; if (start < 1) start = 1
        end = NR + 20
        hits[++n] = start ":" end
      }
      { lines[NR] = $0; max = NR }
      END {
        if (n == 0) {
          print "[no matches]"
          exit
        }
        lastEnd = 0
        for (i = 1; i <= n; i++) {
          split(hits[i], a, ":")
          s = a[1]; e = a[2]
          if (s <= lastEnd) s = lastEnd + 1
          if (e > max) e = max
          if (s > e) continue
          printf("lines %d-%d\n", s, e)
          for (j = s; j <= e; j++) printf("%4d: %s\n", j, lines[j])
          lastEnd = e
        }
      }
    ' "$file"
  done
}

show_matches "services/agent/src/engines/sandbox_agent/acp-fetch.ts" \
  "createAcpDispatcher" \
  "createAcpFetch" \
  "SANDBOX_AGENT_ACP_HEADERS_TIMEOUT_MS" \
  "SANDBOX_AGENT_ACP_BODY_TIMEOUT_MS" \
  "setGlobalDispatcher"

show_matches "services/agent/src/engines/sandbox_agent.ts" \
  "createAcpFetch()" \
  "createCookieFetch()" \
  "fetch:" \
  "headersTimeout"

show_matches "services/agent/src/engines/sandbox_agent/daytona.ts" \
  "createCookieFetch" \
  "createAcpFetch()" \
  "fetch(input" \
  "cookie"

show_matches "services/agent/src/engines/sandbox_agent/permissions.ts" \
  "decision === \"park\"" \
  "respondPermission" \
  "interaction_request"

show_matches "services/agent/src/responder.ts" \
  "ResponderOutcome" \
  "if (this.hasHumanSurface) return \"park\"" \
  "private lookup(request" \
  "toolCallId" \
  "toolName"

show_matches "services/agent/tests/unit/responder.test.ts" \
  "parks when there is a human surface and no stored decision" \
  "stored decision wins over the park" \
  "tool name"

show_matches "services/agent/tests/unit/sandbox-agent-orchestration.test.ts" \
  "runSandboxAgent default HITL responder wiring" \
  "parks unapproved" \
  "stored decision" \
  "permissionReplies" \
  "interaction_request"

show_matches "services/agent/src/engines/sandbox_agent/run-plan.ts" \
  "CODE_TOOL_UNSUPPORTED_MESSAGE" \
  "USER_MCP_UNSUPPORTED_MESSAGE" \
  "hasCodeTool(toolSpecs)" \
  "return { ok: false, error: CODE_TOOL_UNSUPPORTED_MESSAGE }"

show_matches "services/agent/src/tools/code.ts" \
  "CODE_TOOL_UNSUPPORTED_MESSAGE" \
  "throw new Error"

show_matches "services/agent/tests/unit/sandbox-agent-run-plan.test.ts" \
  "CODE_TOOL_UNSUPPORTED_MESSAGE" \
  "kind: \"code\"" \
  "ok: false"

show_matches "web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx" \
  "isPiHarness" \
  "permission_policy" \
  "Permission policy" \
  "claude"

Length of output: 86470



@mmabrouk I reviewed the four focus areas and don’t see any blocking issues in the scoped changes.

  1. Park correctness looks right

    • services/agent/src/responder.ts, Line 44 defines park as a real responder outcome, and HITLResponder returns a stored decision first, otherwise park when a human surface exists (Lines 108-112).
    • The resume path is wired through stored approvals keyed by both toolCallId and toolName, which handles cold-replay id churn (services/agent/src/responder.ts, Lines 124-133, Lines 148-159).
    • services/agent/src/engines/sandbox_agent/permissions.ts emits the interaction_request first, then explicitly does no respondPermission on park (Lines 24-49), so the approval prompt remains the last word on the tool call.
    • Tests cover both sides of this:
      • park/no-reply: services/agent/tests/unit/sandbox-agent-orchestration.test.ts, Lines 639-668
      • resume from stored approval: services/agent/tests/unit/sandbox-agent-orchestration.test.ts, Lines 671-705
  2. ACP keep-alive is scoped correctly

    • services/agent/src/engines/sandbox_agent/acp-fetch.ts builds a dedicated undici Agent with SANDBOX_AGENT_ACP_HEADERS_TIMEOUT_MS and SANDBOX_AGENT_ACP_BODY_TIMEOUT_MS (Lines 32-40) and wraps undici fetch with that dispatcher (Lines 48-50).
    • I did not see any global dispatcher mutation (setGlobalDispatcher had no matches), so this remains ACP-fetch scoped.
    • The local/Daytona wiring is consistent:
      • local path uses createAcpFetch(): services/agent/src/engines/sandbox_agent.ts, Lines 245-247
      • Daytona path uses createCookieFetch(): same lines
      • createCookieFetch() itself layers on createAcpFetch() by default: services/agent/src/engines/sandbox_agent/daytona.ts, Lines 161-177
    • The unit test coverage you called out is also present from the earlier grep: defaults + env overrides for headersTimeout / bodyTimeout.
  3. Code-tool fail-loud gate matches the stdio-MCP pattern

    • services/agent/src/engines/sandbox_agent/run-plan.ts rejects runs carrying kind: code tools up front with CODE_TOOL_UNSUPPORTED_MESSAGE (Lines 205-213).
    • That gate sits immediately next to, and structurally mirrors, the stdio MCP gate (Lines 215-220).
    • The defense-in-depth backstop remains in place: services/agent/src/tools/code.ts still throws CODE_TOOL_UNSUPPORTED_MESSAGE in runCodeTool() (Lines 17-18, Lines 24-31).
    • There’s also a run-plan test asserting kind: "code" now yields ok: false before further setup work: services/agent/tests/unit/sandbox-agent-run-plan.test.ts, Lines 272-302.
  4. FE change does not affect Claude’s config form

    • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentConfigControl.tsx computes isPiHarness only for pi_core / pi_agenta (Lines 130-133).
    • The Permission policy field is hidden only when isPiHarness is true (Lines 684-694), so Claude still gets the control.
    • Claude-specific advanced permissions are still explicitly gated on harness === "claude" (Lines 709-727).

Overall, the implementation matches the stated intent for F-024, the ACP timeout fix stays narrowly scoped, the F-016 code-tool gate is fail-loud with the old throw preserved as a backstop, and the FE change is isolated to Pi.
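
For readers following the resume path in point 1, here is an illustrative sketch of a stored-decision lookup keyed by both toolCallId and toolName. All names and shapes are hypothetical rather than copied from responder.ts, and the `deny` fallback when there is no human surface is an assumption.

```typescript
// Hypothetical shapes; not the actual HITLResponder implementation.
type Decision = "allow" | "deny";

interface StoredDecision {
  toolCallId: string;
  toolName: string;
  decision: Decision;
}

interface ToolPermissionRequest {
  toolCallId: string;
  toolName: string;
}

// Prefer an exact tool-call id match, then fall back to the tool name so a
// replayed turn with a fresh id can still pick up the human's decision.
function lookupDecision(
  stored: StoredDecision[],
  request: ToolPermissionRequest,
): Decision | undefined {
  const byId = stored.find((d) => d.toolCallId === request.toolCallId);
  if (byId) return byId.decision;
  return stored.find((d) => d.toolName === request.toolName)?.decision;
}

// A stored decision wins; otherwise park when a human can answer.
function respond(
  stored: StoredDecision[],
  request: ToolPermissionRequest,
  hasHumanSurface: boolean,
): Decision | "park" {
  const decision = lookupDecision(stored, request);
  if (decision !== undefined) return decision;
  return hasHumanSurface ? "park" : "deny"; // "deny" fallback is assumed
}
```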

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@mmabrouk mmabrouk changed the base branch from feat/agent-gateway-tool-mcp to big-agents June 25, 2026 18:39
@mmabrouk mmabrouk merged commit 07f44ac into big-agents Jun 25, 2026
33 of 34 checks passed
@github-actions

Contributor

Railway Preview Environment

Image tag pr-4848-f017dfb
Status Failed
Railway logs Open logs
Logs View workflow run
Updated at 2026-06-25T18:40:08.508Z

Labels

Backend size:L This PR changes 100-499 lines, ignoring generated files.

1 participant