
feat(agent): playground build kit (default agent config) #4926

Draft
mmabrouk wants to merge 1 commit into big-agents from feat/playground-build-kit-4917


@mmabrouk


Summary

Implements the playground build kit from the #4917 design: an agent-template overlay that the backend serves read-only on the inspect response, that the frontend applies only to kit-on playground runs, and that never reaches a commit or a deployed run. The published default agent goes bare; the platform tools, authoring skill, and elevated sandbox now ride the overlay instead of the stored config.

This covers the full "Change set, by layer" from the design except Change 1 (collapsible advanced-drawer sections), which ships separately.

Implements the #4917 design; DRAFT for review tomorrow, do not merge.

What changed (file-by-file, from the Codex implementation summary)

Backend

  • api/oss/src/apis/fastapi/applications/models.py — added additional_context.playground_build_kit.agent_template_overlay to SimpleApplicationResponse.
  • api/oss/src/apis/fastapi/applications/router.py — populates the overlay only in fetch_simple_application (read path), never in create/edit/commit.
  • api/oss/src/apis/fastapi/applications/overlay.py (new) — deterministic overlay builder: tools from PLATFORM_OPS ({type: platform, op}) plus reserved-slug static workflows as @ag.embed references; skills as the @ag.embed of the authoring slug; sandbox.permissions build elevation.
  • services/oss/src/agent/schemas.py — reverted the enrichment so build_agent_v0_default() is bare (no authoring skill, no sandbox elevation).

Frontend

  • web/packages/agenta-entities/src/workflow/api/api.ts, state/store.ts — session atoms for the overlay and buildKitEnabled (default on).
  • web/packages/agenta-playground/src/state/execution/agentRequest.ts: applyBuildKitOverlay (deep-merge object fields, identity-merge list fields), called in buildAgentRequest only when the kit is on, on a throwaway run copy.
  • web/packages/agenta-entities/src/workflow/state/commit.ts — comment confirming commit reads only persisted entity parameters (overlay excluded for free).
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentTemplateControl.tsx — read-only "Playground build kit" section with enable/disable toggle, "Removed on commit" tag, locked overlay rendering, and a sandbox override hint on overridden user controls.
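
For reviewers who want the merge semantics in front of them, here is a minimal sketch of what "deep-merge object fields, identity-merge list fields" could look like. It is not the code in agentRequest.ts: the name applyOverlaySketch, the Json types, the sample op names, and the rule that an overlay list entry replaces a base entry with the same identity are all assumptions made for illustration. The identity keys (op, slug, name) come from the review request in this PR.

```typescript
// Hypothetical JSON-like shapes; the real AgentTemplate type is not shown in this PR.
type Json = null | boolean | number | string | Json[] | {[key: string]: Json}
type JsonObject = {[key: string]: Json}

const isObject = (v: Json | undefined): v is JsonObject =>
    typeof v === "object" && v !== null && !Array.isArray(v)

// Plain JSON round-trip copy, so the result never shares references with its inputs.
const clone = <T extends Json>(v: T): T => JSON.parse(JSON.stringify(v)) as T

// Identity of a list entry: the first string-valued key among op, slug, name.
// Entries without any of those keys fall back to their serialized form (assumption).
function identityOf(item: Json): string {
    if (isObject(item)) {
        for (const key of ["op", "slug", "name"]) {
            const id = item[key]
            if (typeof id === "string") return `${key}:${id}`
        }
    }
    return `json:${JSON.stringify(item)}`
}

// Identity-merge: an overlay entry replaces the base entry with the same identity,
// otherwise it is appended. Which side wins on a collision is an assumption here.
function mergeLists(base: Json[], overlay: Json[]): Json[] {
    const merged = base.map((item) => clone(item))
    const indexById: {[id: string]: number | undefined} = {}
    merged.forEach((item, i) => {
        indexById[identityOf(item)] = i
    })
    for (const item of overlay) {
        const id = identityOf(item)
        const at = indexById[id]
        if (at === undefined) {
            indexById[id] = merged.length
            merged.push(clone(item))
        } else {
            merged[at] = clone(item)
        }
    }
    return merged
}

function mergeValue(base: Json | undefined, overlay: Json): Json {
    if (Array.isArray(base) && Array.isArray(overlay)) return mergeLists(base, overlay)
    if (isObject(base) && isObject(overlay)) return mergeObjects(base, overlay)
    return clone(overlay) // scalars and mismatched shapes: overlay wins
}

// Deep-merge for object fields; recurses through mergeValue per key.
function mergeObjects(base: JsonObject, overlay: JsonObject): JsonObject {
    const result = clone(base)
    for (const key of Object.keys(overlay)) {
        const next = overlay[key]
        if (next === undefined) continue
        result[key] = mergeValue(base[key], next)
    }
    return result
}

// Returns a fresh run copy; with the kit off it is just a copy of the draft.
function applyOverlaySketch(draft: JsonObject, overlay: JsonObject | null, kitOn: boolean): JsonObject {
    if (!kitOn || !overlay) return clone(draft)
    return mergeObjects(draft, overlay)
}

// Hypothetical draft and overlay; the op names are made up for illustration.
const draft: JsonObject = {
    tools: [{type: "platform", op: "search"}],
    sandbox: {permissions: {network: false}},
}
const overlay: JsonObject = {
    tools: [{type: "platform", op: "search"}, {type: "platform", op: "deploy"}],
    sandbox: {permissions: {network: true, write_files: true}},
}
const kitOn = applyOverlaySketch(draft, overlay, true)
const kitOff = applyOverlaySketch(draft, overlay, false)
console.log(JSON.stringify(kitOn))
console.log(JSON.stringify(kitOff))
```

In this sketch the kit-on copy ends up with both tools and the elevated permissions, the kit-off copy equals the draft, and the draft object itself is never written to. Those are the same properties the agentRequest.test.ts bullet under Tests claims for the real applier.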

Tests

  • api/oss/tests/pytest/unit/applications/test_build_kit_overlay.py (new) — overlay builder shape; inspect response carries additional_context.playground_build_kit.
  • services/oss/tests/pytest/unit/agent/test_default_agent_template.py — published default is bare across builtin, inspect schema, catalog.
  • web/packages/agenta-playground/tests/unit/agentRequest.test.ts — kit-on merges overlay (deep + identity merge), kit-off sends bare config, commit excludes the kit, applier never mutates the input.

Verification (reported by Codex)

  • ruff format + ruff check --fix: passed.
  • cd web && pnpm lint-fix: passed.
  • Frontend typechecks (@agenta/entities, @agenta/playground, @agenta/entity-ui): passed.
  • API focused test: 3 passed. Services default-agent test: 5 passed. Playground vitest suite: 148 passed.
  • git diff --check: passed.

Notes for the reviewer

  • Implemented by Codex (gpt-5.5, xhigh) under a lean-driver orchestration; results above are Codex's own and have not been independently re-run in this branch's CI yet.
  • Change 1 (collapsible drawer sections) is intentionally out of scope here.

https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

@vercel

vercel Bot commented Jun 28, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jun 28, 2026 10:23pm |


@coderabbitai

coderabbitai Bot commented Jun 28, 2026


Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: dcd060eb-8bb8-4a41-971d-d974deeda5fe

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@mmabrouk added the needs-review label ("Agent updated; awaiting Mahmoud's review") on Jun 28, 2026
@mmabrouk


@Agenta-AI please review tomorrow — DRAFT, do not merge.

Specific feedback wanted:

  1. Inspect contract placement — is additional_context.playground_build_kit.agent_template_overlay on SimpleApplicationResponse the right home (vs. application.data / application.meta)? This is the load-bearing decision in the design.
  2. Overlay builder (api/oss/src/apis/fastapi/applications/overlay.py) — does iterating PLATFORM_OPS + reserved-slug static workflows, and embedding the authoring skill via @ag.embed, match the intended sources? Confirm the embed shapes match _ToolEmbedRefSchema / _SkillEmbedRefSchema.
  3. Frontend applier (applyBuildKitOverlay in agentRequest.ts) — verify the deep-merge (objects) / identity-merge (lists by op/slug/name) is correct and that it only touches the throwaway run copy, never the draft or commit tree.
  4. Bare published default — reverting the schemas.py enrichment touches the skills project's surface; confirm this is acceptable (design open question #1).

Implemented by Codex (gpt-5.5, xhigh). Change 1 (collapsible drawer sections) intentionally excluded.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (3)
services/oss/tests/pytest/unit/agent/test_default_agent_template.py (1)

59-71: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Mirror the sandbox-flag assertions for the builtin default.

The test name says every published default, but execute_code and write_files are only checked on inspect_default. A regression on the SDK builtin path would still pass this suite.

Proposed test tightening
     assert builtin_default["tools"] == []
     assert "permissions" not in builtin_default["sandbox"]
+    assert "execute_code" not in builtin_default["sandbox"]
+    assert "write_files" not in builtin_default["sandbox"]
     assert "skills" not in builtin_default
web/packages/agenta-entities/src/workflow/state/store.ts (1)

1106-1114: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick win

Validate the overlay with a schema at the inspect boundary.

This only proves the top level is an object, so drift inside sandbox, tools, or skills will flow straight into the merge/UI path as AgentTemplate. Please run agent_template_overlay through a local Zod schema and safeParseWithLogging before exposing it from this atom family. Based on coding guidelines, "Keep Zod validation at the API boundary even when using Fern-generated types, because local schemas still detect backend drift" and "Use safeParseWithLogging from @agenta/entities/shared for boundary validation so structured errors are logged without crashing."

Source: Coding guidelines

web/packages/agenta-playground/tests/unit/agentRequest.test.ts (1)

285-305: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Assert the real commit source stays bare.

Line 302 checks workflowMolecule.selectors.configuration("e"), but prepareCommitParameters reads entity.data?.parameters. This test can still pass if build-kit fields start leaking into the actual commit payload later. Seed over.data with data.parameters and assert that object remains unchanged instead.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 38316881-acc2-4f09-88d6-97cbd6271674

📥 Commits

Reviewing files that changed from the base of the PR and between ebc4ec1 and fd5b777.

📒 Files selected for processing (20)
  • api/oss/src/apis/fastapi/applications/models.py
  • api/oss/src/apis/fastapi/applications/overlay.py
  • api/oss/src/apis/fastapi/applications/router.py
  • api/oss/tests/pytest/unit/applications/test_build_kit_overlay.py
  • docs/design/agent-workflows/projects/default-agent-config/README.md
  • docs/design/agent-workflows/projects/default-agent-config/design.md
  • docs/design/agent-workflows/projects/default-agent-config/research.md
  • docs/design/agent-workflows/projects/default-agent-config/status.md
  • services/oss/src/agent/schemas.py
  • services/oss/tests/pytest/unit/agent/test_default_agent_template.py
  • web/packages/agenta-entities/src/workflow/api/api.ts
  • web/packages/agenta-entities/src/workflow/index.ts
  • web/packages/agenta-entities/src/workflow/state/commit.ts
  • web/packages/agenta-entities/src/workflow/state/index.ts
  • web/packages/agenta-entities/src/workflow/state/store.ts
  • web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentTemplateControl.tsx
  • web/packages/agenta-playground/src/state/execution/agentRequest.ts
  • web/packages/agenta-playground/src/state/execution/index.ts
  • web/packages/agenta-playground/src/state/index.ts
  • web/packages/agenta-playground/tests/unit/agentRequest.test.ts

Comment on lines +605 to +608
agent_template_overlay: Optional[dict] = Field(
    default=None,
    description="Partial `parameters.agent` overlay applied by the playground only.",
)


🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Model agent_template_overlay explicitly instead of using dict.

This is now a backend/frontend contract, but Optional[dict] leaves the OpenAPI schema and runtime validation opaque. Please promote the overlay shape into concrete Pydantic models (or typed submodels for tools, skills, and sandbox) so downstream clients get a stable contract. As per coding guidelines, "Define explicit request and response models in models.py."

Source: Coding guidelines

Comment on lines +25 to +28
        revision = catalog.retrieve_revision(slug=slug)
        if revision and revision.flags and revision.flags.is_skill:
            continue
        slugs.append(slug)


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Tighten the reserved-workflow filter.

This currently appends reserved slugs even when retrieve_revision() returns None or revision.flags is missing. The expected overlay only includes confirmed non-skill static workflows, so this can leak invalid tool embeds into the playground.

Suggested fix
         revision = catalog.retrieve_revision(slug=slug)
-        if revision and revision.flags and revision.flags.is_skill:
+        if not revision or not revision.flags or revision.flags.is_skill:
             continue
         slugs.append(slug)

Comment on lines +1911 to +1917
additional_context=SimpleApplicationAdditionalContext(
    playground_build_kit=PlaygroundBuildKitContext(
        agent_template_overlay=build_agent_template_overlay(),
    ),
)
if simple_application
else None,


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Don’t let build-kit synthesis blank the whole fetch response.

fetch_simple_application() is wrapped in @suppress_exceptions(default=SimpleApplicationResponse()). If build_agent_template_overlay() raises, this path now returns an empty 200 response instead of the fetched application. Build the overlay behind a local try/except and fall back to additional_context=None so the inspect path still works. As per path instructions, use @suppress_exceptions(...) only for controlled defaults.

Source: Path instructions

Comment on lines +635 to +645
function overriddenPermissionKeys(
    userPermissions: Record<string, unknown> | null | undefined,
    overlayPermissions: Record<string, unknown> | null | undefined,
): string[] {
    if (!userPermissions || !overlayPermissions) return []
    return Object.entries(overlayPermissions)
        .filter(([key, overlayValue]) => {
            if (!(key in userPermissions)) return false
            return stableString(userPermissions[key]) !== stableString(overlayValue)
        })
        .map(([key]) => key)


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Include overlay-added permissions in the override list.

The key in userPermissions guard drops permissions that the build kit adds on top of the draft. In the common write_files/new-network-key case, the warning next to SandboxPermissionControl under-reports the effective playground permissions.

Suggested fix
 function overriddenPermissionKeys(
     userPermissions: Record<string, unknown> | null | undefined,
     overlayPermissions: Record<string, unknown> | null | undefined,
 ): string[] {
-    if (!userPermissions || !overlayPermissions) return []
+    if (!overlayPermissions) return []
     return Object.entries(overlayPermissions)
-        .filter(([key, overlayValue]) => {
-            if (!(key in userPermissions)) return false
-            return stableString(userPermissions[key]) !== stableString(overlayValue)
-        })
+        .filter(
+            ([key, overlayValue]) =>
+                stableString(userPermissions?.[key]) !== stableString(overlayValue),
+        )
         .map(([key]) => key)
 }
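
To make the behavioral difference concrete, the two variants can be run side by side on a small input. This is only a sketch: stableString below is a stand-in (the component's real helper is not shown in this review), the function names keysWithGuard and keysWithoutGuard are hypothetical, and the sample permission keys are illustrative.

```typescript
type PermissionMap = Record<string, unknown> | null | undefined

// Stand-in for the stableString helper used in the component; its real body is not shown in this review.
function stableString(value: unknown): string {
    const text = JSON.stringify(value) as string | undefined
    return text === undefined ? "undefined" : text
}

// Behavior before the suggestion: only keys already present in the draft are reported.
function keysWithGuard(user: PermissionMap, overlay: PermissionMap): string[] {
    if (!user || !overlay) return []
    const base = user
    const kit = overlay
    return Object.keys(kit).filter(
        (key) => key in base && stableString(base[key]) !== stableString(kit[key]),
    )
}

// Behavior after the suggestion: keys the overlay adds are reported as well,
// and a missing draft permissions object no longer short-circuits to an empty list.
function keysWithoutGuard(user: PermissionMap, overlay: PermissionMap): string[] {
    if (!overlay) return []
    const base: Record<string, unknown> = user ?? {}
    const kit = overlay
    return Object.keys(kit).filter((key) => stableString(base[key]) !== stableString(kit[key]))
}

// Illustrative inputs: the draft sets one key, the overlay changes it and adds another.
const draftPermissions = {network: false}
const overlayPermissions = {network: true, write_files: true}

console.log(keysWithGuard(draftPermissions, overlayPermissions)) // ["network"]
console.log(keysWithoutGuard(draftPermissions, overlayPermissions)) // ["network", "write_files"]
console.log(keysWithGuard(null, overlayPermissions)) // []
console.log(keysWithoutGuard(null, overlayPermissions)) // ["network", "write_files"]
```

With the guard, write_files never shows up in the hint even though the playground run would carry it. Without the guard, every key whose effective value differs from the draft is listed.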

Comment on lines +796 to +802
const agentTemplateOverlay = useAtomValue(
    useMemo(() => workflowAgentTemplateOverlayAtomFamily(revisionId ?? ""), [revisionId]),
)
const [buildKitEnabled, setBuildKitEnabled] = useAtom(
    useMemo(() => workflowBuildKitEnabledAtomFamily(revisionId ?? ""), [revisionId]),
)
const [buildKitExpanded, setBuildKitExpanded] = useState(true)


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Cancel won't restore the build-kit toggle.

buildKitEnabled now lives outside the config snapshot, but cancelSection() only rolls back value. In drawer layouts, toggling this switch and pressing Cancel still changes later playground runs.

Suggested fix
+    const buildKitSnapshot = useRef<boolean | null>(null)
+
     const openSectionDrawer = useCallback(
         (key: "model-harness" | "advanced") => {
             sectionSnapshot.current = value ?? {}
+            buildKitSnapshot.current = key === "advanced" ? buildKitEnabled : null
             setOpenSection(key)
         },
-        [value],
+        [value, buildKitEnabled],
     )
     const cancelSection = useCallback(() => {
         if (sectionSnapshot.current) onChange(sectionSnapshot.current)
+        if (openSection === "advanced" && buildKitSnapshot.current != null) {
+            setBuildKitEnabled(buildKitSnapshot.current)
+        }
         setOpenSection(null)
-    }, [onChange])
+    }, [onChange, openSection, setBuildKitEnabled])

Comment on lines +1646 to +1677
{hasBuildKitOverlay ? (
    <div className="rounded border border-solid border-[var(--ag-c-EAEFF5,#eaeff5)] bg-[#fcfcfa]">
        <button
            type="button"
            onClick={() => setBuildKitExpanded((open) => !open)}
            className="flex w-full cursor-pointer items-center gap-2 border-0 bg-transparent px-3 py-2.5 text-left"
        >
            <Wrench size={15} className="text-[var(--ag-c-586673,#586673)]" />
            <span className="text-[13px] font-medium">Playground build kit</span>
            <span className="ml-auto inline-flex items-center gap-1.5 text-[11px] text-[var(--ag-c-586673,#586673)]">
                <span className="h-1.5 w-1.5 rounded-full bg-[#d97706]" />
                Removed on commit
            </span>
            <span
                onClick={(e) => e.stopPropagation()}
                className="inline-flex items-center"
            >
                <Switch
                    size="small"
                    checked={buildKitEnabled}
                    onChange={setBuildKitEnabled}
                    disabled={disabled}
                />
            </span>
            <CaretRight
                size={14}
                className={cn(
                    "text-[var(--ag-c-97A4B0,#97a4b0)] transition-transform",
                    buildKitExpanded && "rotate-90",
                )}
            />
        </button>


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

file='web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/AgentTemplateControl.tsx'

# Show the relevant region with line numbers.
sed -n '1618,1692p' "$file"

# Map the component structure around the snippet.
grep -nE '<button|</button>|<Switch|onClick=|stopPropagation' "$file" | sed -n '1,120p'

Repository: Agenta-AI/agenta

Length of output: 5513


Move the switch out of the header button
Switch is an interactive control, so nesting it inside the clickable <button> creates invalid interactive content and can break keyboard/focus handling. Keep the header toggle and the switch as separate controls.
