[integration] big-agents #4791
Important: Review skipped
Too many files! This PR contains 1127 files, which is 977 over the limit of 150. To get a review, narrow the scope. Upgrade to a paid plan to raise the limit.
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
⛔ Files ignored due to path filters (152)
📒 Files selected for processing (1127)
Actionable comments posted: 10
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 76c33a7d-feff-4e5f-acc0-962498f74cfc
📒 Files selected for processing (70)
sdks/python/agenta/__init__.py
sdks/python/agenta/sdk/agents/__init__.py
sdks/python/agenta/sdk/agents/adapters/__init__.py
sdks/python/agenta/sdk/agents/adapters/_runner_config.py
sdks/python/agenta/sdk/agents/adapters/agenta_builtins.py
sdks/python/agenta/sdk/agents/adapters/harnesses.py
sdks/python/agenta/sdk/agents/adapters/in_process.py
sdks/python/agenta/sdk/agents/adapters/local.py
sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py
sdks/python/agenta/sdk/agents/adapters/vercel/__init__.py
sdks/python/agenta/sdk/agents/adapters/vercel/messages.py
sdks/python/agenta/sdk/agents/adapters/vercel/routing.py
sdks/python/agenta/sdk/agents/adapters/vercel/sse.py
sdks/python/agenta/sdk/agents/adapters/vercel/stream.py
sdks/python/agenta/sdk/agents/dtos.py
sdks/python/agenta/sdk/agents/errors.py
sdks/python/agenta/sdk/agents/interfaces.py
sdks/python/agenta/sdk/agents/mcp/__init__.py
sdks/python/agenta/sdk/agents/mcp/errors.py
sdks/python/agenta/sdk/agents/mcp/interfaces.py
sdks/python/agenta/sdk/agents/mcp/models.py
sdks/python/agenta/sdk/agents/mcp/parsing.py
sdks/python/agenta/sdk/agents/mcp/resolver.py
sdks/python/agenta/sdk/agents/mcp/wire.py
sdks/python/agenta/sdk/agents/streaming.py
sdks/python/agenta/sdk/agents/tools/__init__.py
sdks/python/agenta/sdk/agents/tools/compat.py
sdks/python/agenta/sdk/agents/tools/errors.py
sdks/python/agenta/sdk/agents/tools/interfaces.py
sdks/python/agenta/sdk/agents/tools/models.py
sdks/python/agenta/sdk/agents/tools/parsing.py
sdks/python/agenta/sdk/agents/tools/resolver.py
sdks/python/agenta/sdk/agents/tools/wire.py
sdks/python/agenta/sdk/agents/ui_messages.py
sdks/python/agenta/sdk/agents/utils/__init__.py
sdks/python/agenta/sdk/agents/utils/ts_runner.py
sdks/python/agenta/sdk/agents/utils/wire.py
sdks/python/agenta/sdk/decorators/routing.py
sdks/python/agenta/sdk/engines/running/interfaces.py
sdks/python/agenta/sdk/engines/running/utils.py
sdks/python/agenta/sdk/middlewares/running/normalizer.py
sdks/python/agenta/sdk/models/workflows.py
sdks/python/agenta/sdk/utils/types.py
sdks/python/agenta/tests/agents/test_streaming.py
sdks/python/oss/tests/pytest/integration/agents/__init__.py
sdks/python/oss/tests/pytest/integration/agents/test_transport_roundtrip.py
sdks/python/oss/tests/pytest/unit/agents/__init__.py
sdks/python/oss/tests/pytest/unit/agents/conftest.py
sdks/python/oss/tests/pytest/unit/agents/golden/run_request.claude.json
sdks/python/oss/tests/pytest/unit/agents/golden/run_request.pi.json
sdks/python/oss/tests/pytest/unit/agents/golden/run_result.error.json
sdks/python/oss/tests/pytest/unit/agents/golden/run_result.ok.json
sdks/python/oss/tests/pytest/unit/agents/mcp/__init__.py
sdks/python/oss/tests/pytest/unit/agents/mcp/test_resolver.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_capabilities_events.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_content_blocks.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_harness_configs.py
sdks/python/oss/tests/pytest/unit/agents/test_environment_lifecycle.py
sdks/python/oss/tests/pytest/unit/agents/test_harness_adapters.py
sdks/python/oss/tests/pytest/unit/agents/test_runner_adapter_config.py
sdks/python/oss/tests/pytest/unit/agents/test_ui_messages.py
sdks/python/oss/tests/pytest/unit/agents/test_wire_contract.py
sdks/python/oss/tests/pytest/unit/agents/tools/__init__.py
sdks/python/oss/tests/pytest/unit/agents/tools/test_models.py
sdks/python/oss/tests/pytest/unit/agents/tools/test_parsing.py
sdks/python/oss/tests/pytest/unit/agents/tools/test_resolver.py
sdks/python/oss/tests/pytest/unit/test_normalizer_passthrough.py
sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py
sdks/python/oss/tests/pytest/utils/test_routing.py
NOTE on packaging: the Node runner is NOT part of this Python wheel (``pip install agenta``
stays pure Python; the wheel contains zero ``.ts``/``.js``). How a standalone Pi user obtains
the runner -- an ``npx`` npm package, a local checkout, or a Docker sidecar over HTTP -- is an
open distribution decision; see ``docs/design/agent-workflows/typescript-structure/``. Do NOT
silently bundle a JS runner into the wheel.
Align LocalBackend wording with the stated packaging contract.
Lines 9-13 say the wheel must not bundle a JS runner, but line 30 and the NotImplementedError messages still say “bundled JS”. This contradiction will confuse integrators.
Suggested wording fix
-class LocalBackend(Backend):
-    """Run Pi (bundled JS) or Claude (``claude-agent-sdk``) on this machine."""
+class LocalBackend(Backend):
+    """Run Pi (external Node runner) or Claude (``claude-agent-sdk``) on this machine."""
 ...
         raise NotImplementedError(
-            "LocalBackend is not implemented yet (Phase 3: Pi via bundled JS, "
+            "LocalBackend is not implemented yet (Phase 3: Pi via external Node runner, "
             "Phase 4: Claude via claude-agent-sdk)."
         )
 ...
         raise NotImplementedError(
-            "LocalBackend is not implemented yet (Phase 3: Pi via bundled JS, "
+            "LocalBackend is not implemented yet (Phase 3: Pi via external Node runner, "
             "Phase 4: Claude via claude-agent-sdk)."
         )
Also applies to: 30-38, 50-53
    def __init__(
        self,
        *,
        sandbox: str = "local",
        url: Optional[str] = None,
        command: Optional[Sequence[str]] = None,
        cwd: Optional[str] = None,
        timeout: float = float(os.getenv("AGENTA_AGENT_RUNNER_TIMEOUT_SECONDS", "180")),
    ) -> None:
        self._sandbox = sandbox
        self._url = url
Validate sandbox at construction time.
Line 129 currently accepts any string; invalid values get sent over the wire and fail late. Restrict this to supported values (local, daytona) and raise a configuration error early.
Suggested validation
 from ..dtos import (
@@
 )
+from ..errors import AgentRunnerConfigurationError
@@
     def __init__(
         self,
         *,
         sandbox: str = "local",
@@
         timeout: float = float(os.getenv("AGENTA_AGENT_RUNNER_TIMEOUT_SECONDS", "180")),
     ) -> None:
+        allowed_sandboxes = {"local", "daytona"}
+        if sandbox not in allowed_sandboxes:
+            raise AgentRunnerConfigurationError(
+                f"Unsupported sandbox '{sandbox}'. Expected one of: {sorted(allowed_sandboxes)}."
+            )
         self._sandbox = sandbox
         self._url = url

from agenta.sdk.agents.tools.models import MissingSecretPolicy

from .errors import MissingMCPSecretError
from .interfaces import MCPSecretProvider
from .models import MCPServerConfig, ResolvedMCPServer


class MCPResolver:
    def __init__(
        self,
        *,
        secret_provider: MCPSecretProvider,
        missing_secret_policy: MissingSecretPolicy = MissingSecretPolicy.ERROR,
    ) -> None:
Breaks declared layer direction by importing tools model into MCP.
MCPResolver currently depends on agenta.sdk.agents.tools.models.MissingSecretPolicy, but this cohort declares tools as depending on MCP, not the other way around. This reverse edge can create import-order fragility and circular dependency risk as the stack evolves. Move MissingSecretPolicy to a neutral/shared module (or MCP/shared contract module) and import it from both subsystems.
Possible direction
- from agenta.sdk.agents.tools.models import MissingSecretPolicy
+ from agenta.sdk.agents.shared.missing_secret_policy import MissingSecretPolicy
(then define/move the enum in that shared module and update tools imports accordingly)
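As a rough illustration of the direction above, the sketch below keeps the policy enum in one neutral place and lets a resolver-style helper depend only on it. Only `MissingSecretPolicy.ERROR` appears in the reviewed code; the enum's string values, the `SKIP` member, `MissingSecretError`, and `resolve_secret` are hypothetical names used purely for illustration.

```python
from enum import Enum
from typing import Mapping, Optional


# In the proposed layout this enum would live in a neutral module that both
# the tools and MCP packages import from, so neither imports the other.
class MissingSecretPolicy(str, Enum):
    ERROR = "error"  # the member used as the default in the reviewed code; value assumed
    SKIP = "skip"    # hypothetical second member, for illustration only


class MissingSecretError(LookupError):
    """Hypothetical error raised when a required secret is absent."""


def resolve_secret(
    name: str,
    secrets: Mapping[str, str],
    policy: MissingSecretPolicy = MissingSecretPolicy.ERROR,
) -> Optional[str]:
    """Look up a secret and apply the shared policy when it is missing."""
    value = secrets.get(name)
    if value is not None:
        return value
    if policy is MissingSecretPolicy.ERROR:
        raise MissingSecretError(f"Missing secret: {name}")
    # Any non-ERROR policy in this sketch simply reports "no value".
    return None
```

With the enum in a shared module, the MCP resolver and the tools resolver can each take a `MissingSecretPolicy` parameter without creating an import edge between the two packages.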
    out = stdout.decode("utf-8", "replace")
    err = stderr.decode("utf-8", "replace")
    if not out.strip():
        raise RuntimeError(
            f"Agent runner returned no output. exit={proc.returncode} stderr={err[-2000:]}"
        )
    try:
        return json.loads(out)
    except json.JSONDecodeError as exc:
Treat non-zero subprocess exit as transport failure even with parseable JSON.
Line 74 returns parsed JSON without checking proc.returncode; a crashed runner can look successful if it emitted partial/legacy JSON before exiting non-zero.
Suggested fix
@@ async def deliver_subprocess(...):
     out = stdout.decode("utf-8", "replace")
     err = stderr.decode("utf-8", "replace")
+    if proc.returncode not in (0, None):
+        raise RuntimeError(
+            "Agent runner exited non-zero. "
+            f"exit={proc.returncode} stderr={err[-2000:]} stdout={out[:500]}"
+        )
     if not out.strip():
         raise RuntimeError(
             f"Agent runner returned no output. exit={proc.returncode} stderr={err[-2000:]}"
         )
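To make the ordering of the checks concrete, here is a self-contained sketch of a subprocess transport that treats a non-zero exit as a failure before it looks at stdout. It is not the SDK's `deliver_subprocess`; the function name `run_runner_sketch`, its parameters, and the error wording are assumptions made for the example.

```python
import asyncio
import json
import sys
from typing import Any, Sequence


async def run_runner_sketch(
    command: Sequence[str], payload: dict, timeout: float = 30.0
) -> Any:
    """Send a JSON payload on stdin and return the parsed JSON from stdout."""
    proc = await asyncio.create_subprocess_exec(
        *command,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await asyncio.wait_for(
        proc.communicate(json.dumps(payload).encode("utf-8")), timeout
    )
    out = stdout.decode("utf-8", "replace")
    err = stderr.decode("utf-8", "replace")

    # Check the exit code first: a crashed runner may still have printed JSON.
    if proc.returncode != 0:
        raise RuntimeError(
            "Agent runner exited non-zero. "
            f"exit={proc.returncode} stderr={err[-2000:]} stdout={out[:500]}"
        )
    if not out.strip():
        raise RuntimeError(f"Agent runner returned no output. stderr={err[-2000:]}")
    try:
        return json.loads(out)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Agent runner returned invalid JSON: {out[:500]}") from exc


if __name__ == "__main__":
    # A stand-in "runner" that drains stdin and prints a JSON object.
    child = "import sys; sys.stdin.read(); print('{\"ok\": true}')"
    print(asyncio.run(run_runner_sketch([sys.executable, "-c", child], {})))
```

A child that prints the same JSON and then calls `sys.exit(3)` raises `RuntimeError` here instead of returning `{"ok": True}`, which is the behavior the comment asks for.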
…kdown editor
Three improvements to the toolbar'd markdown editor used by the Instructions and Skill drawers.
- Link button now opens a popover asking for the URL (seeded with the existing link when the caret is on one) with Add/Update + Remove, instead of blindly applying the literal "https://".
- Table controls: a size-picker popover inserts a table (rows×cols hover grid); when the caret is inside a table the same control becomes a menu for insert/delete row and column + delete table. The Lexical engine already registered TablePlugin + the markdown table transformers; this just exposes them.
- The editor now accepts a dropped Markdown file (.md/.markdown/.mdx/.txt or any text/* file): it reads the file text and replaces the content, with a drop overlay. Intercepts only file drags in the capture phase, so Lexical's own internal text drag/drop is untouched.
Apply the "comfortable" prose direction to how Markdown documents (AGENTS.md / SKILL.md) render, in both the live editor and the read-only preview.
The shared Lexical editor theme rendered document headings far too large (h1 24px / 600 weight), gave fenced code blocks no block styling, and left tables/blockquotes inconsistent. Rather than change that theme globally (the prompt and chat editors share it), scope the new prose styles to the document editors via a `md-prose` wrapper class:
- editor-theme.css: a `.md-prose` block with a calmer heading scale (17/14/13, 500 weight), bordered/rounded code blocks, an accent-rule italic blockquote, and roomier line-height. Uses antd semantic tokens so it adapts to dark mode.
- MarkdownEditor: tag its rendered containers with `md-prose` (covers both the edit and preview panes, which both render through this editor).
- MarkdownPreview: bring its marked/DOMPurify MD_CLASS in line with the same scale, plus table styling it previously lacked.
Follow-up to live testing of the document editor.
- Headings: replace the single H2 button with a block-type menu (Normal text, Heading 1-3, Quote, Code block) that reflects the caret's current block. The editor only ever offered one heading level before. "Code block" inserts a real fenced block (the standard @lexical/code CodeNode, already registered in rich mode) instead of the inline-code button, which had made multi-line "code" read as a stack of inline chips.
- Headings now render with a clearer hierarchy (20/16/14, 600 weight) so they stand apart from body text.
- Lists: the shared theme gave ol and ul different left margins and tall items, so the two indented inconsistently. Normalize both to one padding + tight rows under .md-prose.
- Table size-picker: its grid cells used a fixed light hex that vanished in dark mode. Switch to antd semantic tokens so the cells are visible in both themes.
- Code block wraps (pre-wrap) and keeps its bordered/rounded surface.
…ghting
The "Code block" option turned a multi-line selection into one code node per line via $setBlocksType, which rendered as a stack of inline-code chips with no block background, not a single block. Mirror the Lexical playground: on a range selection, insert one code node and write the text back with insertRawText so line breaks become code lines (collapsed selection still uses $setBlocksType).
Also wire up code highlighting, which the rich editor never enabled:
- MarkdownEditor registers `registerCodeHighlighting` (the CodeNode/CodeHighlightNode types were registered but the highlighter was never turned on, so blocks rendered as plain monospace). Token colors come from the existing `editor-token*` theme classes (light + dark).
- The toolbar shows a searchable language picker (`getCodeLanguageOptions`) when the caret is inside a code block, writing `CodeNode.setLanguage` so Prism highlights the right grammar.
`.editor-inner:not(.code-editor) .editor-code` (0-3-0) was overriding the document code-block padding/fill from `.md-prose .editor-code` (0-2-0), so blocks rendered with 8px padding instead of the intended 12px/14px. Scope the rule to the rendered input (`.editor-input:not(.markdown-view)`) to reach 0-4-0 and win, without touching the markdown-source view. Verified live: a created code block now computes padding 12px 14px with the bordered/rounded surface.
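The specificity arithmetic in this message can be checked with a small calculator. The sketch below is a deliberately simplified model (it handles plain compound selectors and `:not()`, not the full CSS grammar), and the third selector is a hypothetical reconstruction of the rescoped rule, since the message gives only the added `.editor-input:not(.markdown-view)` part.

```python
import re
from typing import Tuple


def specificity(selector: str) -> Tuple[int, int, int]:
    """Return (ids, classes/attributes/pseudo-classes, types/pseudo-elements).

    Simplified: supports simple selectors, combinators, and :not(); functional
    pseudo-classes with keyword arguments (e.g. :nth-child(odd)) are not handled.
    """
    # :not() carries no weight itself; its argument counts like any other selector.
    s = selector.replace(":not(", "(")
    # Attribute selectors count in the middle column; strip them after counting.
    s, attrs = re.subn(r"\[[^\]]*\]", " ", s)
    ids = len(re.findall(r"#[\w-]+", s))
    classes = len(re.findall(r"\.[\w-]+", s))
    pseudo_elements = len(re.findall(r"::[\w-]+", s))
    pseudo_classes = len(re.findall(r"(?<!:):(?!:)[\w-]+", s))
    types = len(re.findall(r"(?:^|[\s>+~(])[a-zA-Z][\w-]*", s))
    return (ids, classes + attrs + pseudo_classes, types + pseudo_elements)


overriding = ".editor-inner:not(.code-editor) .editor-code"
original = ".md-prose .editor-code"
# Hypothetical full form of the rescoped rule (assumption, not quoted from the commit).
rescoped = ".md-prose .editor-input:not(.markdown-view) .editor-code"

print(specificity(overriding))  # (0, 3, 0)
print(specificity(original))    # (0, 2, 0)
print(specificity(rescoped))    # (0, 4, 0)
```

Tuples compare left to right, so `specificity(rescoped) > specificity(overriding)` mirrors how the browser picks the winning rule when both match the same element.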
Two changes from live testing of the document editor.
- Drop the inline-code (`</>`) toolbar button. It applied per-line inline-code formatting, which read as a stack of chips and was the thing being mistaken for a code block. Code now means one thing: the block-type menu's "Code block".
- Move the language picker out of the toolbar onto the code block itself. A new CodeBlockLanguageMenu renders a floating language Select pinned to the top-right of the code block the caret is in (fixed-position portal, recomputed on editor update + scroll/resize), driving CodeNode.setLanguage for Prism highlighting. The block gets 40px top padding to seat it.
Verified live in the running app: toolbar no longer shows an inline-code button; selecting lines + "Code block" yields one styled block with the language picker seated on its top-right, aligned to the block and tracking scroll.
…polish it
The code block carried two language indicators that overlapped: the editor's
built-in uppercase `::after` label (globals.css, `attr(data-highlight-language)`)
and the new interactive picker. Hide the built-in label on document code blocks
(0-6-1 selector to beat globals' 0-5-1) so only the picker shows.
Polish the picker: borderless + compact (24px control, 11px text, sans), and
render the friendly language name ("JavaScript") for the selected value instead
of the raw id.
Verified live in the running app: two code blocks in one editor render
independently; the picker seats on the active block's top-right with no overlap;
the built-in `::after` label computes to `none`.
…e focused one
Each code block reserves a 40px top strip for the language picker, but the picker only rendered on the block the caret was in, so other blocks showed an empty gap. Render a picker for every code block in the document (iterating the root's CodeNodes rather than the selection), clipped to the editor's scroll viewport so blocks scrolled out of view don't float a picker over the toolbar. Read-only panes get the picker too (disabled), so previews also fill the strip.
Verified live: with two code blocks and the caret in a heading (neither focused), both blocks render their picker aligned to their own top-right.
…ions
Triggers showed a plain sentence ("No triggers yet. Add an app trigger or a
schedule…") while Tools/MCP/Skills use "No X yet — <add link>". Match it:
"No triggers yet — add a trigger", where the link opens the existing add menu
(App trigger / Scheduled trigger), so the "app trigger or a schedule" intent is
kept in the menu rather than the copy.
To guarantee parity, extract the shared `AddTextLink` (was local to
AgentTemplateControl) into its own module and use it in both. AddTriggerDropdown
gains an optional `trigger` so the empty-state link reuses the header's menu.
Verified live: the Triggers link is byte-identical in class to the Skills link
and opens the same add menu.
… playground
Nested numbered lists rendered every level as decimal (1., 1.), so depth was
invisible. Lexical already DOM-nests indented items (the wrapper li carries
`editor-nested-list-item`, which is marker-less), and the markdown round-trip
preserves nesting at a 4-space indent (LIST_INDENT_SIZE). So this is purely a
marker-styling gap: add depth-cycling `list-style-type` via descendant selectors
like the playground — ordered 1 → A → a → I → i, unordered • → ◦ → ▪ — scoped to
the rendered document view, plus the matching rules in MarkdownPreview.
Verified live: a nested ordered item computes upper-alpha ("A.") under the
decimal parent, with the wrapper li marker-less.
… font
- On a cold open the drawer is still animating and the markdown still hydrating, so the first rect read is stale and the picker only appeared once a later event (e.g. a hover) nudged a recompute. Re-run recompute across the next frame + a few timeouts so it settles on the real position without interaction.
- The picker portals to <body>, outside the antd theme container, so antd's inner `.ant-select-selection-item` fell back to a serif (Times). Set the app font (`var(--ant-font-family)`) on the wrapper and force the inner item / placeholder / search input to inherit it. Verified live: now renders Inter.
…rkers
CR #4915:
- The file-drop handler was memoized on `[dropEnabled]` with an eslint-disable, so it could call a stale `onChange`. That's reachable: SkillFormView passes a fresh inline `onChange={(v) => set("body", v)}` every render. Make `handleChange` a `useCallback([onChange])`, hoist the pure drop predicates to module scope, and add `handleChange` to the drop deps; the disable is gone and exhaustive-deps is clean.
- MarkdownPreview only carried ordered nesting to lower-alpha while the editor CSS continues to upper-roman / lower-roman, so depth 4-5 lists rendered differently in the preview. Add the missing two levels.
feat(frontend): markdown document editor — toolbar, code blocks, nested lists, rendering
…s (phase 1)
Pure, verbatim move out of the 2070-line AgentTemplateControl into a new `agentTemplate/` folder, with no logic change:
- agentTemplateUtils.ts: enumLabel, countSummary, cloneItem
- itemDescriptors.tsx: ItemDescriptor + describeTool/Mcp/Skill/Instruction and their classifiers (toolName, isFunctionTool, isStaticSkill, isEmbedRefSkill, mdPreview, …)
- ItemRow.tsx: ItemAvatar, ItemRow, InstructionsFileRow
Orchestrator re-imports them; dropped the now-dead phosphor/toolUtils imports. 2070 → 1628 lines. tsc + lint clean.
…ry (phase 2)
Tools, MCP servers and skills were three near-identical copies of: a list body, a draft-then-save drawer, and the editing/commit/validate state. Collapse them onto one code path, driven by a per-kind registry:
- itemKinds.tsx: ITEM_KINDS: field, describe, FormView, drawerTitle/width, editView, jsonOnly, isReadOnly, createSeed, draftInvalid per kind.
- useConfigItemDrawer.ts: the shared editing/draft/commit/remove/validate machine, writing to each kind's array via the registry.
- ConfigItemList.tsx: one list body (rows + empty state) for all three.
- The three ConfigItemDrawer blocks become one registry-driven render.
Behavior preserved exactly (each per-kind rule mapped 1:1). Orchestrator 1638 → 1423 lines. tsc + lint clean.
…Harness
Move the agent-template model/connection/harness/sandbox state and the Model & harness + Advanced section JSX out of AgentTemplateControl into a dedicated useModelHarness hook. The two sections share one coupled state machine (the model/connection state feeds both), so they live together and the hook returns the summaries + section bodies the orchestrator renders.
AgentTemplateControl drops from ~1.4k to 672 lines; behavior is unchanged.
Bump lexical core + all @lexical/* packages from ^0.40.0 to ^0.46.0
across oss, ee, and the @agenta/{ui,entities,entity-ui} packages
(plus the root @lexical/eslint-plugin).
The only code change required across the editor (10 custom nodes, 12
extensions, plugins, markdown transformers) is one mechanical fix for
the 0.46 removal of unsafe type parameters on node-traversal methods:
TableCellResizerPlugin used tableRow.getChildren<TableCellNode>(), now
getChildren() as TableCellNode[].
Move the agent-template Tools add/remove logic (inline function tools, builtin/gateway tools, async workflow-reference tools, and the derived selected-name set + referenceable-workflow pool) out of AgentTemplateControl into a useAgentTools hook. The orchestrator now just wires the handlers into the picker and renders. AgentTemplateControl drops to 580 lines (from ~2.1k before the split); behavior is unchanged.
Editing across a fenced code block with mixed { / {{ braces while
{{tokens}} exist could crash the editor with React "Maximum update
depth exceeded". Pre-existing (reproduces on 0.40), surfaced during
lexical upgrade testing.
Root cause: TokenTypeaheadPlugin called React setState (setAnchor/
setInputQuery) synchronously inside a Lexical registerUpdateListener. A
burst of commits drove setState → re-render → commit past React's depth
limit. A contributing factor was TokenPlugin.$transformNode scheduling
a nested editor.update() from inside a node transform to reposition the
caret, spawning extra commits.
Fix:
- TokenTypeaheadPlugin: compute the anchor synchronously but apply React
state in a coalesced queueMicrotask, identity-stable, never inside the
commit — no commit burst can exceed React's update-depth limit.
- TokenPlugin: reposition the caret synchronously via navigateCursor
instead of a nested editor.update() (behaviorally identical).
Verified: crash recipe + heavy stress no longer crash; auto-close and
manual-close caret behavior unchanged.
…nnel
Addresses a review finding: after selecting a suggestion, the deferred microtask could re-anchor and reopen the menu right after selectOption closed it (the listener re-anchors on the caret position selectOption sets, and the deferred flush ran after the synchronous setAnchor(null)).
Unify all anchor/query writers (update listener, selection, escape, click-outside) behind a single scheduleTypeahead → flushTypeahead channel. The last write before the microtask wins, so selectOption's close, which runs after the listener, wins by construction. Keeps the React update out of the synchronous Lexical commit (crash guard) and removes the need for ad-hoc suppression flags.
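The "last write before the flush wins" idea can be modeled outside React. The sketch below is a Python analogy only: `asyncio`'s `call_soon` stands in for `queueMicrotask`, and `CoalescedChannel`, `schedule`, and the state shape are hypothetical names, not the plugin's actual code.

```python
import asyncio
from typing import Any, Callable, List, Optional


class CoalescedChannel:
    """Buffer writes and apply only the latest one in a single deferred flush."""

    def __init__(self, apply: Callable[[Any], None]) -> None:
        self._apply = apply
        self._pending: Any = None
        self._scheduled = False

    def schedule(self, value: Any) -> None:
        self._pending = value      # a later write overwrites an earlier one
        if not self._scheduled:    # at most one flush is queued at a time
            self._scheduled = True
            asyncio.get_running_loop().call_soon(self._flush)

    def _flush(self) -> None:
        self._scheduled = False
        self._apply(self._pending)


async def demo() -> List[Optional[dict]]:
    applied: List[Optional[dict]] = []
    channel = CoalescedChannel(applied.append)
    # The update listener re-anchors first ...
    channel.schedule({"anchor": "caret", "query": "to"})
    # ... then selecting an option closes the menu in the same tick.
    channel.schedule(None)
    await asyncio.sleep(0.01)  # give the deferred flush a chance to run
    return applied


if __name__ == "__main__":
    print(asyncio.run(demo()))  # [None]
```

Because both writers go through the same channel, the close that arrives second is the only state that gets applied, and no suppression flag is needed to keep the earlier re-anchor from reopening the menu.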
The lifted flushTypeahead used a component-scoped mountedRef (init true, set false on unmount) that was never reset in effect setup. Under React StrictMode's mount → cleanup → remount cycle it stayed false, so the flush bailed permanently in dev and the typeahead menu could stop opening. On React 19 a useState setter after unmount is a harmless no-op, so the guard was unnecessary machinery — removed it rather than patching the reset, eliminating the StrictMode footgun.
@agenta/ui supports React >=18.0.0, so the rationale shouldn't pin to React 19. setState-after-unmount is a no-op on the supported range; keep the StrictMode/microtask reasoning without the version-specific claim.
Address review nitpicks: move the item-avatar's static font styles into Tailwind classes (only background stays inline/dynamic), and compress the extracted modules' multi-line narrative comments to one-/two-line notes, keeping the genuinely surprising constraints (deliberate model non-clearing, vault-async slug, tracked harness-capabilities gap). No behavior change.
chore(frontend): upgrade Lexical 0.40 → 0.46
Clarify in itemKinds that tool creation seeds from the picker, not the registry's createSeed stub (defuses a false-positive review read of an empty-tool save path). Record two review items from PR #4923 as deferred against the upcoming schema-driven config redesign: tighter per-kind tool draft validation (kind discriminated union) and re-binding the named-connection slug on a provider change (gated on first-class provider via ModelSpec). Both are correct for the redesign, not the current code, so they live in the design doc's Deferred list.
…ate-control refactor(frontend): split AgentTemplateControl into focused modules
Context
big-agents is the integration branch for the agent-workflows feature. Every agent PR targets big-agents (directly, or by stacking on one that does). The plan is to review and merge each sub-PR into big-agents, then merge big-agents into main as a single unit.
This PR is a draft tracker. It stays open until all the open sub-PRs below are merged into big-agents. The branch started from an empty commit, so the diff fills in as sub-PRs land.
Integrated PRs
Each box gets checked when that PR is merged into big-agents. Indented items stack on the item above them.
SDK and service
Runner
big-agents (the relay-bug fix, the CI job, and a superset of its tests already landed via feat(agent): runner engines, HTTP server, tracing, and docker image #4778 + chore(agent): make sandbox-agent runner first-class #4786)
Frontend
Hosting
Sandbox-agent deployment
Docs
Branch-only (no PR yet)
These design-doc branches are stacked on big-agents but have no PR. Open one if you want them reviewed separately, otherwise they fold in with the docs.
- docs/agent-model-config-and-provider-auth
- docs/agent-skills-config
- docs/agent-code-tool-sandbox
- docs/agent-harness-capabilities
Notes
big-agents (feat(agent): runner engines, HTTP server, tracing, and docker image #4778 + chore(agent): make sandbox-agent runner first-class #4786 already carry its tests, CI job, and relay-bug fix; its version.ts was stale ["pi","rivet"]).
big-agents as chore(railway): add sandbox-agent preview deployment #4802 / chore(kubernetes): deploy sandbox-agent sidecar #4803 / ci(agent): build and test sandbox-agent images #4804.