docs(agent): agent-creation skills + custom-tools design note by mmabrouk · Pull Request #4863 · Agenta-AI/agenta

mmabrouk · 2026-06-25T23:10:10Z

What

DRAFT for morning review. A new workspace of skills that let a person, or an agent harness, create agents on Agenta over the API, plus a design note for the gateway tools that would let a harness build agents itself.

Lives under docs/design/agent-workflows/projects/agent-creation-skills/. The skill files are drafts under skills/; when approved they move to .agents/skills/<name>/ (symlinked into .claude/skills/), per repo convention. Nothing wired into runtime behavior.

Why

We want users (and harnesses) to be able to build agents on Agenta without reverse-engineering the API. These skills capture the create -> commit -> invoke loop, the agent config schema, tool discovery, harness selection, and secrets, all verified against the live stack.

Contents

skills/create-agenta-agent/ — SKILL.md (the procedure) + reference.md (full field tables, tools catalog, secrets schema, harness capability map) + create_agent.py (a runnable uv run script that does the whole loop with --archive cleanup).
skills/self-host-agenta/ — SKILL.md + reference.md: run the stack with/without the sandbox-agent sidecar, Claude via API key vs subscription (OAuth) login, the Daytona sandbox, and the trust model.
custom-tools-design.md — design note for the agent-self-creation tools: search_tools (Composio has ~1000 integrations, search is essential), create_workflow, invoke_workflow, update_own_workflow, add_trace_annotation, plus companions. Design only; each maps to a verified endpoint.
README.md — the verified API reference. build-notes.md — decisions and judgment calls.

Verified live

Every API call was run against the dev stack on 2026-06-26 (project hotel-agent, cheap models). The full create workflow -> create variant -> commit revision (parameters.agent) -> invoke loop returned correct answers ("...Paris.", "...Tokyo."); update appended a version; secrets create/delete and tools search/resolve confirmed. The bundled script ran end to end and cleaned up after itself. Test workflows archived; fake secret deleted.

Key corrections to circulating lore, documented as gotchas: the config payload key is parameters.agent (not ag_config); the builtin URI goes in data.uri (not data.url, which validates as HTTP); the variant request wrapper is workflow_variant/workflow_id (the old .http files use variant/artifact_id and now 422).
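
The corrected request shapes can be sketched as plain payload builders. This is a minimal sketch, not the bundled create_agent.py: the wrapper keys (`workflow`, `workflow_variant`, `workflow_revision`), `workflow_id`, `data.uri`, and `data.parameters.agent` come from this PR, while the variant `slug`, the revision's `workflow_variant_id` and `message` placement, and the helper names themselves are assumptions made for illustration.

```python
from typing import Any

# Legacy keys named in the gotchas above, mapped to their replacements.
LEGACY_KEYS = {
    "ag_config": "use parameters.agent",
    "artifact_id": "use workflow_id",
    "variant": "use workflow_variant",
}


def workflow_payload(slug: str, name: str, description: str = "") -> dict[str, Any]:
    """Body for POST /api/workflows/ (shape taken from the README excerpt)."""
    return {
        "workflow": {
            "slug": slug,
            "name": name,
            "description": description,
            "flags": {"is_custom": False},
        }
    }


def variant_payload(workflow_id: str, slug: str) -> dict[str, Any]:
    """Body for POST /api/workflows/variants/; `slug` is an assumed field."""
    return {"workflow_variant": {"workflow_id": workflow_id, "slug": slug}}


def commit_payload(
    variant_id: str, uri: str, agent_config: dict[str, Any], message: str = ""
) -> dict[str, Any]:
    """Body for POST /api/workflows/revisions/commit.

    `workflow_variant_id` and `message` are assumed field placements; the
    `data.uri` and `data.parameters.agent` paths are the ones this PR verifies.
    """
    return {
        "workflow_revision": {
            "workflow_variant_id": variant_id,
            "message": message,
            "data": {"uri": uri, "parameters": {"agent": agent_config}},
        }
    }


def legacy_key_problems(body: dict[str, Any]) -> list[str]:
    """Return one message per legacy key found anywhere in the payload."""
    problems: list[str] = []

    def walk(node: Any, path: list[str]) -> None:
        if not isinstance(node, dict):
            return
        for key, value in node.items():
            here = path + [key]
            if key in LEGACY_KEYS:
                problems.append(f"{'.'.join(here)}: {LEGACY_KEYS[key]}")
            if here[-2:] == ["data", "url"]:
                problems.append(f"{'.'.join(here)}: use data.uri")
            walk(value, here)

    walk(body, [])
    return problems
```

Running `legacy_key_problems` over a request body before sending it is one cheap way to catch payloads copied from the old .http files.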

Review asks

Is the custom-tools set the right shape, especially update_own_workflow (self-scoped) vs a general update_workflow, and the invoke_workflow recursion/budget guard? See the "Open questions for review" in the design note.
Should the skills move to .agents/skills/ now, or stay drafts until the custom tools land?

Design only / docs only. Not for merge yet.

https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

coderabbitai · 2026-06-25T23:10:19Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

The PR adds documentation for agent creation workflows, a create-and-invoke script, runtime and tool references, a custom-tools design note, and self-hosting guidance with companion reference material.

Changes

Agent creation workflow docs

Layer / File(s)	Summary
Workspace framing and verification `docs/design/agent-workflows/projects/agent-creation-skills/README.md`, `docs/design/agent-workflows/projects/agent-creation-skills/build-notes.md`	The workspace introduction, verification notes, source map, and workspace contents are documented.
Create and invoke flow `docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py`	The create-workflow, create-variant, commit-revision, invoke, archive, and script loop steps are documented and scripted.
Config and tool contracts `docs/design/agent-workflows/projects/agent-creation-skills/README.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md`	Agent config schema, harness capability checks, tool discovery, and connection flow details are documented.
Secrets and lifecycle notes `docs/design/agent-workflows/projects/agent-creation-skills/README.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md`	Vault secret kinds, auth and routing behavior, revision updates, and operational gotchas are documented.

Custom tool surface design

Layer / File(s)	Summary
Tool surface and guardrails `docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md`	The tool list, permission model, recursion limits, annotation behavior, and review questions are documented.

Self-hosting docs

Layer / File(s)	Summary
Topology and runtime selection `docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md`	The self-host stack layout, runtime environment variables, and runner path selection are documented.
Authentication and sandboxing `docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md`, `docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md`	Claude authentication modes, sidecar isolation, Daytona sandbox settings, verification steps, and source map notes are documented.

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 60.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: agent-creation skills and the custom-tools design note.
Description check	✅ Passed	The description is clearly related to the documented skills, design note, and verified API workflow changes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

mmabrouk · 2026-06-25T23:10:22Z

@coderabbitai review

This is a docs-only draft (skills + a design note). Please focus on: (1) the agent-creation API calls in the create-agenta-agent skill and reference are internally consistent and correct, (2) the custom-tools design note's tool boundaries make sense (self-scoped update, recursion/budget guards), (3) any factual drift in the self-host trust model. Behavior changes / implementation are out of scope for this PR.

coderabbitai · 2026-06-25T23:12:43Z

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate relevant files =="
fd -p 'README.md|build-notes.md|custom-tools-design.md|SKILL.md|reference.md|create_agent.py' docs/design/agent-workflows/projects/agent-creation-skills || true

echo
echo "== Tree =="
tree -a -L 4 docs/design/agent-workflows/projects/agent-creation-skills || true

echo
echo "== Read top-level docs =="
for f in \
  docs/design/agent-workflows/projects/agent-creation-skills/README.md \
  docs/design/agent-workflows/projects/agent-creation-skills/build-notes.md \
  docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md
do
  echo
  echo "----- FILE: $f -----"
  sed -n '1,260p' "$f"
done

echo
echo "== Read create-agenta-agent skill docs =="
for f in \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py
do
  echo
  echo "----- FILE: $f -----"
  sed -n '1,320p' "$f"
done

echo
echo "== Read self-host-agenta skill docs =="
for f in \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md
do
  echo
  echo "----- FILE: $f -----"
  sed -n '1,320p' "$f"
done

echo
echo "== Search for key terms across project =="
rg -n -C2 --glob '*.md' --glob '*.py' \
  'parameters\.agent|ag_config|data\.uri|data\.url|workflow_variant|workflow_id|variant|artifact_id|invoke_workflow|update_own_workflow|update_workflow|recursion|budget|trust|sandbox-agent|Daytona|OAuth|subscription|Composio|search_tools|secrets|archive' \
  docs/design/agent-workflows/projects/agent-creation-skills

Length of output: 149378

@mmabrouk Reviewed with the docs-only scope in mind.

Overall, this is in good shape. The create-agent flow is internally consistent across README.md, skills/create-agenta-agent/SKILL.md, skills/create-agenta-agent/reference.md, and create_agent.py:

data.parameters.agent is used consistently, not ag_config
data.uri is used consistently, not data.url
variant creation consistently uses workflow_variant + workflow_id
the create → commit → invoke loop and the secret/tool wiring all line up with each other

On the design side, I think the tool boundaries are sensible:

update_own_workflow as the default/self-scoped mutation surface is the safer first shape
keeping arbitrary-target updates as a separate, higher-privilege future tool makes sense
recursion depth + per-run budget caps on invoke_workflow are the right guardrails

I only found two non-blocking clarifications worth tightening:

In docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md, invoke_workflow says it accepts a workflow ref by id/slug and will “resolve the revision,” but it doesn’t say which variant/revision wins when there are multiple branches or commits. I’d recommend spelling out the resolution order to avoid surprising tool behavior.
In docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md, the sentence saying “the backend is always the sandbox-agent backend” reads a bit at odds with the earlier no-sidecar/local-spawn description. If the intent is “same runner semantics, different transport,” I’d reword that line to make the trust model section unambiguous.

No blocking factual drift jumped out to me beyond those wording clarifications.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py (1)

32-163: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

LGTM with two notes.

The script correctly implements the full loop with proper payload shapes. Two issues to address:

Resource leak on partial failure: If an intermediate step fails (e.g., variant creation fails after workflow creation), the created workflow is orphaned because archive only runs on full success. Consider a try/except or contextlib.ExitStack pattern to track created resources and clean up on failure.

Coding guidelines compliance: As per coding guidelines, Python files require ruff format then ruff check --fix before committing.

Source: Coding guidelines
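
The cleanup-on-partial-failure suggestion could look roughly like the sketch below. It uses `contextlib.ExitStack` from the standard library; `run_loop`, its callable parameters, and the `archive_on_success` flag are hypothetical stand-ins for the real steps in create_agent.py, which this sketch does not reproduce.

```python
from contextlib import ExitStack
from typing import Callable


def run_loop(
    create_workflow: Callable[[], str],
    create_variant: Callable[[str], str],
    archive_workflow: Callable[[str], None],
    archive_on_success: bool = True,
) -> tuple[str, str]:
    """Create a workflow and a variant, archiving the workflow if a later step fails.

    The callables are placeholders for the script's HTTP calls.
    """
    with ExitStack() as stack:
        workflow_id = create_workflow()
        # Register cleanup as soon as the resource exists, before the next step can fail.
        stack.callback(archive_workflow, workflow_id)

        variant_id = create_variant(workflow_id)

        if not archive_on_success:
            # Everything succeeded and the caller wants to keep the workflow:
            # move the callbacks to a throwaway stack so they never run.
            stack.pop_all()
        return workflow_id, variant_id
```

With this shape the archive call runs on any exception after workflow creation, and on success only when cleanup was requested.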

♻️ Duplicate comments (15)

docs/design/agent-workflows/projects/agent-creation-skills/README.md (4)

68-79: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
POST /api/workflows/variants/
{
"workflow_variant": {

86-109: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
POST /api/workflows/revisions/commit
{
"workflow_revision": {

113-123: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
POST /services/agent/v0/invoke
Accept: application/json

205-215: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
POST /api/vault/v1/secrets/
{
"header": { "name": "OpenAI", "description": "..." },

docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md (6)

66-72: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `bash` language specifier to fenced code block.

+```bash
curl -sS -X POST "$HOST/api/workflows/variants/?project_id=$PROJECT" \

80-97: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `bash` language specifier to fenced code block.

+```bash
curl -sS -X POST "$HOST/api/workflows/revisions/commit?project_id=$PROJECT" \

121-129: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `bash` language specifier to fenced code block.

+```bash
curl -sS -X POST "$HOST/services/agent/v0/invoke?project_id=$PROJECT" \

145-151: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `bash` language specifier to fenced code block.

+```bash
curl -sS -X POST "$HOST/api/vault/v1/secrets/?project_id=$PROJECT" \

159-165: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `bash` language specifier to fenced code block.

+```bash
curl -sS -X POST "$HOST/api/tools/connections/?project_id=$PROJECT" \

182-189: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `bash` language specifier to fenced code block.

+```bash
curl -sS -X POST "$HOST/api/workflows/revisions/commit?project_id=$PROJECT" \

docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md (5)

49-68: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
// builtin: a harness-native tool (read, write, bash, ...)

74-79: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
{ "name": "filesystem", "transport": "stdio",

88-93: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
{ "name": "release-notes",

154-158: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
{ "header": { "name": "OpenAI", "description": "..." },

165-171: 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value

Add `json` language specifier to fenced code block.

+```json
{ "header": { "name": "MyAzure" },

🧹 Nitpick comments (4)

docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md (1)

77-79: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Clarify how `invoke_workflow` resolves ambiguous workflow refs.

The current description says "resolve the revision if a ref is given" without specifying the resolution rule when multiple branches or commits exist. To make the design unambiguous, specify whether resolution uses:
- the latest committed revision across all variants,
- the default variant's latest revision, or
- an explicit variant + revision selector.

This was previously identified as a needed clarification and remains unaddressed in the current draft.

docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md (1)

63-65: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add a language tag to the fenced code block.

The env-var snippet at line 63 triggers markdownlint `MD040`. Tag it as `bash` or `shell` so the linter stays clean.

-```
+```bash
 AGENTA_AGENT_RUNNER_URL=http://sandbox-agent:8765   # services -> sidecar over the compose net

docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md (2)

41-42: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Clarify "backend" to avoid conflating the abstraction with the sidecar container.

This sentence is technically correct in code terms (select_backend always returns SandboxAgentBackend), but readers may read "backend" as "the sidecar container" and wonder how that squares with the local-spawn path that has no sidecar. Rephrase to make clear that the abstraction is constant while the transport (HTTP to sidecar vs local CLI spawn) varies.
-The backend is always the sandbox-agent backend; only the transport (`RUNNER_URL`) and the
-sandbox axis vary.
+The backend abstraction is always `SandboxAgentBackend`; only the transport (`RUNNER_URL`) and
+the sandbox axis vary. When `RUNNER_URL` is set, `services` POSTs to the sidecar container.
+When unset, the backend spawns the runner CLI locally from `AGENTA_AGENT_RUNNER_DIR`.
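
The distinction in the suggested wording (one backend abstraction, two transports) can be illustrated with a small selector. This is only a sketch of the rule as described in this comment: `choose_transport` is a hypothetical helper, not the repository's `select_backend`, and the behavior when neither variable is set is an assumption.

```python
from collections.abc import Mapping


def choose_transport(env: Mapping[str, str]) -> tuple[str, str]:
    """Pick how the runner is reached, given environment variables.

    Returns ("http", url) when AGENTA_AGENT_RUNNER_URL is set (POST to the sidecar),
    otherwise ("local", directory) to spawn the runner CLI from AGENTA_AGENT_RUNNER_DIR.
    """
    url = env.get("AGENTA_AGENT_RUNNER_URL", "").strip()
    if url:
        return ("http", url)

    runner_dir = env.get("AGENTA_AGENT_RUNNER_DIR", "").strip()
    if runner_dir:
        return ("local", runner_dir)

    # Assumption: with neither variable set there is nothing to run against.
    raise RuntimeError("set AGENTA_AGENT_RUNNER_URL or AGENTA_AGENT_RUNNER_DIR")
```

Either branch would feed the same backend abstraction; only the way the runner is reached differs.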
10-16: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Optionally tag the ASCII diagram as text to silence markdownlint.

The topology block triggers MD040. Adding text satisfies the linter without affecting rendering.
-```
+```text
 client -> traefik -> api        (/api/...)        management: workflows, vault, tools

📥 Commits

Reviewing files that changed from the base of the PR and between 4bbf659 and 77be3c3.

📒 Files selected for processing (8)

docs/design/agent-workflows/projects/agent-creation-skills/README.md
docs/design/agent-workflows/projects/agent-creation-skills/build-notes.md
docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md
docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md
docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py
docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md
docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md
docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md

coderabbitai · 2026-06-25T23:18:15Z

+```
+POST /api/workflows/
+{
+  "workflow": {
+    "slug": "my-agent-<unique>",
+    "name": "My Agent",
+    "description": "Agent via API",
+    "flags": { "is_custom": false }
+  }
+}
+-> 200 { "workflow": { "id": "<workflow_id>", "slug": "...", ... } }
+```


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add json language specifier to fenced code block.

+```json
POST /api/workflows/
{
"workflow": {

Improves syntax highlighting and readability. The same applies to the other JSON blocks below.

🧰 Tools: 🪛 markdownlint-cli2 (0.22.1)

[warning] 50-50: Fenced code blocks should have a language specified (MD040, fenced-code-language)

mmabrouk · 2026-06-26T17:53:43Z

+
+- Inputs: `name`, `slug?`, `agent_config` (the `parameters.agent` object), `variant_slug?`,
+  `message?`.
+- Behavior: `POST /api/workflows/` -> `POST /api/workflows/variants/` ->


This would live in the tools inside the API, so it does not make sense to have one API endpoint call another. We would do this with direct service calls, right?

mmabrouk · 2026-06-26T17:54:18Z

+  optional `stream`.
+- Behavior: resolve the revision if a ref is given, then `POST /services/agent/v0/invoke`.
+  Returns the assistant output and the `trace_id`.
+- Guardrail: depth/recursion limit so a builder agent cannot fork-bomb itself; a per-run


I wonder how you plan to do that?
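
One possible shape for the guard, as a sketch only: a per-run budget object that every nested `invoke_workflow` call must pass through. The names `RunBudget`, `GuardError`, and `invoke` are hypothetical, invocation count stands in for model spend, and in a real deployment the depth and budget would have to travel with the run context between processes, which this in-memory version does not address.

```python
from dataclasses import dataclass


class GuardError(RuntimeError):
    """Raised when a nested invocation would exceed the depth or budget cap."""


@dataclass
class RunBudget:
    max_depth: int
    max_invocations: int
    used: int = 0

    def enter(self, depth: int) -> int:
        """Charge one invocation made at `depth` and return the child's depth."""
        if depth >= self.max_depth:
            raise GuardError(f"depth limit {self.max_depth} reached")
        if self.used >= self.max_invocations:
            raise GuardError(f"budget of {self.max_invocations} invocations spent")
        self.used += 1
        return depth + 1


def invoke(budget: RunBudget, depth: int = 0, fan_out: int = 2) -> int:
    """Simulate an agent that invokes `fan_out` sub-agents; returns invocations made."""
    child_depth = budget.enter(depth)
    # A real tool would call POST /services/agent/v0/invoke here.
    made = 1
    for _ in range(fan_out):
        try:
            made += invoke(budget, child_depth, fan_out)
        except GuardError:
            break  # the guard refused; stop spawning further children
    return made
```

Because the budget is shared across the whole run rather than copied per child, sibling calls draw from the same pool, which is what bounds a fan-out and not only a straight recursion.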

mmabrouk · 2026-06-26T17:54:54Z

+  Returns the assistant output and the `trace_id`.
+- Guardrail: depth/recursion limit so a builder agent cannot fork-bomb itself; a per-run
+  budget cap. Cheap-model default for self-tests.
+- Permission: gateable; read-ish but it spends model budget.


Here it would also make sense for the agent to be able to spin off a subagent of itself, although that sometimes already happens in the harness. Nothing actionable here; we just have to keep it in mind.

mmabrouk · 2026-06-26T17:55:21Z

+test what it created.
+
+- Inputs: `workflow_ref` (id or slug) or inline `agent_config`, plus `messages` / `inputs`,
+  optional `stream`.


What happens if we invoke with streaming? Will we be able to stream the results of the sub-agent from the agent's response, or is this just irrelevant?

mmabrouk · 2026-06-26T17:55:51Z

+Run a workflow (by id/slug, or an inline config) and get the output. Lets a builder agent
+test what it created.
+
+- Inputs: `workflow_ref` (id or slug) or inline `agent_config`, plus `messages` / `inputs`,


It's very nice that we'll be able to pass an inline agent config. This means you can spin off a version of yourself and test something, for example how a change to a skill affects you. That is probably worth adding to what the skills cover.

mmabrouk · 2026-06-26T18:00:00Z

+the running agent's own workflow so it cannot rewrite arbitrary agents.
+
+- Inputs: `agent_config` (the new config), `message`.
+- Behavior: resolve the running agent's own `workflow_variant_id` from the run context, then


This means the agent needs access to some run context that is propagated somehow, no? How does it get that run context? Does it have a built-in tool for it, something implemented on the sidecar or in the SDK? In this case app.py would presumably provide the context to the sidecar, but it would not be injected as-is into agent.md or similar. It would be exposed as a tool, and then the question is where the boundary lies when that tool is called. Does the call go back to app.py to get the information, or is the context held in the sidecar, which returns it directly? I don't know which is the case.

What is this context when running from the SDK with a local backend? There we would have another route: instead of running in the sidecar, you pull the configuration and use the SDK to run it. The SDK then does the job of app.py. It puts whatever is available (the variant, a trace ID, and so on) into context, and then sets up the tools the same way the sidecar would, through an MCP or whatever makes sense. I guess that is also where it would add a tool that exposes the context. What type of tool would that be? We have not thought much about the local backend, or about how tool injection would work there. With Claude Code, the way is probably this: for an endpoint such as a gateway tool there is no problem, since we need an MCP and that MCP makes the call. A code tool can also go in the MCP; we don't care. In the case of Pi, we would put it in the folder, like all the other tools.

My big question is how you provide context. You could obviously have an endpoint, but that would be a bit stupid, so if you have better ideas, please let me know.

mmabrouk · 2026-06-27T11:30:32Z

There seems to be a misunderstanding. These skills are supposed to be the skills within Agenta, not the skills used to develop Agenta, which live in the claude folder.

These skills would be the skills within custom workflow, as in constant custom workflows.

mmabrouk · 2026-06-27T16:41:03Z

🤖 The AI agent says: Added a SUPERSEDED banner to custom-tools-design.md. The logic-wrapping tools here (create_workflow / update_own_workflow / add_trace_annotation / …) were the rejected first version; the current design is thin wrappers over existing endpoints (see projects/direct-call-tools/). Flagging so this note stops misleading reviewers (and Codex).

…l (Codex) interfaces.md: the concrete interface delta (call descriptor, runContext, type:platform config, platform-op catalog entry, reference config, dispatch body-assembly) with the file:line seams, so review + implementation share one reference. plan.md: the endpoint-wrapper set to expose (from #4863, reframed as existing endpoints; plain + self/bind variants incl. the annotate pair) + catalog implementation per the Codex catalog review (typed catalog in SDK platform/op_catalog.py, reuse PlatformConnection, stable tools.agenta.* namespace, per-op permission defaults; execution stays direct, find_capabilities migrates to direct).

Add a draft workspace for skills that let a person or an agent harness create agents on Agenta over the API, plus a design note for the gateway tools that would let a harness build agents.

- create-agenta-agent skill (SKILL.md + reference.md + runnable create_agent.py): discover tools, the agent config schema, the create/variant/commit/invoke loop, harness selection, secrets. Every call verified live on the dev stack.
- self-host-agenta skill: run the stack with/without the sandbox-agent sidecar, Claude via API key vs subscription (OAuth) login, Daytona sandbox, trust model.
- custom-tools-design.md: search_tools, create_workflow, invoke_workflow, update_own_workflow, add_trace_annotation + companions; design only.
- README.md (verified API reference) + build-notes.md (decisions).

Skills live under the project workspace as drafts; they move to .agents/skills/ when approved.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

The logic-wrapping tools framing (create_workflow/update_own_workflow/add_trace_annotation/...) was the rejected first version. Platform tools are a thin wrapper over existing endpoints; see projects/direct-call-tools/. Banner points readers to the current design.

mmabrouk · 2026-06-28T00:11:41Z

🤖 The AI agent says:

Reconciling the custom-tools note here with what just shipped. Your line-67 point — no calls
between two API endpoints; the agent should call Agenta endpoints directly — is now implemented as
platform tools in the direct-call-tools project (Phase 3b, PR #4893).

A platform tool is a thin wrapper over an existing Agenta endpoint, declared as
{type:"platform", op:"<catalog-op>"}. The "how" lives in a typed platform-op catalog
(sdks/python/agenta/sdk/agents/platform/op_catalog.py), not in tool logic: the catalog maps an
op to {description, method, path, input_schema, bind, default permission/approval}, and the
resolver emits a direct call to that endpoint — no /tools/call hop, no endpoint-to-endpoint
indirection.

So platform tools = the catalog. First three ops: find_capabilities (discovery, now
agent-usable end to end), query_workflows (read), commit_revision (self-update; binds the
agent's own variant, gated on approval). More ops are a data add to the catalog. There are no
new endpoints and no logic-wrapping tools — we expose the endpoints we have, and a skill teaches
multi-step composition. Full write-up: documentation/tools.md → "Platform tools".
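
To make the catalog idea concrete, here is a rough sketch of an op entry and a resolver that turns a `{type:"platform", op:...}` tool declaration into a direct call descriptor. It is not the code in op_catalog.py: the `PlatformOp` class, the `resolve` function, the flat request body, the bind mapping format, and the `approval` field name are all assumptions for illustration; only the entry fields, the op names, and the two endpoint paths are taken from this thread.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class PlatformOp:
    description: str
    method: str
    path: str
    input_schema: dict[str, Any]
    # Argument name -> run-context key; bound values are not model-controlled.
    bind: dict[str, str] = field(default_factory=dict)
    approval: bool = False  # default approval requirement for the op


CATALOG: dict[str, PlatformOp] = {
    "find_capabilities": PlatformOp(
        description="Discover tools matching a query.",
        method="POST",
        path="/tools/discover",
        input_schema={"type": "object", "properties": {"query": {"type": "string"}}},
    ),
    "commit_revision": PlatformOp(
        description="Commit a new revision of the agent's own variant.",
        method="POST",
        path="/api/workflows/revisions/commit",
        input_schema={"type": "object", "properties": {"message": {"type": "string"}}},
        bind={"workflow_variant_id": "workflow_variant_id"},
        approval=True,
    ),
}


def resolve(tool: dict[str, Any], args: dict[str, Any], run_context: dict[str, Any]) -> dict[str, Any]:
    """Turn a platform tool declaration plus arguments into a direct call descriptor."""
    if tool.get("type") != "platform":
        raise ValueError("not a platform tool")
    op = CATALOG[tool["op"]]
    bound = {name: run_context[key] for name, key in op.bind.items()}
    # Bound values override model-supplied ones, so a self-scoped op cannot be retargeted.
    body = {**args, **bound}
    return {"method": op.method, "path": op.path, "json": body, "approval": op.approval}
```

Adding an op is then a new `CATALOG` entry and no new tool logic, which matches the "data add" framing above. The real body assembly (for example the `workflow_revision` wrapper) is deliberately left out of this sketch.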

mmabrouk force-pushed the docs/agent-creation-skills branch from 77be3c3 to e87eb9a (June 26, 2026 14:29)

This was referenced Jun 27, 2026:

[feat] Tool discovery: POST /tools/discover + reserved find_capabilities tool #4884 (Merged)
docs(agent): direct-call tools design (Workstream A) #4886 (Draft)

mmabrouk force-pushed the docs/agent-creation-skills branch from e87eb9a to cbff4ef (June 27, 2026 15:17)

mmabrouk added 2 commits (June 27, 2026 20:20)

mmabrouk force-pushed the docs/agent-creation-skills branch from 85387a4 to d40ea9c (June 27, 2026 18:22)

mmabrouk added the needs-review label ("Agent updated; awaiting Mahmoud's review") on Jun 27, 2026
