
docs(agent): agent-creation skills + custom-tools design note #4863

Draft
mmabrouk wants to merge 2 commits into big-agents from docs/agent-creation-skills

Conversation

@mmabrouk

Member

What

DRAFT for morning review. A new workspace of skills that let a person, or an agent harness, create agents on Agenta over the API, plus a design note for the gateway tools that would let a harness build agents itself.

Lives under docs/design/agent-workflows/projects/agent-creation-skills/. The skill files are drafts under skills/; when approved they move to .agents/skills/<name>/ (symlinked into .claude/skills/), per repo convention. Nothing wired into runtime behavior.

Why

We want users (and harnesses) to be able to build agents on Agenta without reverse-engineering the API. These skills capture the create -> commit -> invoke loop, the agent config schema, tool discovery, harness selection, and secrets, all verified against the live stack.

Contents

  • skills/create-agenta-agent/ — SKILL.md (the procedure) + reference.md (full field tables, tools catalog, secrets schema, harness capability map) + create_agent.py (a runnable uv run script that does the whole loop with --archive cleanup).
  • skills/self-host-agenta/ — SKILL.md + reference.md: run the stack with/without the sandbox-agent sidecar, Claude via API key vs subscription (OAuth) login, the Daytona sandbox, and the trust model.
  • custom-tools-design.md — design note for the agent-self-creation tools: search_tools (Composio has ~1000 integrations, search is essential), create_workflow, invoke_workflow, update_own_workflow, add_trace_annotation, plus companions. Design only; each maps to a verified endpoint.
  • README.md — the verified API reference. build-notes.md — decisions and judgment calls.

Verified live

Every API call was run against the dev stack on 2026-06-26 (project hotel-agent, cheap models). The full create workflow -> create variant -> commit revision (parameters.agent) -> invoke loop returned correct answers ("...Paris.", "...Tokyo."); update appended a version; secrets create/delete and tools search/resolve confirmed. The bundled script ran end to end and cleaned up after itself. Test workflows archived; fake secret deleted.

Key corrections to circulating lore, documented as gotchas: the config payload key is parameters.agent (not ag_config); the builtin URI goes in data.uri (not data.url, which validates as HTTP); the variant request wrapper is workflow_variant/workflow_id (the old .http files use variant/artifact_id and now 422).
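
The three corrections above are easiest to see as payload shapes. What follows is a minimal Python sketch, not the bundled create_agent.py: it only builds request bodies and sends nothing. The wrapper keys (workflow, workflow_variant, workflow_revision), workflow_id, data.uri, and data.parameters.agent come from this PR; the variant slug field, the workflow_variant_id link field, the URI placeholder, and the contents of the agent config are assumptions made for illustration.

```python
import json


def workflow_payload(slug: str, name: str) -> dict:
    """Body for POST /api/workflows/ (shape quoted from the README in this PR)."""
    return {
        "workflow": {
            "slug": slug,
            "name": name,
            "description": "Agent via API",
            "flags": {"is_custom": False},
        }
    }


def variant_payload(workflow_id: str, slug: str) -> dict:
    """Body for POST /api/workflows/variants/.

    The wrapper is workflow_variant with workflow_id inside it; the older
    variant/artifact_id wrapper is reported to return 422.
    The "slug" field on the variant is an assumption.
    """
    return {"workflow_variant": {"workflow_id": workflow_id, "slug": slug}}


def revision_payload(variant_id: str, agent_config: dict, uri: str) -> dict:
    """Body for POST /api/workflows/revisions/commit.

    The config goes under data.parameters.agent (not ag_config) and the
    builtin URI goes under data.uri (not data.url).
    The "workflow_variant_id" link field is an assumption.
    """
    return {
        "workflow_revision": {
            "workflow_variant_id": variant_id,
            "data": {"uri": uri, "parameters": {"agent": agent_config}},
        }
    }


if __name__ == "__main__":
    # Placeholder values only; a real run would use ids returned by the API.
    body = revision_payload("<variant_id>", {"example_field": "hypothetical"}, "<builtin-uri>")
    print(json.dumps(body, indent=2))
```

Keeping the builders as pure functions means the shapes can be checked without a live stack; the exact field tables remain the job of reference.md.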

Review asks

  • Is the custom-tools set the right shape, especially update_own_workflow (self-scoped) vs a general update_workflow, and the invoke_workflow recursion/budget guard? See the "Open questions for review" in the design note.
  • Should the skills move to .agents/skills/ now, or stay drafts until the custom tools land?

Design only / docs only. Not for merge yet.

https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

@vercel

vercel Bot commented Jun 25, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agenta-documentation | Ready | Preview, Comment | Jun 27, 2026 6:23pm |

Request Review

@coderabbitai

coderabbitai Bot commented Jun 25, 2026


Review Change Stack

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: fa0f3452-f465-4817-b4ab-0cf9bb6aafa8

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

The PR adds documentation for agent creation workflows, a create-and-invoke script, runtime and tool references, a custom-tools design note, and self-hosting guidance with companion reference material.

Changes

Agent creation workflow docs

| Layer / File(s) | Summary |
|---|---|
| Workspace framing and verification: docs/design/agent-workflows/projects/agent-creation-skills/README.md, docs/design/agent-workflows/projects/agent-creation-skills/build-notes.md | The workspace introduction, verification notes, source map, and workspace contents are documented. |
| Create and invoke flow: docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py | The create-workflow, create-variant, commit-revision, invoke, archive, and script loop steps are documented and scripted. |
| Config and tool contracts: docs/design/agent-workflows/projects/agent-creation-skills/README.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md | Agent config schema, harness capability checks, tool discovery, and connection flow details are documented. |
| Secrets and lifecycle notes: docs/design/agent-workflows/projects/agent-creation-skills/README.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md | Vault secret kinds, auth and routing behavior, revision updates, and operational gotchas are documented. |

Custom tool surface design

| Layer / File(s) | Summary |
|---|---|
| Tool surface and guardrails: docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md | The tool list, permission model, recursion limits, annotation behavior, and review questions are documented. |

Self-hosting docs

| Layer / File(s) | Summary |
|---|---|
| Topology and runtime selection: docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md | The self-host stack layout, runtime environment variables, and runner path selection are documented. |
| Authentication and sandboxing: docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md, docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md | Claude authentication modes, sidecar isolation, Daytona sandbox settings, verification steps, and source map notes are documented. |

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main changes: agent-creation skills and the custom-tools design note. |
| Description check | ✅ Passed | The description is clearly related to the documented skills, design note, and verified API workflow changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/agent-creation-skills


@mmabrouk

Member Author

@coderabbitai review

This is a docs-only draft (skills + a design note). Please focus on: (1) the agent-creation API calls in the create-agenta-agent skill and reference are internally consistent and correct, (2) the custom-tools design note's tool boundaries make sense (self-scoped update, recursion/budget guards), (3) any factual drift in the self-host trust model. Behavior changes / implementation are out of scope for this PR.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate relevant files =="
fd -p 'README.md|build-notes.md|custom-tools-design.md|SKILL.md|reference.md|create_agent.py' docs/design/agent-workflows/projects/agent-creation-skills || true

echo
echo "== Tree =="
tree -a -L 4 docs/design/agent-workflows/projects/agent-creation-skills || true

echo
echo "== Read top-level docs =="
for f in \
  docs/design/agent-workflows/projects/agent-creation-skills/README.md \
  docs/design/agent-workflows/projects/agent-creation-skills/build-notes.md \
  docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md
do
  echo
  echo "----- FILE: $f -----"
  sed -n '1,260p' "$f"
done

echo
echo "== Read create-agenta-agent skill docs =="
for f in \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py
do
  echo
  echo "----- FILE: $f -----"
  sed -n '1,320p' "$f"
done

echo
echo "== Read self-host-agenta skill docs =="
for f in \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md \
  docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md
do
  echo
  echo "----- FILE: $f -----"
  sed -n '1,320p' "$f"
done

echo
echo "== Search for key terms across project =="
rg -n -C2 --glob '*.md' --glob '*.py' \
  'parameters\.agent|ag_config|data\.uri|data\.url|workflow_variant|workflow_id|variant|artifact_id|invoke_workflow|update_own_workflow|update_workflow|recursion|budget|trust|sandbox-agent|Daytona|OAuth|subscription|Composio|search_tools|secrets|archive' \
  docs/design/agent-workflows/projects/agent-creation-skills

Length of output: 149378



@mmabrouk Reviewed with the docs-only scope in mind.

Overall, this is in good shape. The create-agent flow is internally consistent across README.md, skills/create-agenta-agent/SKILL.md, skills/create-agenta-agent/reference.md, and create_agent.py:

  • data.parameters.agent is used consistently, not ag_config
  • data.uri is used consistently, not data.url
  • variant creation consistently uses workflow_variant + workflow_id
  • the create → commit → invoke loop and the secret/tool wiring all line up with each other

On the design side, I think the tool boundaries are sensible:

  • update_own_workflow as the default/self-scoped mutation surface is the safer first shape
  • keeping arbitrary-target updates as a separate, higher-privilege future tool makes sense
  • recursion depth + per-run budget caps on invoke_workflow are the right guardrails

I only found two non-blocking clarifications worth tightening:

  1. In docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md, invoke_workflow says it accepts a workflow ref by id/slug and will “resolve the revision,” but it doesn’t say which variant/revision wins when there are multiple branches or commits. I’d recommend spelling out the resolution order to avoid surprising tool behavior.

  2. In docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md, the sentence saying “the backend is always the sandbox-agent backend” reads a bit at odds with the earlier no-sidecar/local-spawn description. If the intent is “same runner semantics, different transport,” I’d reword that line to make the trust model section unambiguous.

No blocking factual drift jumped out to me beyond those wording clarifications.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py (1)

32-163: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

LGTM with two notes.

The script correctly implements the full loop with proper payload shapes. Two issues to address:

  1. Resource leak on partial failure: If an intermediate step fails (e.g., variant creation fails after workflow creation), the created workflow is orphaned because archive only runs on full success. Consider a try/except or contextlib.ExitStack pattern to track created resources and clean up on failure.

  2. Coding guidelines compliance: As per coding guidelines, Python files require ruff format then ruff check --fix before committing.

Source: Coding guidelines
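
The cleanup-on-partial-failure pattern suggested in note 1 could look roughly like the sketch below. It is illustrative only: FakeClient and its method names are hypothetical stand-ins for the HTTP calls in create_agent.py, and only contextlib.ExitStack is real. The idea is to register the archive step as soon as the workflow exists, so any later failure unwinds it.

```python
from contextlib import ExitStack


class FakeClient:
    """Hypothetical stand-in for the script's HTTP calls (not the real API client)."""

    def __init__(self, fail_on=None):
        self.fail_on = fail_on
        self.archived = []

    def create_workflow(self):
        return "wf-1"

    def create_variant(self, workflow_id):
        if self.fail_on == "variant":
            raise RuntimeError("variant creation failed")
        return "var-1"

    def archive_workflow(self, workflow_id):
        self.archived.append(workflow_id)


def run_loop(client, archive_on_success=True):
    """Create a workflow and a variant; archive the workflow if a later step fails."""
    with ExitStack() as stack:
        workflow_id = client.create_workflow()
        # Registered immediately, so a failure in any later step still cleans up.
        stack.callback(client.archive_workflow, workflow_id)
        variant_id = client.create_variant(workflow_id)
        if not archive_on_success:
            # Success and the caller wants to keep the resources:
            # move the callbacks out of the stack so they never run.
            stack.pop_all()
        return workflow_id, variant_id
```

With archive_on_success left at its default, the sketch behaves like the script's --archive mode: the workflow is archived on success and on failure alike. Passing False keeps the resources only when every step succeeded.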

♻️ Duplicate comments (15)

docs/design/agent-workflows/projects/agent-creation-skills/README.md (4)

All four are 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value: add a json language specifier to the fenced code block.

  • 68-79: block starting POST /api/workflows/variants/
  • 86-109: block starting POST /api/workflows/revisions/commit
  • 113-123: block starting POST /services/agent/v0/invoke
  • 205-215: block starting POST /api/vault/v1/secrets/

docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md (6)

All six are 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value: add a bash language specifier to the fenced code block.

  • 66-72: block starting curl -sS -X POST "$HOST/api/workflows/variants/?project_id=$PROJECT"
  • 80-97: block starting curl -sS -X POST "$HOST/api/workflows/revisions/commit?project_id=$PROJECT"
  • 121-129: block starting curl -sS -X POST "$HOST/services/agent/v0/invoke?project_id=$PROJECT"
  • 145-151: block starting curl -sS -X POST "$HOST/api/vault/v1/secrets/?project_id=$PROJECT"
  • 159-165: block starting curl -sS -X POST "$HOST/api/tools/connections/?project_id=$PROJECT"
  • 182-189: block starting curl -sS -X POST "$HOST/api/workflows/revisions/commit?project_id=$PROJECT"

docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md (5)

All five are 📐 Maintainability & Code Quality | 🟡 Minor | 💤 Low value: add a json language specifier to the fenced code block.

  • 49-68: block starting // builtin: a harness-native tool (read, write, bash, ...)
  • 74-79: block starting { "name": "filesystem", "transport": "stdio",
  • 88-93: block starting { "name": "release-notes",
  • 154-158: block starting { "header": { "name": "OpenAI", "description": "..." },
  • 165-171: block starting { "header": { "name": "MyAzure" },

🧹 Nitpick comments (4)

docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md (1)

77-79: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Clarify how invoke_workflow resolves ambiguous workflow refs.

The current description says "resolve the revision if a ref is given" without specifying the resolution rule when multiple branches or commits exist. To make the design unambiguous, specify whether resolution uses:

  • the latest committed revision across all variants,
  • the default variant's latest revision, or
  • an explicit variant + revision selector.

This was previously identified as a needed clarification and remains unaddressed in the current draft.

docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md (1)

63-65: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add a language tag to the fenced code block.

The env-var snippet at line 63 triggers markdownlint MD040. Tag it as bash or shell so the linter stays clean.

-```
+```bash
 AGENTA_AGENT_RUNNER_URL=http://sandbox-agent:8765   # services -> sidecar over the compose net

docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md (2)

41-42: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Clarify "backend" to avoid conflating the abstraction with the sidecar container.

This sentence is technically correct in code terms (select_backend always returns SandboxAgentBackend), but readers may read "backend" as "the sidecar container" and wonder how that squares with the local-spawn path that has no sidecar. Rephrase to make clear that the abstraction is constant while the transport (HTTP to sidecar vs local CLI spawn) varies.

-The backend is always the sandbox-agent backend; only the transport (`RUNNER_URL`) and the
-sandbox axis vary.
+The backend abstraction is always `SandboxAgentBackend`; only the transport (`RUNNER_URL`) and
+the sandbox axis vary. When `RUNNER_URL` is set, `services` POSTs to the sidecar container.
+When unset, the backend spawns the runner CLI locally from `AGENTA_AGENT_RUNNER_DIR`.

10-16: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Optionally tag the ASCII diagram as text to silence markdownlint.

The topology block triggers MD040. Adding text satisfies the linter without affecting rendering.

-```
+```text
 client -> traefik -> api        (/api/...)        management: workflows, vault, tools

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1f1b944e-c2f4-4504-9c15-22056a6754ce

📥 Commits

Reviewing files that changed from the base of the PR and between 4bbf659 and 77be3c3.

📒 Files selected for processing (8)
  • docs/design/agent-workflows/projects/agent-creation-skills/README.md
  • docs/design/agent-workflows/projects/agent-creation-skills/build-notes.md
  • docs/design/agent-workflows/projects/agent-creation-skills/custom-tools-design.md
  • docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/SKILL.md
  • docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/create_agent.py
  • docs/design/agent-workflows/projects/agent-creation-skills/skills/create-agenta-agent/reference.md
  • docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/SKILL.md
  • docs/design/agent-workflows/projects/agent-creation-skills/skills/self-host-agenta/reference.md

Comment on lines +50 to +61
```
POST /api/workflows/
{
"workflow": {
"slug": "my-agent-<unique>",
"name": "My Agent",
"description": "Agent via API",
"flags": { "is_custom": false }
}
}
-> 200 { "workflow": { "id": "<workflow_id>", "slug": "...", ... } }
```


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add json language specifier to fenced code block.

-```
+```json
 POST /api/workflows/
 {
 "workflow": {

Improves syntax highlighting and readability. Same applies to the other JSON blocks below.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 50-50: Fenced code blocks should have a language specified (MD040, fenced-code-language)


- Inputs: `name`, `slug?`, `agent_config` (the `parameters.agent` object), `variant_slug?`,
`message?`.
- Behavior: `POST /api/workflows/` -> `POST /api/workflows/variants/` ->

Member Author

I mean, this would be in the tools in the API. It does not make sense to have calls between two API endpoints. We would be doing it with direct service calls, right?

optional `stream`.
- Behavior: resolve the revision if a ref is given, then `POST /services/agent/v0/invoke`.
Returns the assistant output and the `trace_id`.
- Guardrail: depth/recursion limit so a builder agent cannot fork-bomb itself; a per-run

Member Author

I wonder how you plan to do that?
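
One possible way to enforce such a guard, sketched under assumptions rather than taken from the design note: a per-run guard object counts nested invocations against a shared budget, and each call passes its current depth down to its child. RunGuard, GuardError, and the default limits are hypothetical names and values; how the depth would travel between services (for example inside the run context) is left open here.

```python
from dataclasses import dataclass


class GuardError(RuntimeError):
    """Raised when a nested invoke would exceed the depth or budget limit."""


@dataclass
class RunGuard:
    """Per-run limits shared by every invoke_workflow call in one top-level run."""

    max_depth: int = 2        # hypothetical default
    max_invocations: int = 5  # hypothetical default; stands in for a budget cap
    invocations: int = 0

    def enter(self, depth: int) -> int:
        """Check a nested invoke made at `depth` and return the child's depth."""
        if depth >= self.max_depth:
            raise GuardError(f"recursion limit reached at depth {depth}")
        if self.invocations >= self.max_invocations:
            raise GuardError("per-run invocation budget exhausted")
        self.invocations += 1
        return depth + 1
```

A builder agent at depth 0 would call guard.enter(0) before invoking a child, and the child would call guard.enter(1) before invoking a grandchild. Counting invocations is the simplest proxy for a budget; a token or cost counter could replace it without changing the shape.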

Returns the assistant output and the `trace_id`.
- Guardrail: depth/recursion limit so a builder agent cannot fork-bomb itself; a per-run
budget cap. Cheap-model default for self-tests.
- Permission: gateable; read-ish but it spends model budget.

Member Author

Here it would also make sense to be able to spin off a subagent of yourself, although that sometimes happens in the harness itself. Nothing actionable here; we'll just have to keep it in mind.

test what it created.

- Inputs: `workflow_ref` (id or slug) or inline `agent_config`, plus `messages` / `inputs`,
optional `stream`.

Member Author

What happens if we invoke with streaming? Will we be able to stream the results of the sub-agent from the agent's response, or is this just irrelevant?

Run a workflow (by id/slug, or an inline config) and get the output. Lets a builder agent
test what it created.

- Inputs: `workflow_ref` (id or slug) or inline `agent_config`, plus `messages` / `inputs`,

Member Author

It's very nice that we'll be able to pass an inline agent config. This means you can spin off a version of yourself, so to speak, and test something: for example, trying out a skill, or changing a skill and seeing how that affects you. That is probably worth adding to what the skills cover.

the running agent's own workflow so it cannot rewrite arbitrary agents.

- Inputs: `agent_config` (the new config), `message`.
- Behavior: resolve the running agent's own `workflow_variant_id` from the run context, then

Member Author

This means the agent needs access to some run context that is propagated somehow, no? How does it get access to this run context? Does it have a built-in tool to get the run context, something implemented on the sidecar or in the SDK? In this case app.py would probably provide that context to the sidecar, but it would not be injected as-is into agent.md or similar. It would be exposed as a tool, and when that tool is called, the question is where the boundary lies. Do you go back to app.py to get that information, or is it part of the sidecar, which returns it directly? I don't know if that's the case.

What is this context when running from the SDK, that is, with a local backend? We would have another route where, instead of running this in the sidecar, you pull the configuration and use the SDK to run it. Here the SDK does the job of app.py: it puts whatever is available (the variant, a trace ID, and so on) into context, and then sets up the tools the same way the sidecar does, through an MCP or whatever mechanism makes sense. I guess that is also where it would add a tool to expose the context. In that case, what would the type of the tool be? We haven't thought much about the local backend: how would tool injection work there? With Claude Code, if it's an endpoint like a gateway tool, that's not a problem: we need an MCP, and that MCP makes the call. If it's a code tool, we can also put it in the MCP; we don't care. In the case of Pi, we would put it in the folder, like all these tools.

I have big questions about how to provide context. Obviously you could have an endpoint, but that would be a little bit stupid, so if you have better ideas, please let me know.

@mmabrouk

Member Author

There seems to be a misunderstanding. These skills are supposed to be the skills within Agenta and not the skills used to develop Agenta, which are in the claude folder or whatever.

These skills would be the skills within custom workflow, as in constant custom workflows.

@mmabrouk

Member Author

🤖 The AI agent says: Added a SUPERSEDED banner to custom-tools-design.md. The logic-wrapping tools here (create_workflow / update_own_workflow / add_trace_annotation / …) were the rejected first version; the current design is thin wrappers over existing endpoints (see projects/direct-call-tools/). Flagging so this note stops misleading reviewers (and Codex).

mmabrouk added a commit that referenced this pull request Jun 27, 2026
…l (Codex)

interfaces.md: the concrete interface delta (call descriptor, runContext, type:platform config,
platform-op catalog entry, reference config, dispatch body-assembly) with the file:line seams, so
review + implementation share one reference. plan.md: the endpoint-wrapper set to expose (from
#4863, reframed as existing endpoints; plain + self/bind variants incl. the annotate pair) +
catalog implementation per the Codex catalog review (typed catalog in SDK platform/op_catalog.py,
reuse PlatformConnection, stable tools.agenta.* namespace, per-op permission defaults; execution
stays direct, find_capabilities migrates to direct).
mmabrouk added 2 commits June 27, 2026 20:20
Add a draft workspace for skills that let a person or an agent harness create
agents on Agenta over the API, plus a design note for the gateway tools that
would let a harness build agents.

- create-agenta-agent skill (SKILL.md + reference.md + runnable create_agent.py):
  discover tools, the agent config schema, the create/variant/commit/invoke
  loop, harness selection, secrets. Every call verified live on the dev stack.
- self-host-agenta skill: run the stack with/without the sandbox-agent sidecar,
  Claude via API key vs subscription (OAuth) login, Daytona sandbox, trust model.
- custom-tools-design.md: search_tools, create_workflow, invoke_workflow,
  update_own_workflow, add_trace_annotation + companions; design only.
- README.md (verified API reference) + build-notes.md (decisions).

Skills live under the project workspace as drafts; they move to .agents/skills/
when approved.

Claude-Session: https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

The logic-wrapping tools framing (create_workflow/update_own_workflow/add_trace_annotation/...) was
the rejected first version. Platform tools are a thin wrapper over existing endpoints; see
projects/direct-call-tools/. Banner points readers to the current design.
@mmabrouk mmabrouk force-pushed the docs/agent-creation-skills branch from 85387a4 to d40ea9c Compare June 27, 2026 18:22
@mmabrouk mmabrouk added the needs-review Agent updated; awaiting Mahmoud's review label Jun 27, 2026
@mmabrouk

Member Author

🤖 The AI agent says:

Reconciling the custom-tools note here with what just shipped. Your line-67 point (no calls
between two API endpoints; the agent should call Agenta endpoints directly) is now implemented
as platform tools in the direct-call-tools project (Phase 3b, PR #4893).

A platform tool is a thin wrapper over an existing Agenta endpoint, declared as
{type:"platform", op:"<catalog-op>"}. The "how" lives in a typed platform-op catalog
(sdks/python/agenta/sdk/agents/platform/op_catalog.py), not in tool logic: the catalog maps an
op to {description, method, path, input_schema, bind, default permission/approval}, and the
resolver emits a direct call to that endpoint — no /tools/call hop, no endpoint-to-endpoint
indirection.

So platform tools = the catalog. First three ops: find_capabilities (discovery, now
agent-usable end to end), query_workflows (read), commit_revision (self-update; binds the
agent's own variant, gated on approval). More ops are a data add to the catalog. There are no
new endpoints and no logic-wrapping tools — we expose the endpoints we have, and a skill teaches
multi-step composition. Full write-up: documentation/tools.md → "Platform tools".
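
To make the catalog idea concrete, here is a minimal sketch of what one entry and a resolver might look like. It is not the code in op_catalog.py: PlatformOp, CATALOG, resolve, the run-context key, and the body layout are hypothetical. Only the field list (description, method, path, input_schema, bind, approval default), the {type:"platform", op:...} declaration, the commit_revision op name, and the commit endpoint path are taken from this thread.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PlatformOp:
    """One catalog entry: everything needed to call an existing endpoint."""

    description: str
    method: str
    path: str
    input_schema: dict
    bind: dict = field(default_factory=dict)  # body field -> run-context key
    requires_approval: bool = False


CATALOG = {
    "commit_revision": PlatformOp(
        description="Commit a new revision to the running agent's own variant.",
        method="POST",
        path="/api/workflows/revisions/commit",
        # Hypothetical schema; the real input schema is not shown in this thread.
        input_schema={"type": "object", "required": ["agent_config"]},
        bind={"workflow_variant_id": "workflow_variant_id"},
        requires_approval=True,
    ),
}


def resolve(tool: dict, args: dict, run_context: dict) -> dict:
    """Turn a {type: "platform", op: ...} declaration into a direct call descriptor."""
    if tool.get("type") != "platform":
        raise ValueError("not a platform tool")
    op = CATALOG[tool["op"]]
    missing = [k for k in op.input_schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")
    # Bound fields come from the run context and override caller input,
    # so a self-scoped op cannot be pointed at another variant.
    bound = {name: run_context[key] for name, key in op.bind.items()}
    return {
        "method": op.method,
        "path": op.path,
        "body": {**args, **bound},
        "requires_approval": op.requires_approval,
    }
```

In this shape, adding an op is a data change: a new CATALOG entry, with no new tool logic. Letting bound values win over caller arguments is one way to express the self-scoped behavior described for commit_revision.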


Labels

needs-review Agent updated; awaiting Mahmoud's review
