docs(agent): platform skills design for agent self-building #4918
mmabrouk wants to merge 3 commits
🤖 The AI agent says:
What this covers. The skills that teach a new agent the build flow: which skills we ship, how they are named, and the embed shape. Tools do the actions; skills teach the order. The doc carries placeholder bodies; the real prose lands later.
Where to look. Section 4 (the skill set), section 5 (the slug, embed, and tool-declaration contracts), and section 6 (embed not force).
Feedback needed. Please confirm:
Note: the "self-modify versus separate app" question is now marked decided in the doc (Reading A, the agent becomes the app), so it is no longer open.

> This skill has a different audience from the build skills. See section 3.
>
> ### `self-host-agenta` (draft, out of scope)

I think many of these skills are not in the big agents branch. They are just local things that were applied in merged branches, so they were only drafts for exploration. I wouldn't take them too seriously. At most, they are starting points, but you can reorganize things.
My suggestion is actually to use that, rename things, etc., and put them in the right format, and obviously reorganize them. Maybe close the PR that has them and create a new PR after we have done this with a new branch, etc. Obviously, orchestration is your thing.
🤖 The AI agent says: Agreed. Section 2 now states the drafts were exploration and raw material; the set is derived from the build flow in 2.1 and 2.2, not restated from the drafts.
🤖 The AI agent says: Done for content. discover-and-wire-tools is promoted with edits (3.3); create-agenta-agent stays an external skill with a naming fix; all reorganized. The new-branch step is orchestration and will be handled separately.
🧹 Nitpick comments (4)
docs/design/agent-workflows/projects/agent-skills/design.md (4)

66-66: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Repetitive sentence starts reduce readability.
Three successive sentences begin with "It" (lines 66-68). Tighten by combining or rewording:
"Well-written and mostly current, it speaks the in-agent voice..."

76-76: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Repetitive sentence starts reduce readability.
Three successive sentences begin with "It" (lines 76-78). Tighten by rewording, e.g.:
"Verified against a live stack, it carries heavy naming debt..."

446-449: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Indicate a default recommendation for the skill set count.
Open question 2 asks whether to fold `set-up-triggers` into `build-your-first-app`. The doc proposes four skills but immediately questions whether three is better. State the default recommendation explicitly (e.g., "Default: keep four; fold only if we want to minimize surface area for the first ship") so Mahmoud knows what happens if he has no strong opinion.

397-397: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Cross-reference the "design-interfaces rule" explicitly.
The doc references a "design-interfaces rule" as a known principle. Add a brief inline definition or link to where this rule is documented, since readers may not recognize the reference.

📒 Files selected for processing (2)
docs/design/agent-workflows/projects/agent-skills/design.md
docs/design/agent-workflows/projects/agent-skills/status.md

> Which skills carry naming debt:

Again, obviously let's make them work with a new format, but these are just high-level skills. We need to think about exactly what we mean by these skills and which features we want them to enable. Look at what the requirements are, then make sure we have exactly the skills we need for that, scoped and organized correctly. These are helpful examples to start with, but nothing more. Some of them have some logic because they were tested, so let's not lose that information; leverage it where we can so that we don't spend more tokens on it, but no more than that.
🤖 The AI agent says: Section 2.1 lists the build jobs, 2.2 maps each to whether it needs a skill and why, and 2.3 keeps the tested discover-and-wire-tools loop. The set now falls out of requirements.

> Three things here are contracts: the slug, the embed shape, and how a skill declares its tools.
> Run the role analysis on each.
>
> ### Slug versus name

I am really getting lost here. The structure of this design is not how we usually do this, so please use the plan-feature skill and create something that I can review. This is unreviewable because the information is not structured correctly. One part is about which skills and which tools we need to allow self-building, and I would like to review that. Another part is about how we implement this, which is a second problem, and I would like to review that as well, but it is not structured correctly. What does "slug versus name" mean? Why do I care about this? What's the context here? Use the pyramid: start high level and then go low level.
🤖 The AI agent says: The doc is rebuilt around this: Part A (which skills) versus Part B (how it is delivered), summary-first. The tools picture is one table in 2.1. Slug now leads with why it matters in 5.1 (it is the immutable routing key the catalog derives ids from).

> ```jsonc
> {
>   "@ag.embed": {
> ```

I think this is also related to something another subagent is looking at, namely how we specify these defaults. You need to read up and talk with the orchestrator to understand which decisions were reached there, because I'm not sure this is aligned. Embedding obviously makes sense, but as for where to put the default, I think the decisions there have changed over time.
🤖 The AI agent says: Section 4 adopts the inject-not-commit model from default-agent-config and flags one open coordination point (8.3).

> because the default config guarantees the tools and the skills arrive together
> (`default-agent-config`). The skill assumes co-presence.
> - **Structured declaration (deferred).** Add a field to `SkillTemplate`, say
>   `requires_tools: [platform-op names]`, so the system can verify or auto-include the tools a

No, we're not going to change the skill definitions. The skill definition is a general standard, so there are no required tools. The skill just mentions the tools, and if the harness doesn't find them, nobody can use them.
🤖 The AI agent says: Section 6 rejects requires_tools for exactly this reasoning and keeps SkillTemplate as the general harness standard.

> This restates the decision `default-agent-config` owns, because the skill content depends on it.
>
> - Skills reach the agent through `@ag.embed` in the default config. Present by default. Removable.
> - Stop force-injecting `agenta-getting-started` through the `pi_agenta` harness. Set

🤖 The AI agent says: Kept as a locked decision in sections 1 and 7: AGENTA_FORCED_SKILLS goes empty, the force mechanism stays for a future real-functionality skill.

> This restates the decision `default-agent-config` owns, because the skill content depends on it.
>
> - Skills reach the agent through `@ag.embed` in the default config. Present by default. Removable.

No, this is not correct. Please coordinate with the other sub-agents to understand how we are implementing this, so it's going to be injected probably by the playground.
🤖 The AI agent says: Section 4 rewrites the delivery model to inject-not-commit; the old embed framing is removed and called out as no longer correct.

> ## 8. Open questions for Mahmoud
>
> 1. **Self-modify or separate app? DECIDED (Reading A).** The agent becomes the app: it edits

OK, we have decided this 100 times, so it's not an open question. It's self-modify.
🤖 The AI agent says: Moved to locked decisions in section 1 and removed from open questions.

> `build-your-first-app` over the `create_workflow` tools, not the embedded curl skill; that is
> out of scope for this round. (Mirrors `agent-builder-capabilities` decision 1.)
>
> 2. **The skill set.** I propose four embedded skills: `agenta-getting-started`,

It looks fine, as it just mentioned. I don't understand this. This design doc is bad and not clear enough for me to give feedback.
🤖 The AI agent says: Full rewrite for clarity; section 2 motivates the set and section 8 question 1 states the default (keep four).
mmabrouk left a comment
The quality of the design docs is bad. I have provided feedback. Please rewrite, taking all of the similar feedback into consideration.
🤖 The AI agent says: Rewrote the design from scratch per the review. The first version mixed the product question with the implementation question and read as unreviewable. The new doc follows plan-feature, leads with the answer, and splits into two parts you can judge on their own: Part A is which skills the build flow needs and why (it derives the set from the flow, not from the old drafts); Part B is how a skill reaches the agent, how it is named, and the contract it rides.
The big content change is the delivery model: build skills are injected at run time by the playground build kit and never committed, replacing the old
Four open questions remain, all in section 8:

> | 1 | Find the right action tools | `find_capabilities` | exists |
> | 2 | See what already exists in the project | `query_workflows` | exists |
> | 3 | Connect an integration the tools need | frontend round-trip (agent requests, cannot create) | frontend-owned |
> | 3b | Check whether a connection is ready | `list_connections` | missing |

This one is a bit problematic, or at least I'm not sure how it works. It is the frontend that selects which connection to use for the tool. What kind of information does the frontend get, and which parts are secret and which are not? What is this connection and how would it work? I really want more detail on the flow there: how we would see the agent do it, and how we make sure it does not get information it does not need.

> - The build skills ride the playground build kit, alongside the platform tools and the build
>   permissions. The kit is a backend-defined set with one source of truth.
>   See [`../default-agent-config/design.md`](../default-agent-config/design.md).
> - At run time, when the build kit is on, the backend merges the kit's skills into the effective

We're still working on the API, but this has changed: it will be the frontend that injects this into the configuration when it calls invoke or messages. The backend will only say which skills and tools are part of the playground build kit. There won't be any changes to the server, so the server won't be the one that injects them.
From our side, we need to build these skills and tools and make sure they exist where they need to be. The skills are these platform skills; for tools, we are adding these platform tools, sometimes even new backends for them. That is the first responsibility in this PR.
The second responsibility is to add them in exactly the format the other PR settles on, which means staying in deep synchronization with that PR to make sure we add them in the right place.

> ## 8. Open questions
>
> 1. **The skill set count.** The recommendation is four: `agenta-getting-started`,
>    fold `set-up-triggers` into `build-your-first-app` to ship a smaller surface first. Default:
>    keep four. The orchestrator stays a short map, and each focused skill owns one hard sub-flow.
>
> 2. **Baseline behavior: skill or preamble.** `agenta-getting-started` overlaps the always-on
>    into the preamble, and drop the separate behavior skill, once we confirm nothing else depends
>    on the slug.
>
> 3. **Does the kit inject the full build set.** `default-agent-config` describes the kit as
>    orchestrator can reference the focused ones (section 4). Confirm the kit's `skills` group
>    carries the set.
>
> 4. **Single-sourcing the getting-started body.** The body lives twice today: the SDK constant

single sourcing in the right place
mmabrouk left a comment
Thank you. I've added some comments here. Please address them.
The things that are missing for me are:
- Where would the skills be added? Where exactly do we add these new platform skills?
- If I understand correctly, these skills are workflows whose workflow type is skill, and they cannot be changed. Let's make sure we write that down, including the file where we add them as constants.
- For the tools, where are we going to add each of the platform tools?
- Let's coordinate with the other agent about how this will be added to the frontend. Also specify here that, as soon as the design docs are agreed on, we should wait and coordinate with the orchestrator about it. As soon as that's done, then we can frighten with that big dig into the interface.
🤖 The AI agent says: Aligned this doc to the approved #4917 build-kit overlay model. What changed:
Please review the delivery model in section 4 and confirm the open question on whether the overlay should carry the full build set (the orchestrator needs the focused skills co-present).

> The current model is a build-kit overlay. The build kit is an agent-template overlay, a partial
> `parameters.agent` the backend serves read-only on the inspect response at
> `additional_context.playground_build_kit.agent_template_overlay`. The platform tools and the build

The overlay now rides additional_context.playground_build_kit.agent_template_overlay on the inspect response, the approved location from #4917. This replaces the earlier run-time inject framing.
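
A minimal TypeScript sketch of how a frontend might apply that overlay for a playground run, assuming the merge rules quoted in the PR description (deep-merge object fields, identity-merge list fields). The names `applyOverlay`, `effectiveAgentForRun`, and `InspectResponse` are hypothetical, and treating list-item identity as equality of the serialized JSON is an assumption; only the `additional_context.playground_build_kit.agent_template_overlay` path comes from this thread.

```typescript
// Sketch only: helper names and the list-identity rule are assumptions,
// not the approved #4917 implementation.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(value: Json): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Assumed identity for list items: their serialized JSON (key order matters).
const identityOf = (item: Json): string => JSON.stringify(item);

// Returns a new value; neither `base` nor `overlay` is mutated.
function applyOverlay(base: Json, overlay: Json): Json {
  if (Array.isArray(base) && Array.isArray(overlay)) {
    // Identity-merge: keep base items, append overlay items not already present.
    const seen = new Set(base.map(identityOf));
    const extra = overlay.filter((item) => {
      const id = identityOf(item);
      if (seen.has(id)) return false;
      seen.add(id);
      return true;
    });
    return [...base, ...extra];
  }
  if (isObject(base) && isObject(overlay)) {
    // Deep-merge: recurse on shared keys, copy keys only the overlay has.
    const merged: JsonObject = { ...base };
    for (const [key, value] of Object.entries(overlay)) {
      const current = base[key];
      merged[key] = current === undefined ? value : applyOverlay(current, value);
    }
    return merged;
  }
  // Scalars and mismatched types: the overlay value wins (an assumption).
  return overlay;
}

// Hypothetical typing of the part of the inspect response used here.
interface InspectResponse {
  additional_context?: {
    playground_build_kit?: { agent_template_overlay?: Json };
  };
}

// Effective `parameters.agent` for a playground run only.
function effectiveAgentForRun(storedAgent: Json, inspect: InspectResponse): Json {
  const overlay =
    inspect.additional_context?.playground_build_kit?.agent_template_overlay;
  return overlay === undefined ? storedAgent : applyOverlay(storedAgent, overlay);
}
```

On commit, the frontend would keep sending the stored `parameters.agent` as-is. Because `applyOverlay` returns a new value and never mutates its inputs, the overlay stays out of the stored revision.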

> permissions. The overlay is a backend-defined set with one source of truth.
>   See [`../default-agent-config/design.md`](../default-agent-config/design.md).
> - Each build skill is an ordinary `@ag.embed` reference in that list, of the shape
>   `{ "@ag.embed": { "@ag.references": { "workflow": { "slug": "__ag__..." } } } }`. The reference

Each build skill is now an @ag.embed reference of this exact shape, identical to how #4920 embeds the request_connection client tool (only the slug differs). The reference carries the name and description, so the overlay adds no parallel fields.
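
To make the shape concrete, here is a small TypeScript sketch that builds one such reference from a skill name. The helper names `slugForSkill` and `embedReference` are hypothetical. The slug derivation follows the rule stated in the PR description (`__ag__` plus the name with hyphens turned to underscores); the live getting-started slug is said to stay as-is, so it may not follow this helper.

```typescript
// Sketch only: helper names are hypothetical; the object shape is the one
// quoted from the design doc in this thread.
interface EmbedReference {
  "@ag.embed": {
    "@ag.references": {
      workflow: { slug: string };
    };
  };
}

// Assumed reading of the naming rule: "__ag__" + name, hyphens -> underscores.
// The live getting-started slug is kept as-is and may be an exception.
function slugForSkill(name: string): string {
  return "__ag__" + name.replace(/-/g, "_");
}

// Wraps a slug in the embed reference shape. No name or description fields
// are added, since the reference is said to carry those itself.
function embedReference(slug: string): EmbedReference {
  return { "@ag.embed": { "@ag.references": { workflow: { slug } } } };
}

// Example: the overlay's `skills` entries for three of the build skills.
const buildSkillRefs: EmbedReference[] = [
  "build-your-first-app",
  "discover-and-wire-tools",
  "set-up-triggers",
].map((name) => embedReference(slugForSkill(name)));
```

Keeping the slug derivation in one function would give a single place to change if the naming rule moves again.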
What this designs
Which skills a new Agenta agent carries so a user can chat with it and build their first app, how those skills reach the agent, how they are named, and the contract they ride. Tools do the actions. Skills teach the agent which tools to use, in what order, where to stop for the human, and which footguns to avoid.
This project owns the skill content, the names, and the slugs. It does not own the build-kit overlay (`default-agent-config` owns that), the build-flow tools the skills name (`agent-builder-capabilities`), or the connection round-trip (`agent-fe-roundtrip`).

The skill set
Four skills, derived from the build flow rather than restated from the old drafts:
- `agenta-getting-started`: baseline behavior, how an Agenta agent works. Exists today as a placeholder. Keep and single-source it.
- `build-your-first-app`: the orchestrator. Names the steps, the order, and the stop points. New.
- `discover-and-wire-tools`: find action tools and get their integrations connected. A tested draft, promoted and adapted.
- `set-up-triggers`: set up a cron job or an event trigger, and test it. New.

The doc ships placeholder bodies that capture the build flow. The real prose lands later.

Delivery: the build-kit overlay
The build skills reach the agent through the build-kit overlay, an agent-template overlay the backend serves read-only on the inspect response at `additional_context.playground_build_kit.agent_template_overlay`. Each build skill is an ordinary `@ag.embed` reference in the overlay's `skills` list. The frontend applies the overlay for a playground run (deep-merge object fields, identity-merge list fields) and excludes it on commit. The agent service stays dumb: no run flag, no service-side merge. The build skills are absent from the stored revision by construction, so there is nothing to strip. This replaces the first draft's "committed `@ag.embed`" framing and the second pass's "backend injects at run time" framing.

Other decisions
- Skills reach the agent by `@ag.embed`, not by force-injection.
- The slug is `__ag__` plus the name with hyphens turned to underscores. The live getting-started slug stays as-is, because the catalog derives every id from it.
- No `requires_tools` field. `SkillTemplate` stays the general harness standard. If a tool a skill names is absent, the agent simply cannot call it, and nothing in the skill breaks.

The drift this flags
The getting-started body lives twice: the SDK constant (served at run time) and a `services/agent/skills/.../SKILL.md` file the runner never loads. They have drifted. The doc proposes single-sourcing the body.

Rewritten per review
This is a full rewrite of the first version, which read as unreviewable because it mixed the product question (which skills) with the implementation question (how a skill reaches the agent). The rewrite follows plan-feature: it splits into Part A (which skills, and why) and Part B (how a skill is delivered, named, and contracted), leads with the answer, and derives the set from the build flow.

Open questions
Four, in section 8: the skill count (keep four or fold triggers into the orchestrator), baseline behavior as a skill or as the AGENTS.md preamble, whether the overlay carries the full set, and how to single-source the getting-started body.
Scope / risk
Doc only, no code. The skills name tools that do not exist yet. They go live when the builder project ships them.
Related PRs
Part of the "agent builds an app" initiative. Read the map first: #4921.
https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc