docs(agent): agent builder tools design (triggers, cron, find-triggers)#4919

Draft
mmabrouk wants to merge 3 commits into big-agents from docs/agent-builder-tools
@mmabrouk mmabrouk commented Jun 28, 2026

What this designs

A new agent on Agenta should turn itself into a real app by chatting with the user: discover the tools it needs, connect the integrations, edit its own instructions, set a trigger or a cron job, and commit. This doc designs the trigger-and-cron half of that flow: the agent-facing builder tools, plus the one event-discovery endpoint the flow is missing.

This design is aligned to the approved #4917 build-kit overlay model: the builder tools join PLATFORM_OPS, ride the overlay that the backend serves read-only on the inspect response, are applied by the frontend on a playground run, and are excluded on commit. The agent service stays dumb (no run flag, no service-side merge). Every tool carries a design-interfaces role analysis.

The finding

The trigger and cron engine already ships as a full backend: event subscriptions, cron schedules, delivery logs, a Composio event catalog, a worker, and a per-minute tick. We do not build a scheduler. The gap is narrow. There is no agent-facing tool layer over that engine, and there is no way to search the event catalog. This design fills both.

What it adds

Builder tools over endpoints that already ship, plus one new backend piece.

  • create_schedule and create_subscription: mutating, self-targeted, approval-gated.
  • test_subscription: a probe that opens a real provider watch and blocks; approval-gated.
  • remove_schedule and remove_subscription, plus a pause and resume pair for each (over the existing delete and start/stop endpoints): the agent can take down a schedule or subscription it set up. Mutating, approval-gated. New in this pass.
  • list_schedules, list_subscriptions, list_deliveries, list_connections: reads, default allow.
  • find_triggers: the one new backend piece. A keyword search over the event catalog at POST /api/triggers/discover, shaped like find_capabilities. It joins the catalog match with shared connection state, so the agent learns the event and whether it can subscribe to it in one call.
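
As a rough illustration of the join that find_triggers performs, the sketch below keyword-matches a small in-memory catalog and attaches connection state to each hit. The catalog entries, the field names (`provider`, `event`, `connected`), and the function signature are all assumptions made for this example; the real request and response shape of POST /api/triggers/discover is whatever the design doc specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEvent:
    provider: str      # hypothetical: integration that emits the event
    event: str         # hypothetical: catalog event key
    description: str


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    CatalogEvent("github", "issue_created", "A new issue is opened in a repository"),
    CatalogEvent("github", "pull_request_opened", "A pull request is opened"),
    CatalogEvent("slack", "message_posted", "A message is posted in a channel"),
]


def find_triggers(query: str, connected_providers: set[str]) -> list[dict]:
    """Keyword-match the catalog, then join each hit with connection state."""
    words = query.lower().split()
    hits = []
    for entry in CATALOG:
        haystack = f"{entry.provider} {entry.event} {entry.description}".lower()
        # A hit must contain every query word somewhere in the entry.
        if all(word in haystack for word in words):
            hits.append({
                "provider": entry.provider,
                "event": entry.event,
                # True means a connection already exists for this provider.
                "connected": entry.provider in connected_providers,
            })
    return hits


results = find_triggers("github issue", connected_providers={"slack"})
```

Here the single call tells the caller both that a matching event exists and that no GitHub connection is available yet, which is the point of joining the two sources.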

Two role rules run through the whole set. The destination is routing, and it is bound server-side from run context the way commit_revision binds the variant id, so the model can only target the running agent itself. The connection is a credential reference (an id, never a secret). The agent holds the reference and asks the frontend to make the connection through request_connection, a non-runnable reference tool the overlay embeds via @ag.embed (owned by #4920), not one of the platform ops in this set.
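
The two role rules can be sketched as a small payload builder. This is only an illustration under stated assumptions: `RunContext`, `build_subscription_request`, and the field names `destination`, `connection_id`, and `event` are hypothetical and are not taken from the Agenta codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunContext:
    # Hypothetical run context; the real fields are defined by the platform.
    agent_id: str


def build_subscription_request(model_args: dict, ctx: RunContext) -> dict:
    """Build a create_subscription payload with routing bound server-side."""
    # Rule 1: discard any destination supplied by the model.
    payload = {k: v for k, v in model_args.items() if k != "destination"}
    # Routing always comes from the run context, so the target is the running agent.
    payload["destination"] = {"agent_id": ctx.agent_id}
    # Rule 2: the connection is passed as a reference id, never as a secret.
    if not isinstance(payload.get("connection_id"), str):
        raise ValueError("connection_id must be a connection reference id")
    return payload


request = build_subscription_request(
    {
        "event": "issue_created",
        "connection_id": "conn_123",
        "destination": {"agent_id": "some_other_agent"},  # ignored
    },
    RunContext(agent_id="agent_self"),
)
```

Because the destination is overwritten rather than validated, a model that tries to target another agent simply has no effect on routing.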

The part to read closely

Section 5 walks the agent-driven build flow end to end on a concrete ask: run yourself on every new GitHub issue in acme/app, and triage it. The key finding is the dry-vs-live test split. A live test (test_subscription) needs the connection, because it long-polls the provider for a real event. A dry test does not: the agent runs itself against the catalog's sample payload with no connection. That split sets the build order. The recommendation is sample-first: build and dry-test the mapping offline, show the user the agent working, then ask for the connection and go live with one live test as the prove-it follow-up.
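
The sample-first ordering can be expressed as a tiny decision function. This is a minimal sketch, assuming the flow reduces to two facts (whether the dry test has passed and whether a connection exists); the `Step` names and `next_step` are hypothetical and are not part of the tool set.

```python
from enum import Enum


class Step(Enum):
    DRY_TEST = "dry_test"                      # run against the catalog sample payload
    REQUEST_CONNECTION = "request_connection"  # ask the user to connect the integration
    LIVE_TEST = "live_test"                    # test_subscription against the provider


def next_step(dry_test_passed: bool, has_connection: bool) -> Step:
    """Sample-first ordering: dry test, then connection, then one live test."""
    if not dry_test_passed:
        # The dry test needs no connection, so it always comes first.
        return Step.DRY_TEST
    if not has_connection:
        # A live test long-polls the provider, which requires the connection.
        return Step.REQUEST_CONNECTION
    return Step.LIVE_TEST
```

The function never returns `LIVE_TEST` without a connection, which mirrors the constraint that drives the build order.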

Decided this pass

The finer questions are settled and recorded in section 8: dry test is same-session for v1, test_subscription permission is ask, and the public invoke wrapper is deferred. One open question remains (section 9): test order in the build skill (lean sample-first).

Scope / risk

Doc only, no code. The builder tools reach a run through the build-kit overlay the frontend applies and never enter stored config, so a published production agent never carries them (section 6). This design depends on the connection round-trip (owned by the frontend round-trip PR) and the skills that teach the flow (owned by the platform skills PR).

Related PRs

Part of the "agent builds an app" initiative. Read the map first: #4921.

https://claude.ai/code/session_01GYo3UEfvsZpncagqb28Mbc

@vercel

vercel Bot commented Jun 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agenta-documentation | Ready | Preview, Comment | Jun 28, 2026 9:47pm |

@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c34bfd34-c8b7-45b6-92da-684995289545

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@mmabrouk

🤖 The AI agent says:

What this covers. The tools and skills the build flow still needs. The trigger and cron backend already ships in full, so what is missing is the thin agent-facing tools over it plus event discovery. This PR also covers the agent-driven trigger-test UX.

Where to look. Section 3 (the missing tools and find_triggers), and especially section 4 (the agent-driven trigger flow walked end to end, including the connection branch and the test split).

Feedback needed. Please confirm:

  1. The tool set: create_schedule, create_subscription (self-targeted, approval-gated), list_schedules, list_subscriptions, list_deliveries, list_connections, test_subscription, plus the new find_triggers.
  2. The keyword find_triggers endpoint (POST /api/triggers/discover), built now rather than a hardcoded enumeration skill.
  3. The agent-driven trigger-test UX: sample-first as the default versus capture-then-promote.

And please weigh in on the five smaller open questions (README section 7):

  1. Test order: sample-first (Option B) as the default, with a live test as the prove-it follow-up.
  2. Same-session versus new-session dry test (lean same-session for v1).
  3. test_subscription permission: ask or allow (lean ask).
  4. find_triggers scope: keyword endpoint now versus a skill-only stand-in (lean endpoint).
  5. The test gap: a public /api invoke wrapper this round, or same-session dry testing plus the playground for v1.

@mmabrouk

🤖 The AI agent says:

The doc was rewritten for quality. The first version was rejected, so this is a clean rewrite. The structure is tighter, every tool now carries a design-interfaces role analysis (input, config, routing, credentials, metadata, policy), and the whole doc aligns to inject-not-commit: the builder tools ride the injected build kit and never enter stored config.

What to review. The tool set (section 4), the new find_triggers endpoint at POST /api/triggers/discover (4.5), and the agent-driven build flow walked end to end (section 5), especially the dry-vs-live test split in 5.1. A live test needs the connection; a dry test runs against the catalog sample with no connection. That finding sets the build order.

Four open questions, all non-blocking (section 8), with recommendations:

  1. Test order in the skill. Sample-first as the default, with a live test as the prove-it follow-up. Recommend sample-first.
  2. Same-session or new-session dry test. Same-session for v1 avoids standing up a new invoke surface. Recommend same-session.
  3. test_subscription permission. ask (it opens a real provider watch and blocks) versus allow (it is non-destructive). Recommend ask.
  4. The test gap. A public /api invoke wrapper now, versus same-session dry testing plus the playground for v1. Recommend deferring the wrapper.

|---|---|---|---|
| `find_triggers` | `POST /api/triggers/discover` (new) | read | allow |
| `create_schedule` | `POST /api/triggers/schedules/` | mutate | ask, approval |
| `create_subscription` | `POST /api/triggers/subscriptions/` | mutate | ask, approval |

It seems to me that we also need the ability to remove subscriptions and triggers, at least for the current application, because you might add them and then change your mind.

follow-up (5.1). A skill-authoring choice, low risk to change later. Recommend sample-first.
2. **Same-session or new-session dry test.** Same-session for v1 avoids a new invoke surface
(5.3). Recommend same-session.
3. **`test_subscription` permission.** `ask` (it opens a real provider watch and blocks) or `allow`

k


1. **Test order in the skill.** Sample-first as the default, with a live test as the prove-it
follow-up (5.1). A skill-authoring choice, low risk to change later. Recommend sample-first.
2. **Same-session or new-session dry test.** Same-session for v1 avoids a new invoke surface

same session

(5.3). Recommend same-session.
3. **`test_subscription` permission.** `ask` (it opens a real provider watch and blocks) or `allow`
(it is non-destructive)? Recommend `ask`.
4. **The test gap.** A public `/api` invoke wrapper now, or same-session dry testing plus the

I don't understand this point.

@mmabrouk mmabrouk left a comment

lgtm

@mmabrouk

🤖 The AI agent says: Aligned this doc to the approved #4917 build-kit overlay model and added the undo tools.

What changed:

  • Section 6 (how tools reach the agent) and the section 3 decisions are now the overlay model: the builder tools join PLATFORM_OPS and ride the build-kit overlay at additional_context.playground_build_kit.agent_template_overlay as { "type": "platform", "op": ... } entries; the frontend applies the overlay on a playground run and excludes it on commit. Removed the "backend injects" and run-flag language.
  • New tools (please review): remove_schedule and remove_subscription, plus a pause and resume pair for each (over the existing DELETE and /start /stop endpoints), so the agent can undo a schedule or subscription it set up. They are mutating and approval-gated. Added to the tool-set table (section 4) with a short contract in the new section 4.6.
  • Folded the decided questions into a new "Decided" section (section 8): dry test is same-session for v1, test_subscription permission is ask, defer the public invoke wrapper. Open questions trimmed to one (section 9): test order in the skill.
  • Clarified that request_connection is a non-runnable reference tool the overlay embeds via @ag.embed (per #4920), not a platform op in this set.

Please confirm the undo tool set (remove + pause/resume for both schedules and subscriptions) and the approval gating on them.

| `create_schedule` | `POST /api/triggers/schedules/` | mutate | ask, approval |
| `create_subscription` | `POST /api/triggers/subscriptions/` | mutate | ask, approval |
| `test_subscription` | `POST /api/triggers/subscriptions/test` | probe | ask, approval |
| `remove_schedule` | `DELETE /api/triggers/schedules/{id}` | mutate | ask, approval |

New undo tools: remove_schedule / remove_subscription delete by id, and the pause/resume pairs below map onto the existing /start and /stop endpoints. All are mutating and approval-gated, so the agent can take down what it set up. Contract in section 4.6.
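
To make the tool-to-endpoint mapping and the approval gate concrete, here is a minimal sketch of a registry built only from the table rows quoted in this thread. The `ToolSpec` type and the `needs_approval` and `resolve` helpers are hypothetical names for this example; rows not quoted here (remove_subscription and the pause/resume pairs) are left out rather than guessed.

```python
from typing import NamedTuple


class ToolSpec(NamedTuple):
    method: str
    path: str
    kind: str        # read, mutate, or probe
    permission: str  # allow, or ask (approval-gated)


# Rows copied from the table excerpts quoted in this conversation.
TOOLS = {
    "find_triggers": ToolSpec("POST", "/api/triggers/discover", "read", "allow"),
    "create_schedule": ToolSpec("POST", "/api/triggers/schedules/", "mutate", "ask"),
    "create_subscription": ToolSpec("POST", "/api/triggers/subscriptions/", "mutate", "ask"),
    "test_subscription": ToolSpec("POST", "/api/triggers/subscriptions/test", "probe", "ask"),
    "remove_schedule": ToolSpec("DELETE", "/api/triggers/schedules/{id}", "mutate", "ask"),
}


def needs_approval(tool: str) -> bool:
    """Return True when the tool must wait for user approval before running."""
    return TOOLS[tool].permission == "ask"


def resolve(tool: str, **params: str) -> tuple[str, str]:
    """Return the HTTP method and the path with any placeholders filled in."""
    spec = TOOLS[tool]
    return spec.method, spec.path.format(**params)


method, path = resolve("remove_schedule", id="sched_1")
```

Keeping the permission next to the endpoint in one table means a new undo tool only needs one new row to be both routable and gated.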

adding each op to the overlay's `tools` list as a `{ "type": "platform", "op": ... }` entry. Adding
the builder tools to `PLATFORM_OPS` is therefore the whole integration: the new tools join the
overlay with no extra wiring. The backend serves the overlay read-only on the inspect response at
`additional_context.playground_build_kit.agent_template_overlay`; the frontend applies it on a

Section 6 now states the overlay model precisely: the backend assembles PLATFORM_OPS into the overlay served read-only here, and the frontend applies it. No backend injection, no run flag. Adding a builder op to PLATFORM_OPS is the whole integration.
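
A rough sketch of the overlay model described above, written in Python only to show the data shapes. In the design the frontend does the applying, so treat the function names, the stored config shape (`instructions`, `tools`), and the short `PLATFORM_OPS` list as assumptions made for illustration. The overlay path and the `{ "type": "platform", "op": ... }` entry shape come from the doc.

```python
import copy

# Hypothetical subset of PLATFORM_OPS, for illustration only.
PLATFORM_OPS = ["find_triggers", "create_schedule", "create_subscription"]


def build_overlay(ops: list[str]) -> dict:
    """Backend side: assemble the ops into the overlay's tools list."""
    return {"tools": [{"type": "platform", "op": op} for op in ops]}


def read_overlay(inspect_response: dict) -> dict:
    """Read the overlay from the path the inspect response serves it on."""
    build_kit = inspect_response["additional_context"]["playground_build_kit"]
    return build_kit["agent_template_overlay"]


def config_for_run(stored_config: dict, overlay: dict) -> dict:
    """Playground run: apply the overlay on a copy, leaving stored config untouched."""
    run_config = copy.deepcopy(stored_config)
    run_config.setdefault("tools", []).extend(overlay.get("tools", []))
    return run_config


def config_for_commit(stored_config: dict) -> dict:
    """Commit: the overlay is excluded, so only the stored config is written."""
    return copy.deepcopy(stored_config)


stored = {"instructions": "Triage new issues.", "tools": []}
inspect_response = {
    "additional_context": {
        "playground_build_kit": {"agent_template_overlay": build_overlay(PLATFORM_OPS)}
    }
}
run_config = config_for_run(stored, read_overlay(inspect_response))
committed = config_for_commit(stored)
```

After this runs, `run_config` carries the three platform entries while `stored` and `committed` carry none, which is the property that keeps builder tools out of a published agent.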

Labels

needs-review Agent updated; awaiting Mahmoud's review
