feat(subagent): add stop-on-failure and bounded-effort guidance to general agent intro #2354
h3c-hexin wants to merge 1 commit into
…neral agent intro

The general-purpose sub-agent intro tells the agent how to plan, but says nothing about when to *stop*. With less capable models this leads to a failure mode where the agent retries the same failing tool call (e.g. an unreachable or rate-limited external API) over and over until it hits the elapsed/step ceiling, wasting the whole budget and returning nothing useful.

Add two short clauses to GENERAL_AGENT_INTRO:

- Stop quickly on failure: after the same call fails twice, return partial results with a one-line note instead of looping.
- Bounded effort: prefer one focused attempt; if the task can't be completed within a few tool calls, return current findings so the parent can compensate.

Prompt-only change; no behavioral code paths touched.
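The PR itself changes only prompt text, but the stop-on-failure rule it describes can be pictured as a small counter. The following is a minimal sketch under stated assumptions: `FailureStreak`, `record`, and the string "call signature" are hypothetical names for illustration and do not exist in the codebase.

```rust
/// Hypothetical tracker for "the same tool call failed N times in a row".
/// Illustration only: the PR adds prompt wording, not code like this.
#[derive(Debug, Default)]
pub struct FailureStreak {
    /// Signature (tool name plus arguments) of the most recent failing call.
    last_failed_call: Option<String>,
    /// How many times in a row that exact call has failed.
    count: u32,
}

impl FailureStreak {
    /// Records one tool-call outcome and returns `true` once the same call
    /// has failed `limit` times in a row, meaning the caller should stop
    /// retrying and return partial results with a short note.
    pub fn record(&mut self, call_signature: &str, succeeded: bool, limit: u32) -> bool {
        if succeeded {
            // Any success ends the streak.
            self.last_failed_call = None;
            self.count = 0;
            return false;
        }
        if self.last_failed_call.as_deref() == Some(call_signature) {
            self.count += 1;
        } else {
            // A different failing call starts a new streak at 1.
            self.last_failed_call = Some(call_signature.to_string());
            self.count = 1;
        }
        self.count >= limit
    }
}
```

With `limit = 2`, the second consecutive failure of the same call signals a stop, while a failure of a different call or any success resets the streak.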
Code Review
This pull request updates the GENERAL_AGENT_INTRO prompt in crates/tui/src/tools/subagent/mod.rs to instruct sub-agents to stop quickly on failure (after 2 consecutive tool call failures) and to bound their effort (preferring 3-5 tool calls and returning partial findings). There are no review comments, and I have no feedback to provide.
- "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n\n"
+ "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n",
+ "**Stop quickly on failure**: if the same tool call fails 2 times in a row, stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries (e.g. external API unreachable, rate-limited, or returning empty).\n",
+ "**Bounded effort**: prefer one focused attempt over many speculative retries. If you cannot complete the task with available data within 3-5 tool calls, return your current partial findings — the parent agent can compensate with its own knowledge.\n\n"
The phrase "within 3-5 tool calls" clashes directly with the preceding line that tells the agent to plan multi-step work with `checklist_write`. A `checklist_write` call is itself 1 tool call; every checklist item update is another; any non-trivial legitimate task (read files, make edits, run verification) will routinely exceed 5 tool calls without being "stuck." An LLM reading both instructions together may abandon a healthy, in-progress multi-step task as soon as the counter reaches 5. The intent (give up only when the required data is unavailable) should be spelled out explicitly to avoid this false-positive early stop.
- "**Bounded effort**: prefer one focused attempt over many speculative retries. If you cannot complete the task with available data within 3-5 tool calls, return your current partial findings — the parent agent can compensate with its own knowledge.\n\n"
+ "**Bounded effort**: prefer one focused attempt over many speculative retries. If critical data remains unavailable despite focused searching (e.g. the required resource does not exist or cannot be fetched), return your current partial findings rather than chasing it indefinitely — the parent agent can compensate with its own knowledge.\n\n"
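The reviewer's counting argument can be made concrete with a little arithmetic. This sketch assumes a simplified cost model that is not stated in the PR: one `checklist_write` call up front, then a fixed number of work calls plus one checklist update per item. The function and constant names are hypothetical.

```rust
/// Upper end of the "3-5 tool calls" wording in the proposed prompt.
pub const PROMPT_UPPER_BOUND: u32 = 5;

/// Tool calls used by a healthy task under the assumed cost model:
/// one initial `checklist_write`, then for each checklist item its
/// work calls plus one checklist update.
pub fn checklist_tool_calls(items: u32, work_calls_per_item: u32) -> u32 {
    1 + items * (work_calls_per_item + 1)
}

/// Returns `true` when a task that is progressing normally would still
/// cross the numeric bound and risk being abandoned early.
pub fn exceeds_prompt_bound(items: u32, work_calls_per_item: u32) -> bool {
    checklist_tool_calls(items, work_calls_per_item) > PROMPT_UPPER_BOUND
}
```

Under these assumptions, a three-item checklist with a single work call per item already costs 1 + 3 × 2 = 7 tool calls, so a literal reading of the bound would cut off a task that is not stuck at all.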
  "Stay inside the assigned scope; put adjacent work under RISKS/BLOCKERS.\n",
- "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n\n"
+ "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n",
+ "**Stop quickly on failure**: if the same tool call fails 2 times in a row, stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries (e.g. external API unreachable, rate-limited, or returning empty).\n",
The parenthetical example includes `rate-limited` alongside permanently unreachable endpoints, but rate limiting is a transient condition: it often clears within seconds, and a single retry after a short pause is standard practice. Treating a 429/rate-limit response the same as a DNS failure and stopping after 2 attempts could cause the agent to abandon otherwise-completable tasks during brief quota windows.
- "**Stop quickly on failure**: if the same tool call fails 2 times in a row, stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries (e.g. external API unreachable, rate-limited, or returning empty).\n",
+ "**Stop quickly on failure**: if the same tool call fails 2 times in a row with a persistent error (e.g. external API unreachable, returning empty, or consistently erroring), stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries.\n",
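The distinction the reviewer draws between transient and persistent failures could be expressed as a small decision function. This is a sketch only: the enum names are hypothetical, the persistent limit of 2 comes from the prompt text, and the transient limit of 4 is an arbitrary assumption chosen just to show that rate-limited retries should still be bounded.

```rust
/// Hypothetical classification of a failed tool call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureKind {
    /// Transient: e.g. an HTTP 429 response.
    RateLimited,
    /// Persistent: e.g. DNS failure or unreachable endpoint.
    Unreachable,
    /// Persistent: the call succeeds but keeps returning nothing.
    EmptyResult,
}

/// What the agent should do after a failure.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    RetryAfterPause,
    RetryNow,
    StopWithNote,
}

/// From the prompt text: stop after 2 persistent failures in a row.
pub const PERSISTENT_LIMIT: u32 = 2;
/// Assumption for illustration: allow a few more attempts when rate-limited.
pub const TRANSIENT_LIMIT: u32 = 4;

/// Decides the next step given the failure kind and how many times
/// the same call has now failed in a row (including this failure).
pub fn next_step(kind: FailureKind, failures_in_a_row: u32) -> NextStep {
    match kind {
        // Transient errors earn a pause and another try, within a bound.
        FailureKind::RateLimited if failures_in_a_row < TRANSIENT_LIMIT => {
            NextStep::RetryAfterPause
        }
        // Persistent errors get one immediate retry at most.
        FailureKind::Unreachable | FailureKind::EmptyResult
            if failures_in_a_row < PERSISTENT_LIMIT =>
        {
            NextStep::RetryNow
        }
        // Every limit exhausted: return partial results with a note.
        _ => NextStep::StopWithNote,
    }
}
```

A prompt cannot enforce this logic directly, but the sketch shows the behavior the suggested wording aims for: a second consecutive unreachable error stops the loop, while a second consecutive rate-limit error does not.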
Summary
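The general-purpose sub-agent intro (`GENERAL_AGENT_INTRO`) tells the agent how to plan but never says when to stop. With less capable models this produces a failure mode where the agent retries the same failing tool call (e.g. an unreachable or rate-limited external API) over and over until it hits the elapsed/step ceiling, burning the whole budget and returning nothing useful.

This adds two short clauses:

- Stop quickly on failure: after the same call fails twice, return partial results with a one-line note instead of looping.
- Bounded effort: prefer one focused attempt; if the task can't be completed within a few tool calls, return current findings so the parent can compensate.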
Prompt-only change; no behavioral code paths touched.
Testing
- `cargo fmt --all -- --check`
- `cargo build -p codewhale-tui` (compiles)
- `cargo clippy --workspace --all-targets --all-features`
- `cargo test --workspace --all-features`

Checklist
Greptile Summary
This prompt-only change adds two stopping-condition clauses to `GENERAL_AGENT_INTRO` to prevent less-capable models from looping endlessly on failing or blocked tool calls.

… `checklist_write` multi-step planning guidance in the same string.

Confidence Score: 3/5
The change targets a real failure mode, but the '3-5 tool calls' wording in Bounded effort directly contradicts the existing multi-step checklist guidance in the same prompt string — a general agent working through a legitimate checklist will consistently hit that count and may truncate healthy work.
The bounded-effort clause introduces an ambiguous numerical threshold that a model could reasonably interpret as a hard cap, causing it to abandon legitimately complex tasks that the same prompt elsewhere encourages it to plan with a multi-step checklist. This is a present logic conflict in the instruction set, not a future risk.
crates/tui/src/tools/subagent/mod.rs — specifically the two new lines added to GENERAL_AGENT_INTRO and their interaction with the existing checklist-planning guidance directly above them.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[General Agent Receives Task] --> B{Multi-step?}
    B -- Yes --> C[checklist_write]
    C --> D[Execute Tool Calls]
    D --> E{Same call failed 2x in a row?}
    E -- Yes --> F[Stop: return partial + one-line note]
    E -- No --> G{3-5 tool calls reached without complete data?}
    G -- Yes --> H[Stop: return partial findings]
    G -- No --> D
    B -- No --> D

Reviews (1): Last reviewed commit: "feat(subagent): add stop-on-failure and ..."