
feat(subagent): add stop-on-failure and bounded-effort guidance to general agent intro #2354

Open
h3c-hexin wants to merge 1 commit into Hmbown:main from h3c-hexin:pr2/subagent-stop-on-failure

Conversation

Contributor

@h3c-hexin h3c-hexin commented May 29, 2026

Summary

The general-purpose sub-agent intro (GENERAL_AGENT_INTRO) tells the agent how to plan but never says when to stop. With less capable models this produces a failure mode where the agent retries the same failing tool call (e.g. an unreachable or rate-limited external API) over and over until it hits the elapsed/step ceiling — burning the whole budget and returning nothing useful.

This adds two short clauses:

  • Stop quickly on failure: after the same call fails twice, return partial results with a one-line note instead of looping.
  • Bounded effort: prefer one focused attempt; if the task can't be completed with the available data within 3-5 tool calls, return current findings so the parent can compensate.

Prompt-only change; no behavioral code paths touched.
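For readers who want the stop-on-failure rule spelled out as logic, here is a minimal sketch. It is purely illustrative: this PR changes only prompt text, and `FailureTracker`, `record`, and the idea of a string "call signature" are hypothetical names, not anything in the codebase. Only the limit of 2 comes from the prompt wording.

```rust
/// Hypothetical tracker for the "stop quickly on failure" rule.
/// Nothing like this exists in the PR, which changes only prompt text.
#[derive(Debug, Default)]
pub struct FailureTracker {
    /// Signature (tool name plus arguments) of the most recent failing call.
    last_failed: Option<String>,
    /// How many times in a row that same call has failed.
    streak: u32,
}

impl FailureTracker {
    /// Limit taken from the prompt text ("fails 2 times in a row").
    pub const MAX_CONSECUTIVE_FAILURES: u32 = 2;

    /// Records one tool-call outcome and returns `true` when the agent
    /// should stop retrying and return partial results.
    pub fn record(&mut self, call_signature: &str, succeeded: bool) -> bool {
        if succeeded {
            // Any success breaks the streak.
            self.last_failed = None;
            self.streak = 0;
            return false;
        }
        if self.last_failed.as_deref() == Some(call_signature) {
            // The same call failed again.
            self.streak += 1;
        } else {
            // A different call failed: start a new streak.
            self.last_failed = Some(call_signature.to_string());
            self.streak = 1;
        }
        self.streak >= Self::MAX_CONSECUTIVE_FAILURES
    }
}
```

Under these assumptions, two consecutive failures of `fetch(url=a)` trigger a stop, while a failure of `fetch(url=a)` followed by a failure of `fetch(url=b)` does not, because the rule is about the same call failing repeatedly.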

Testing

  • cargo fmt --all -- --check
  • cargo build -p codewhale-tui (compiles)
  • cargo clippy --workspace --all-targets --all-features
  • cargo test --workspace --all-features

Checklist

  • Updated docs or comments as needed
  • Added or updated tests where relevant (prompt-string change, no test target)
  • Verified TUI behavior manually if UI changes (n/a)
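Although there is no test target for the prompt string today, a guard against accidental removal of the new clauses could look roughly like the sketch below. The helper `missing_clauses` and the constant `REQUIRED_CLAUSES` are hypothetical; only the two clause headings are copied from the diff.

```rust
/// Clause headings this PR adds to the prompt, copied from the diff.
const REQUIRED_CLAUSES: [&str; 2] = ["**Stop quickly on failure**", "**Bounded effort**"];

/// Returns the required clause headings that are absent from `prompt`.
/// Hypothetical helper; not part of the PR.
pub fn missing_clauses(prompt: &str) -> Vec<&'static str> {
    REQUIRED_CLAUSES
        .iter()
        .copied()
        // Keep only the headings the prompt does not contain.
        .filter(|clause| !prompt.contains(*clause))
        .collect()
}
```

A unit test next to the prompt definition could then assert that `missing_clauses(GENERAL_AGENT_INTRO)` is empty, assuming the constant is visible from the test module.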

Greptile Summary

This prompt-only change adds two stopping-condition clauses to GENERAL_AGENT_INTRO to prevent less-capable models from looping endlessly on failing or blocked tool calls.

  • Stop quickly on failure instructs the agent to halt and return partial results after 2 consecutive failures of the same tool call, citing the missing piece.
  • Bounded effort tells the agent to prefer one focused attempt and return partial findings rather than speculating indefinitely — but the "3-5 tool calls" phrasing creates a numerical cap that conflicts with the existing checklist_write multi-step planning guidance in the same string.

Confidence Score: 3/5

The change targets a real failure mode, but the '3-5 tool calls' wording in Bounded effort directly contradicts the existing multi-step checklist guidance in the same prompt string — a general agent working through a legitimate checklist will consistently hit that count and may truncate healthy work.

The bounded-effort clause introduces an ambiguous numerical threshold that a model could reasonably interpret as a hard cap, causing it to abandon legitimately complex tasks that the same prompt elsewhere encourages it to plan with a multi-step checklist. This is a present logic conflict in the instruction set, not a future risk.

crates/tui/src/tools/subagent/mod.rs — specifically the two new lines added to GENERAL_AGENT_INTRO and their interaction with the existing checklist-planning guidance directly above them.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/tui/src/tools/subagent/mod.rs | Adds two prompt-only guidance clauses to GENERAL_AGENT_INTRO; the "3-5 tool calls" threshold in the bounded-effort clause directly conflicts with the existing multi-step checklist guidance in the same string. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[General Agent Receives Task] --> B{Multi-step?}
    B -- Yes --> C[checklist_write]
    C --> D[Execute Tool Calls]
    D --> E{Same call failed 2x in a row?}
    E -- Yes --> F[Stop: return partial + one-line note]
    E -- No --> G{3-5 tool calls reached without complete data?}
    G -- Yes --> H[Stop: return partial findings]
    G -- No --> D
    B -- No --> D

Reviews (1): Last reviewed commit: "feat(subagent): add stop-on-failure and ..."

Greptile also left 2 inline comments on this PR.

…neral agent intro

The general-purpose sub-agent intro tells the agent how to plan, but says
nothing about when to *stop*. With less capable models this leads to a
failure mode where the agent retries the same failing tool call (e.g. an
unreachable or rate-limited external API) over and over until it hits the
elapsed/step ceiling, wasting the whole budget and returning nothing useful.

Add two short clauses to GENERAL_AGENT_INTRO:
- Stop quickly on failure: after the same call fails twice, return partial
  results with a one-line note instead of looping.
- Bounded effort: prefer one focused attempt; if the task can't be completed
  within a few tool calls, return current findings so the parent can compensate.

Prompt-only change; no behavioral code paths touched.
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the GENERAL_AGENT_INTRO prompt in crates/tui/src/tools/subagent/mod.rs to instruct sub-agents to stop quickly on failure (after the same tool call fails 2 times in a row) and to bound their effort (returning partial findings if the task cannot be completed within 3-5 tool calls). There are no review comments, and I have no feedback to provide.

- "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n\n"
+ "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n",
+ "**Stop quickly on failure**: if the same tool call fails 2 times in a row, stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries (e.g. external API unreachable, rate-limited, or returning empty).\n",
+ "**Bounded effort**: prefer one focused attempt over many speculative retries. If you cannot complete the task with available data within 3-5 tool calls, return your current partial findings — the parent agent can compensate with its own knowledge.\n\n"

P1 The phrase "within 3-5 tool calls" clashes directly with the preceding line that tells the agent to plan multi-step work with checklist_write. A checklist_write call is itself 1 tool call; every checklist item update is another; any non-trivial legitimate task (read files, make edits, run verification) will routinely exceed 5 tool calls without being "stuck." An LLM reading both instructions together may abandon a healthy, in-progress multi-step task as soon as the counter reaches 5. The intent—give up only when the required data is unavailable—should be spelled out explicitly to avoid this false-positive early stop.

Suggested change
- "**Bounded effort**: prefer one focused attempt over many speculative retries. If you cannot complete the task with available data within 3-5 tool calls, return your current partial findings — the parent agent can compensate with its own knowledge.\n\n"
+ "**Bounded effort**: prefer one focused attempt over many speculative retries. If critical data remains unavailable despite focused searching (e.g. the required resource does not exist or cannot be fetched), return your current partial findings rather than chasing it indefinitely — the parent agent can compensate with its own knowledge.\n\n"
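The distinction drawn in this comment, between a long but healthy task and a stuck one, can be illustrated with a small sketch. It assumes a hypothetical `StallBudget` type that counts only consecutive calls that produced nothing new, instead of capping total tool calls. How "progress" is detected is left to the caller and is not specified anywhere in this PR.

```rust
/// Hypothetical budget that counts only tool calls that made no progress.
/// Illustrative only; the PR itself contains no such type.
#[derive(Debug)]
pub struct StallBudget {
    /// Consecutive non-progressing calls tolerated before giving up.
    max_stalled_calls: u32,
    /// Current run of calls that produced nothing new.
    stalled_calls: u32,
    /// Every call seen so far, tracked for reporting only.
    total_calls: u32,
}

impl StallBudget {
    pub fn new(max_stalled_calls: u32) -> Self {
        Self { max_stalled_calls, stalled_calls: 0, total_calls: 0 }
    }

    /// Records one tool call. `made_progress` is the caller's judgment that
    /// the call produced new, useful data. Returns `true` when the agent
    /// should return its partial findings.
    pub fn record(&mut self, made_progress: bool) -> bool {
        self.total_calls += 1;
        if made_progress {
            // Healthy work resets the stall counter, however long the task is.
            self.stalled_calls = 0;
        } else {
            self.stalled_calls += 1;
        }
        self.stalled_calls >= self.max_stalled_calls
    }

    /// Total number of calls recorded so far.
    pub fn total_calls(&self) -> u32 {
        self.total_calls
    }
}
```

With a budget of 3, a checklist task that makes progress on eight calls in a row is never interrupted; only three consecutive fruitless calls trigger an early return. A plain total-call cap of 3-5 would have stopped the same task partway through.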


  "Stay inside the assigned scope; put adjacent work under RISKS/BLOCKERS.\n",
- "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n\n"
+ "Plan multi-step work with `checklist_write`; add `update_plan` for complex strategy.\n",
+ "**Stop quickly on failure**: if the same tool call fails 2 times in a row, stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries (e.g. external API unreachable, rate-limited, or returning empty).\n",

P2 The parenthetical example includes rate-limited alongside permanently unreachable endpoints, but rate-limiting is a transient condition — it clears in seconds and a single retry after a short pause is standard practice. Treating a 429/rate-limit response the same as a DNS failure and stopping after 2 attempts could cause the agent to abandon otherwise-completable tasks during brief quota windows.

Suggested change
- "**Stop quickly on failure**: if the same tool call fails 2 times in a row, stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries (e.g. external API unreachable, rate-limited, or returning empty).\n",
+ "**Stop quickly on failure**: if the same tool call fails 2 times in a row with a persistent error (e.g. external API unreachable, returning empty, or consistently erroring), stop retrying and return what you have so far with a one-line note explaining what's missing. Do not loop on impossible queries.\n",
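To make the transient-versus-persistent distinction concrete, here is a rough sketch of how a caller might treat the two cases differently. Everything in it is an assumption for illustration: the `ToolError` and `Decision` enums, the `decide` function, the limit of 4 rate-limited attempts, and the backoff schedule are not taken from the PR or the codebase. Only the limit of 2 for persistent errors mirrors the prompt.

```rust
use std::time::Duration;

/// Hypothetical error classes for a failed tool call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolError {
    /// Transient, e.g. an HTTP 429 response.
    RateLimited,
    /// Persistent, e.g. DNS failure or connection refused.
    Unreachable,
    /// Persistent for this sketch: the call keeps returning nothing.
    EmptyResponse,
}

/// What the caller should do after a failure.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    RetryAfter(Duration),
    Stop,
}

/// Assumed limit for rate-limited calls (not from the PR).
const MAX_RATE_LIMITED_ATTEMPTS: u32 = 4;
/// Limit for persistent errors, matching "fails 2 times in a row".
const MAX_PERSISTENT_ATTEMPTS: u32 = 2;

/// `attempt` is the number of failed attempts so far, counting this one.
pub fn decide(error: ToolError, attempt: u32) -> Decision {
    match error {
        ToolError::RateLimited if attempt < MAX_RATE_LIMITED_ATTEMPTS => {
            // Exponential backoff: 1s, 2s, 4s for attempts 1, 2, 3.
            let exponent = attempt.saturating_sub(1);
            Decision::RetryAfter(Duration::from_secs(1u64 << exponent))
        }
        ToolError::RateLimited => Decision::Stop,
        // Persistent errors get one immediate retry, then stop.
        _ if attempt < MAX_PERSISTENT_ATTEMPTS => Decision::RetryAfter(Duration::ZERO),
        _ => Decision::Stop,
    }
}
```

Under these assumed limits, a second consecutive `Unreachable` failure yields `Decision::Stop`, while a second `RateLimited` failure yields a retry after two seconds, so a brief quota window does not end the task.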

