
feat(claude): Claude rate-limit + context window budget indicator #41

Open
heavygee wants to merge 19 commits into feat/codex-usage-indicator-rebased from feat/claude-budget-adapter

Conversation

@heavygee

Owner

Summary

Extends the cross-flavor AgentBudgetState budget gauge shape (introduced in tiann#847) with a Claude-specific adapter. This is a fork-local PR tracking the Claude work; intended to be promoted upstream once tiann#847 lands.

What this adds

  • cli/src/claude/utils/claudeUsage.ts - normalises Claude SDK telemetry (SDKResultMessage, SDKAssistantMessage, rate-limit events) into a ClaudeUsage shape, accumulating totalCostUSD and token counts across turns
  • shared/src/schemas.ts / shared/src/types.ts - ClaudeUsage, ClaudeRateLimit, ClaudeModelUsage types; metadata.claudeUsage field
  • cli/src/claude/claudeRemoteLauncher.ts - hooks updateClaudeUsageMetadata to process SDK messages each turn
  • web/src/components/AssistantChat/claudeBudgetAdapter.ts - maps ClaudeUsage → AgentBudgetState; derives effective state from rate-limit type, formats reset times as relative ("in 2h 14m") with an absolute-time tooltip, and adds a cost caveat tooltip
  • web/src/components/AssistantChat/ComposerButtons.tsx / HappyComposer.tsx / SessionChat.tsx - wires claudeUsage prop so the budget ring renders for Claude sessions alongside Codex

Design

Same single-ring metaphor as Codex: centre number = operational axis (context window %), ring colour = worst constraint across all axes (green → amber → red → blocked). Popover highlights which axis is dominant.

Claude-specific axes:

  • Context window - input_tokens / context_window_tokens
  • Rate limit - per-model input/output/request windows; colour driven by tightest window

Depends on

heavygee/hapi#847 (feat/codex-usage-indicator-rebased) - must merge first; this branch is rebased on top of it.

Test plan

  • bun typecheck - clean
  • bun run test - 997 CLI + 318 hub pass
  • Manual: start a Claude session, open usage popover, verify context %, rate-limit rows, relative reset time, cost caveat tooltip

Made with Cursor

dsus4wang and others added 18 commits June 8, 2026 19:50
Ethan's indicator (tiann#537) was designed for time-window plans
(plus / pro 5h+weekly). On Codex Pro accounts that exhaust the
subscription windows AND any topped-up credits, the app-server emits
rate_limits.primary=null + secondary=null + credits.has_credits=false
+ balance="0", and the indicator silently fell back to context-window-
only - reading "80% context, plenty of room" while the account was
actually blocked.

Extend the data path end-to-end:

shared/schemas - add CodexUsageCreditsSchema (hasCredits / unlimited /
balance) and optional rateLimitReachedType / planType / limitId on
CodexUsageSchema. JSON-only, no SCHEMA_VERSION bump.

cli/codexUsage - normalize credits + reached_type + plan_type +
limit_id from the rate_limits root regardless of whether primary /
secondary are present.

web/codexUsageDisplay - add isCodexUsageBlocked() helper; force ring
to 100% and color red when blocked; render a critical-severity
"Credits" row with $balance + 'subscription / top-up exhausted'
detail; render a critical-severity "Limit Reached" header when
codex sets rate_limit_reached_type. Unlimited credit accounts read
"Unlimited" and stay green.

Covered by 4 new cli tests (premium-credits shape from a real Codex
Pro rollout, plus reached-type + unlimited-credits cases), 3 new web
tests (blocked-state ring forcing, Limit Reached header, unlimited
non-blocking), and 1 new shared schema test.

Co-authored-by: Cursor <cursoragent@cursor.com>
Codex sends 'balance' as a precision-preserving string ('250.0000000000',
'0', '0.0000000000') with no declared unit. The previous render asserted
USD with a $ prefix and dumped the string verbatim, producing the
visually awful '$250.0000000000'.

Credits are an internal billing token per the OpenAI Codex rate card
(https://help.openai.com/en/articles/20001106-codex-rate-card): GPT-5.5
consumes 125 credits per 1M input tokens / 750 per 1M output, and a $5
top-up grants 125 credits (~$0.04/credit, not the $1/credit the prior
comment fabricated). Chatgpt.com's own UI even renders credits and any
USD conversion separately ('246 credits, ~10-62 cloud messages'), so
prefixing $ on the balance is a flat-out type error.

- formatCreditsBalance(): Number-parse + toLocaleString with 2dp cap
  for values >= 1 (4dp for sub-unit balances), trailing zeros trimmed.
  '250.0000000000' -> '250', '12.345' -> '12.35', '0.0000000000' -> '0'.
- Drop the $ prefix; the row label 'Credits' carries the unit.
- isCodexUsageBlocked / exhausted-detail check both now parse balance
  numerically instead of literal-matching '0' / '0.00', so future
  trailing-zero variants ('0.0000000000') cannot slip past.

Adds a 4-case parametrized test covering the real string shapes observed
in the wild plus a decimal-rounding case.
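
The formatting rule in this commit can be sketched as follows. This is a minimal illustration, not the actual formatCreditsBalance() from the branch: the fixed 'en-US' locale, the empty-string fallback, and the isBalanceExhausted helper name are assumptions made for the example.

```typescript
// Hypothetical sketch of the balance formatting described above.
// Assumption: a fixed 'en-US' locale; the real helper may use the user's locale.
function formatCreditsBalance(raw: string): string {
    if (raw.trim() === '') return raw
    const value = Number(raw)
    // Non-numeric input: show the raw string rather than "NaN".
    if (!Number.isFinite(value)) return raw
    // 2 decimal places for balances >= 1, 4 for sub-unit balances.
    const maximumFractionDigits = Math.abs(value) >= 1 ? 2 : 4
    // minimumFractionDigits: 0 is what trims trailing zeros.
    return value.toLocaleString('en-US', { minimumFractionDigits: 0, maximumFractionDigits })
}

// Hypothetical helper: parse numerically so '0', '0.00' and '0.0000000000'
// are all treated as an empty balance.
function isBalanceExhausted(raw: string): boolean {
    if (raw.trim() === '') return false
    const value = Number(raw)
    return Number.isFinite(value) && value <= 0
}
```

Parsing before comparing is the point of the change: a literal match on '0' would miss any other zero-valued string the server might emit.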

Co-authored-by: Cursor <cursoragent@cursor.com>
Ethan's ring (PR tiann#537) preferred contextWindow.percent and only
fell back to rate-limits when context was absent. Real consequence:
weekly=100% but ctx=80% rendered as a green '80' ring, hiding a
HARD subscription cap behind a SOFT context fill. Then when both
windows AND credits exhausted, the blocked override jumped the ring
to red 100 - so the same circle silently switched semantics from
'context fill' to 'usage exhaustion' mid-session. Operator caught it:
'the circle that WAS showing you context now shows you red 100
meaning no more usage'.

Make the ring mean one thing across all states: 'percent of the
most-pressing limit you're about to hit'.

- New getCodexUsageRing(): max across context + 5h + weekly (blocked
  still forces 100). Returns { percent, axis } so callers can show
  which constraint is in front.
- getCodexUsageRingPercent() kept as a thin wrapper for any future
  callers that only need the number.
- getCodexUsageRingTitle(): axis-aware aria-label + title so hovering
  the ring tells you 'Weekly subscription window 100% used' instead
  of 'Codex usage' regardless of state.
- getCodexUsageRows() marks the dominant row (dominant: true). Popover
  paints a left-accent bar + bolds the matching label so opening it
  immediately answers 'why is the ring at 100?'.
- Ring colour gains an amber intermediate (60-85%) instead of jumping
  straight from blue to red at 85.
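
A rough sketch of the "max across axes" rule, under stated assumptions: the input shape, the 'weekly' and 'blocked' axis ids, and the inclusive threshold boundaries are illustrative and not taken from the branch.

```typescript
// Hypothetical input shape; field names are assumptions for this sketch.
type RingAxis = 'context' | 'fiveHour' | 'weekly' | 'blocked'

interface RingInput {
    contextPercent?: number
    fiveHourPercent?: number
    weeklyPercent?: number
    blocked?: boolean
}

interface Ring {
    percent: number
    axis: RingAxis
}

// The ring shows the most-pressing limit; a blocked account forces 100.
function getUsageRing(input: RingInput): Ring | null {
    if (input.blocked) return { percent: 100, axis: 'blocked' }
    const candidates: Ring[] = []
    if (input.contextPercent !== undefined) candidates.push({ percent: input.contextPercent, axis: 'context' })
    if (input.fiveHourPercent !== undefined) candidates.push({ percent: input.fiveHourPercent, axis: 'fiveHour' })
    if (input.weeklyPercent !== undefined) candidates.push({ percent: input.weeklyPercent, axis: 'weekly' })
    if (candidates.length === 0) return null
    // Keep the first axis on ties so the result is deterministic.
    return candidates.reduce((worst, c) => (c.percent > worst.percent ? c : worst))
}

// Amber band between 60 and 85; treating both bounds as inclusive lower
// limits is an assumption.
function ringColour(percent: number): 'blue' | 'amber' | 'red' {
    if (percent >= 85) return 'red'
    if (percent >= 60) return 'amber'
    return 'blue'
}
```

With context at 80 and weekly at 100, this returns the weekly axis at 100, which is the bug shape the commit describes.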

Replaces the 'prefers context' + 'falls back to rate-limits' tests with
three new ones covering the bug shape (ctx 80 vs weekly 100), the
inverse (context dominates), and dominant-row marking.

Co-authored-by: Cursor <cursoragent@cursor.com>
…= effective state

Introduces a flavor-agnostic AgentBudgetState shape under shared/ and
refactors the Codex indicator to consume it via a Codex-specific adapter.
This is the seed of the cross-flavor agent budget gauge umbrella
(tiann#846); Claude / Cursor / Gemini adapters can drop in without
touching the renderer.

Co-authored-by: Cursor <cursoragent@cursor.com>

Why
---
The previous ring conflated two questions in one number:

  1. 'How much room for THIS task?' (context fill)
  2. 'Am I about to be blocked?' (rate-limit / credits state)

The number's semantics silently flipped between them based on account
state - context% in the normal case, 100 in the blocked case - so the
same circle meant different things at different times. Worse, a Pro
account with subscription windows at 100% but credits available read
red 100 (technically true: weekly is capped) when the operationally
correct signal was amber (credits cover the overage, user is not
actually blocked).

Design
------
- AgentBudgetState.operationalAxisId picks the always-visible centre
  number (defaults to context for LLM agents). Stays consistent across
  all states; the number no longer changes meaning.
- AgentBudgetState.effective is the per-flavor verdict (green / amber /
  red / blocked) computed by the adapter using its specific blocking
  rules. The renderer paints the ring colour from this; user gets one
  honest 'are you about to be blocked' signal alongside the operational
  centre number.
- Adapter encodes the Codex-Pro covering rule: subscription window at
  cap AND credits > 0 -> amber, not red. The popover marks the credits
  axis as 'covering' with a blue accent so the user can see why.
- Popover renders pressure axes top-to-bottom with the dominant one
  carrying a colour-coded left accent + bold label, then a divider, then
  the informational metadata rows (token breakdown etc).
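
The covering rule can be illustrated with a small sketch. Everything here is an assumption except the rule itself (window at cap AND credits > 0 gives amber): the input shape is hypothetical, the below-cap thresholds are borrowed from the earlier 60/85 colour bands, and the real adapter may report 'blocked' rather than 'red' when nothing covers the overage.

```typescript
type CodexEffective = 'green' | 'amber' | 'red'

// Hypothetical input shape for this sketch.
interface CodexPressureInput {
    windowPercents: number[]      // subscription windows (5h, weekly), 0-100
    creditsBalance: string | null // precision-preserving string, or null if absent
    unlimitedCredits: boolean
}

function codexEffective(input: CodexPressureInput): CodexEffective {
    const worstWindow = Math.max(0, ...input.windowPercents)
    const balance = input.creditsBalance === null ? 0 : Number(input.creditsBalance)
    const creditsCover = input.unlimitedCredits || (Number.isFinite(balance) && balance > 0)
    // Covering rule: a capped window is only a warning while credits remain.
    if (worstWindow >= 100) return creditsCover ? 'amber' : 'red'
    // Assumed thresholds below the cap.
    if (worstWindow >= 85) return 'red'
    if (worstWindow >= 60) return 'amber'
    return 'green'
}
```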

Generalisation seed
-------------------
- shared/src/agentBudget.ts: flavor-agnostic types
  (AgentBudgetState / Axis / EffectiveState / MetadataRow).
- web/src/components/AssistantChat/AgentBudgetIndicator.tsx: renderer
  that consumes AgentBudgetState. Knows nothing about codex credits or
  claude rate-limit headers - drop in a new adapter and the indicator
  works for that flavor.
- web/src/components/AssistantChat/codexBudgetAdapter.ts:
  toCodexBudgetState() - the only Codex-flavor module under web/. All
  5h / weekly / credits / plan_type semantics live here.

Removes the previous getCodexUsageRing / getCodexUsageRows /
getCodexUsageRingTitle / CodexUsageRing / CodexUsageRow helpers
(replaced by the adapter) plus their test file. 11 new tests in
codexBudgetAdapter.test.ts cover the operator-caught scenarios: weekly
at cap + credits covering -> amber, blocked, unlimited credits, fresh
session before context arrives, etc.
…ession

The SpawnHappySession RPC handler received importHistory from the hub
(hub/src/sync/rpcGateway.ts forwards it) but dropped it before the
spawnSession call - so remote web resumes with 'Import history' checked
executed `hapi codex resume <thread>` without --hapi-import-history and
silently skipped the history import.

Fix: destructure importHistory from params and pass it to spawnSession.
Chain is now complete: web -> hub RPC -> machine handler -> spawnSession
-> buildCliArgs (--hapi-import-history flag).

Fixes bot-review Major finding on tiann#847.

Co-authored-by: Cursor <cursoragent@cursor.com>
…r UX

Blocker: remove replayExistingEvents from local usage scanner.
importCodexSessionHistory() already sends all user/agent messages to
HAPI; setting replayExistingEvents:true on the local scanner caused
every imported message to be sent twice (once via importHistory, once
via the scanner's onEvent -> sendUserMessage/sendAgentMessage path).
The scanner's role in the local launcher is live-tail only.

Minor: AgentBudgetAxisId union - use (string & {}) instead of string
so IDE still completes the well-known ids ('context', 'fiveHour', etc.)
while accepting arbitrary flavor-specific strings at compile time.
Plain string collapsed the union and lost completions.
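
For readers unfamiliar with the idiom, here is a small sketch of the `(string & {})` pattern. Only the 'context' and 'fiveHour' ids come from the commit text; the describeAxis function and its labels are hypothetical.

```typescript
// Well-known ids stay as literal members, so editors can suggest them.
type KnownAxisId = 'context' | 'fiveHour'

// `string & {}` still accepts every string, but the compiler does not
// simplify the union down to plain `string`, so the literals survive.
type AgentBudgetAxisId = KnownAxisId | (string & {})

// Hypothetical consumer, only here to show both kinds of id type-check.
function describeAxis(id: AgentBudgetAxisId): string {
    switch (id) {
        case 'context':
            return 'Context window'
        case 'fiveHour':
            return '5h window'
        default:
            return `Custom axis: ${id}`
    }
}
```

Calling `describeAxis('context')` and `describeAxis('some-flavor-specific-id')` both compile; with `KnownAxisId | string` the type would reduce to `string` and the literal suggestions would disappear.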

Minor: AgentBudgetIndicator popover - add pointerdown outside-click and
Escape keyboard handlers so the popover closes without requiring a
second button click.

Nit: formatCodexUsageReset - use explicit `<= 0` guard instead of
falsy `!resetAt` to make the epoch-exclusion intent clear.

Co-authored-by: Cursor <cursoragent@cursor.com>
…pace-root machines

The listCodexSessions RPC handler returned every transcript under
CODEX_HOME without checking workspace roots. On machines started with
--workspace-root, this let the web Codex session picker enumerate
sessions from projects outside the allowed roots - a privacy/scoping
leak analogous to the existing guards on spawn and directory browsing.

Fix: add an optional pathAllowed callback to registerCodexSessionHandlers
and listCodexSessions. apiMachine.ts wires in an isWithinWorkspaceRoots
check when normalizedWorkspaceRoots is set; sessions with a null path
are also excluded when the filter is active. When no workspace roots
are configured, pathAllowed is undefined and the full list is returned
unchanged (same behavior as before for single-machine installs).
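
A simplified sketch of the optional-filter pattern described above. The types and the prefix-based containment check are assumptions for illustration: it expects already-normalized POSIX paths and does no symlink or `..` resolution, which the real isWithinWorkspaceRoots check may well handle.

```typescript
// Hypothetical session shape for this sketch.
interface CodexSessionSummary {
    id: string
    path: string | null
}

type PathAllowed = (path: string) => boolean

// Simplified containment check (assumption: normalized POSIX paths).
function isWithinWorkspaceRoots(candidate: string, roots: string[]): boolean {
    return roots.some((root) => {
        const trimmed = root.replace(/\/+$/, '')
        // Compare against "root/" so /work/repo-a does not admit /work/repo-a-other.
        return candidate === trimmed || candidate.startsWith(`${trimmed}/`)
    })
}

// No callback: return everything. With a callback: drop disallowed paths
// and sessions whose path is unknown.
function filterSessions(sessions: CodexSessionSummary[], pathAllowed?: PathAllowed): CodexSessionSummary[] {
    if (!pathAllowed) return sessions
    return sessions.filter((s) => s.path !== null && pathAllowed(s.path))
}
```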

Fixes bot-review Major finding on tiann#847 (follow-up review).

Co-authored-by: Cursor <cursoragent@cursor.com>
…esume

When the user picks a Codex session from the history picker the spawn
call still sent the form's directory input as `directory`, so resuming
a session from /repo-a while the input showed /repo-b launched Codex
in the wrong workspace with the wrong files.

Fix: resolve `selectedCodexSession.path` from the sessions list and use
that as `spawnDirectory` when a Codex session is selected. Falls back
to `trimmedDirectory` for plain new sessions.

Also: relax `canCreate` guard to allow spawn when a Codex session is
selected even if the directory input is empty (the path comes from the
session, not the form).

Fixes bot-review Major finding on tiann#847.

Co-authored-by: Cursor <cursoragent@cursor.com>
When a session-updated SSE event carries a metadata patch (e.g. a
Codex history import setting codexSessionId or title), patchSessionSummary
now maps it through toSummaryMetadata so the sessions list cache reflects
name/path/summary/agentSessionId changes immediately.

Previously, patchSessionSummary returned `true` (patch applied) while
silently ignoring the metadata payload, suppressing the
queueSessionListInvalidation() fallback and leaving stale titles/paths
in the list until the next full refetch.

Co-authored-by: Cursor <cursoragent@cursor.com>
Two bot-flagged Majors:

1. replayTranscriptHistoryOnStart: add !opts.importHistory guard so that
   when history is imported via importCodexSessionHistory() the transcript
   is not replayed a second time if the session is later handed off to
   local mode (codexLocalLauncher).

2. Stale Codex session selection: add useEffect in NewSession that clears
   selectedCodexSessionId whenever the selected id is no longer present in
   the filtered sessions list (e.g. after toggling showOldCodexSessions
   off), preventing handleCreate() from silently resuming a hidden thread.

Co-authored-by: Cursor <cursoragent@cursor.com>
useCodexSessions was discarding nextCursor after the first request,
capping the resume/import selector at one page (100 sessions). Replace
the single fetch with a do-while pagination loop so all matching
threads are surfaced regardless of how many the CLI scanner has indexed.
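
The pagination loop might look roughly like the sketch below. The page shape, the fetchPage callback, and the maxPages safety cap are assumptions for illustration; only the do-while loop over nextCursor comes from the commit.

```typescript
// Hypothetical page shape: a batch of items plus an opaque cursor.
interface SessionPage<T> {
    items: T[]
    nextCursor: string | null
}

type FetchPage<T> = (cursor: string | null) => Promise<SessionPage<T>>

// Follows nextCursor until the server reports no further page.
// maxPages is an assumed guard against a cursor that never terminates.
async function fetchAllPages<T>(fetchPage: FetchPage<T>, maxPages = 50): Promise<T[]> {
    const all: T[] = []
    let cursor: string | null = null
    let pages = 0
    do {
        const page: SessionPage<T> = await fetchPage(cursor)
        all.push(...page.items)
        cursor = page.nextCursor
        pages += 1
    } while (cursor !== null && pages < maxPages)
    return all
}
```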

Co-authored-by: Cursor <cursoragent@cursor.com>
…et indicator

Phase B of umbrella tiann#846 (cross-flavor agent budget gauges).
Builds on the Phase A Codex adapter (PR tiann#847) by adding a Claude adapter
that converts the SDK telemetry stream into the same AgentBudgetState
shape, so the existing AgentBudgetIndicator renderer needs no flavor
changes - the only flavor switch is in ComposerButtons.

Telemetry sources (all from @anthropic-ai/claude-code SDK message stream
observed in cli/src/claude/claudeRemoteLauncher.ts):
- rate_limit_event: status / resetsAt / utilization / rateLimitType for
  session_5h, weekly_max, and any future opaque types
- assistant.message.usage: per-turn input + cache_read + cache_creation
  for the context-window gauge
- result.modelUsage[model]: contextWindow + maxOutputTokens (so we never
  need a hard-coded model->window table) and per-turn token totals
- result.total_cost_usd: cumulative session cost

New shared shape (shared/src/schemas.ts):
- ClaudeRateLimitSchema (record over opaque rateLimitType for forward
  compat - no enum churn on new variants)
- ClaudeUsageSchema with contextWindow / rateLimits / modelUsage /
  totalCostUSD / resolvedModel
- Metadata.claudeUsage: optional ClaudeUsageSchema

Wire path:
- claudeRemoteLauncher.onMessage: extractClaudeUsageInput short-circuits
  on non-telemetry messages (text deltas, tool calls) so we don't take
  the metadata lock + socket roundtrip per SDK message
- normalizeClaudeUsage merges patches into session.metadata.claudeUsage
- session.client.updateMetadata broadcasts via the existing versioned
  metadata path - no new SSE events, no schema bump
- toClaudeBudgetState (web) maps into AgentBudgetState; ComposerButtons
  routes by agentFlavor === 'claude'

Effective state rules (Claude-specific - simpler than Codex):
- blocked: any rate limit status === 'rejected'
- red: any axis pressure >= 90
- amber: any axis pressure >= 60
- green: otherwise
(No credit-cover fallback - Claude bills by subscription only.)
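
The four rules above translate almost directly into code. This is a minimal sketch, not the toClaudeBudgetState implementation: it assumes axis pressures have already been computed as 0-100 percentages, and the function name and signature are hypothetical.

```typescript
type EffectiveState = 'green' | 'amber' | 'red' | 'blocked'

// axisPressures: percent used per axis (context window, each rate-limit window).
// rateLimitStatuses: the status field of each known rate limit.
function claudeEffectiveState(axisPressures: number[], rateLimitStatuses: string[]): EffectiveState {
    // A rejected rate limit wins over any percentage.
    if (rateLimitStatuses.some((status) => status === 'rejected')) return 'blocked'
    const worst = Math.max(0, ...axisPressures)
    if (worst >= 90) return 'red'
    if (worst >= 60) return 'amber'
    return 'green'
}
```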

Tests:
- cli: 19 tests for extractClaudeUsageInput + normalizeClaudeUsage +
  ingestClaudeSDKMessage
- web: 15 tests for toClaudeBudgetState covering thresholds, dominance,
  rejected state, unknown rate-limit types, metadata rows
- shared: 5 tests for ClaudeUsageSchema + Metadata round-trip

Includes a duplicate of shared/src/agentBudget.ts that also lives in
PR tiann#847; the duplicate is identical content so a merge after tiann#847 lands
is a trivial accept-either resolution. Standalone, this PR can land
before or after tiann#847 with the same outcome.

Co-authored-with-context: original codex indicator authored by
@dsus4wang in tiann#537 (carried forward as PR tiann#847).

Co-authored-by: Cursor <cursoragent@cursor.com>
…reset dates

The old heuristic (resetsAt * 1000 < 1e12) was inverted: current unix
seconds (~1.78e9) * 1000 = ~1.78e12 > 1e12, so the code took the 'treat
as ms' branch and fed ~1.78 billion ms to new Date(), yielding Jan 1970.

Fix: use resetsAt < 1e10 as the seconds-vs-ms threshold. Unix seconds stay
below 1e10 until the year 2286, while unix ms have been above 1e12 since
September 2001, so the two ranges cannot overlap for any realistic reset time.

Also change the reset display from absolute date ('Jan 21, 3:43 PM') to
relative time ('resets in 4h 32m') with the full absolute date surfaced as
a browser tooltip (title attribute) on hover. Adds detailTitle to
AgentBudgetAxis for this purpose and wires it in AgentBudgetIndicator.
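
A sketch of the threshold check and the relative formatting, assuming the caller passes the current time explicitly. The function names, the 'resets now' fallback for past timestamps, and the choice to omit the hour part under one hour are illustrative, not taken from the branch.

```typescript
// Values below 1e10 are treated as unix seconds, anything larger as ms.
function toEpochMs(resetsAt: number): number {
    return resetsAt < 1e10 ? resetsAt * 1000 : resetsAt
}

// Hypothetical formatter producing strings such as 'resets in 4h 32m'.
function formatRelativeReset(resetsAt: number, nowMs: number): string {
    const diffMs = toEpochMs(resetsAt) - nowMs
    // Assumption: past or immediate resets collapse to a fixed label.
    if (diffMs <= 0) return 'resets now'
    const totalMinutes = Math.floor(diffMs / 60000)
    const hours = Math.floor(totalMinutes / 60)
    const minutes = totalMinutes % 60
    return hours > 0 ? `resets in ${hours}h ${minutes}m` : `resets in ${minutes}m`
}
```

Passing nowMs in, rather than calling Date.now() inside, keeps the formatter deterministic in tests.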

Co-authored-by: Cursor <cursoragent@cursor.com>
SDK result.total_cost_usd is per-turn cost, not session-cumulative.
Similarly, result.modelUsage[model].inputTokens/outputTokens/etc are
per-turn counts. The previous implementation did a shallow merge (spread)
which overwrote with the latest turn's values, causing 'Cost (session)'
to actually show the last-turn cost.

Fix: accumulate totalCostUSD and all token counts across result messages.
contextWindow and maxOutputTokens are structural metadata - take latest.
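
The difference between the old shallow merge and the accumulating merge can be sketched like this. The state and turn shapes are simplified assumptions (the real ClaudeUsage carries more fields), and accumulateResult is a hypothetical name.

```typescript
// Simplified per-model usage; the real shape has more token counters.
interface ModelUsage {
    inputTokens: number
    outputTokens: number
    contextWindow?: number
    maxOutputTokens?: number
}

interface UsageState {
    totalCostUSD: number
    modelUsage: Record<string, ModelUsage>
}

// Subset of an SDK result message relevant to this sketch.
interface TurnResult {
    total_cost_usd?: number
    modelUsage?: Record<string, ModelUsage>
}

function accumulateResult(prev: UsageState, turn: TurnResult): UsageState {
    const modelUsage: Record<string, ModelUsage> = { ...prev.modelUsage }
    for (const [model, usage] of Object.entries(turn.modelUsage ?? {})) {
        const before = modelUsage[model]
        modelUsage[model] = {
            // Counters are per-turn, so they are summed.
            inputTokens: (before?.inputTokens ?? 0) + usage.inputTokens,
            outputTokens: (before?.outputTokens ?? 0) + usage.outputTokens,
            // Structural metadata: the latest value wins.
            contextWindow: usage.contextWindow ?? before?.contextWindow,
            maxOutputTokens: usage.maxOutputTokens ?? before?.maxOutputTokens,
        }
    }
    return { totalCostUSD: prev.totalCostUSD + (turn.total_cost_usd ?? 0), modelUsage }
}
```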

No hub restart needed (CLI-side change only; the hub just stores whatever
claudeUsage arrives via update-metadata).

Co-authored-by: Cursor <cursoragent@cursor.com>
…ed accumulation

Co-authored-by: Cursor <cursoragent@cursor.com>
@chatgpt-codex-connector


You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Co-authored-by: Cursor <cursoragent@cursor.com>
@heavygee heavygee force-pushed the feat/codex-usage-indicator-rebased branch 3 times, most recently from dc01f15 to 88a740b Compare June 18, 2026 16:33

2 participants