feat(claude): Claude rate-limit + context window budget indicator#41
Open
heavygee wants to merge 19 commits into
Ethan's indicator (tiann#537) was designed for time-window plans (plus / pro 5h+weekly). On Codex Pro accounts that exhaust the subscription windows AND any topped-up credits, the app-server emits rate_limits.primary=null + secondary=null + credits.has_credits=false + balance="0", and the indicator silently fell back to context-window-only - reading "80% context, plenty of room" while the account was actually blocked.

Extend the data path end-to-end:
- shared/schemas - add CodexUsageCreditsSchema (hasCredits / unlimited / balance) and optional rateLimitReachedType / planType / limitId on CodexUsageSchema. JSON-only, no SCHEMA_VERSION bump.
- cli/codexUsage - normalize credits + reached_type + plan_type + limit_id from the rate_limits root regardless of whether primary / secondary are present.
- web/codexUsageDisplay - add isCodexUsageBlocked() helper; force ring to 100% and color red when blocked; render a critical-severity "Credits" row with $balance + 'subscription / top-up exhausted' detail; render a critical-severity "Limit Reached" header when codex sets rate_limit_reached_type.

Unlimited credit accounts read "Unlimited" and stay green.

Covered by 4 new cli tests (premium-credits shape from a real Codex Pro rollout, plus reached-type + unlimited-credits cases), 3 new web tests (blocked-state ring forcing, Limit Reached header, unlimited non-blocking), and 1 new shared schema test.

Co-authored-by: Cursor <cursoragent@cursor.com>
Codex sends 'balance' as a precision-preserving string ('250.0000000000',
'0', '0.0000000000') with no declared unit. The previous render asserted
USD with a $ prefix and dumped the string verbatim, producing the
visually awful '$250.0000000000'.
Credits are an internal billing token per the OpenAI Codex rate card
(https://help.openai.com/en/articles/20001106-codex-rate-card): GPT-5.5
consumes 125 credits per 1M input tokens / 750 per 1M output, and a $5
top-up grants 125 credits (~$0.04/credit, not the $1/credit the prior
comment fabricated). Chatgpt.com's own UI even renders credits and any
USD conversion separately ('246 credits, ~10-62 cloud messages'), so
prefixing $ on the balance is a flat-out type error.
- formatCreditsBalance(): Number-parse + toLocaleString with 2dp cap
for values >= 1 (4dp for sub-unit balances), trailing zeros trimmed.
'250.0000000000' -> '250', '12.345' -> '12.35', '0.0000000000' -> '0'.
- Drop the $ prefix; the row label 'Credits' carries the unit.
- isCodexUsageBlocked / exhausted-detail check both now parse balance
numerically instead of literal-matching '0' / '0.00', so future
trailing-zero variants ('0.0000000000') cannot slip past.
Adds a 4-case parametrized test covering the real string shapes observed
in the wild plus a decimal-rounding case.
Co-authored-by: Cursor <cursoragent@cursor.com>
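
For readers following the formatting rules in this commit, here is a minimal sketch of what such a formatter could look like. It is not the PR's code: the `en-US` locale pin, the handling of unparseable input, and the helper name `isBalanceExhausted` are assumptions made for illustration.

```typescript
// Hypothetical re-creation of the formatter described in the commit message.
function formatCreditsBalance(raw: string): string {
  const value = Number(raw);
  // Assumption: unparseable or empty input is returned untouched rather than guessed at.
  if (raw.trim() === "" || !Number.isFinite(value)) return raw;
  // 2 decimal places for balances >= 1, 4 for sub-unit balances.
  const maximumFractionDigits = Math.abs(value) >= 1 ? 2 : 4;
  // minimumFractionDigits: 0 is what trims the trailing zeros.
  return value.toLocaleString("en-US", { minimumFractionDigits: 0, maximumFractionDigits });
}

// Hypothetical helper: a numeric check, so "0", "0.00" and "0.0000000000"
// are all treated as an empty balance instead of literal-matching strings.
function isBalanceExhausted(raw: string): boolean {
  const value = Number(raw);
  return raw.trim() !== "" && Number.isFinite(value) && value <= 0;
}
```

Parsing once and comparing numerically keeps the display path and the blocked check in agreement on what counts as zero.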
Ethan's ring (PR tiann#537) preferred contextWindow.percent and only fell back to rate-limits when context was absent. Real consequence: weekly=100% but ctx=80% rendered as a green '80' ring, hiding a HARD subscription cap behind a SOFT context fill. Then when both windows AND credits exhausted, the blocked override jumped the ring to red 100 - so the same circle silently switched semantics from 'context fill' to 'usage exhaustion' mid-session. Operator caught it: 'the circle that WAS showing you context now shows you red 100 meaning no more usage'.

Make the ring mean one thing across all states: 'percent of the most-pressing limit you're about to hit'.
- New getCodexUsageRing(): max across context + 5h + weekly (blocked still forces 100). Returns { percent, axis } so callers can show which constraint is in front.
- getCodexUsageRingPercent() kept as a thin wrapper for any future callers that only need the number.
- getCodexUsageRingTitle(): axis-aware aria-label + title so hovering the ring tells you 'Weekly subscription window 100% used' instead of 'Codex usage' regardless of state.
- getCodexUsageRows() marks the dominant row (dominant: true). Popover paints a left-accent bar + bolds the matching label so opening it immediately answers 'why is the ring at 100?'.
- Ring colour gains an amber intermediate (60-85%) instead of jumping straight from blue to red at 85.

Replaces the 'prefers context' + 'falls back to rate-limits' tests with three new ones covering the bug shape (ctx 80 vs weekly 100), the inverse (context dominates), and dominant-row marking.

Co-authored-by: Cursor <cursoragent@cursor.com>
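
The "max across axes" rule in this commit can be sketched roughly as follows. The input shape, the function names `getRing` and `ringColour`, and the exact boundary handling at 60 and 85 are assumptions for illustration, not the PR's implementation.

```typescript
type RingAxis = "context" | "fiveHour" | "weekly" | "blocked";

// Hypothetical input shape; the real schema fields may differ.
interface RingInput {
  contextPercent?: number;
  fiveHourPercent?: number;
  weeklyPercent?: number;
  blocked: boolean;
}

interface Ring {
  percent: number;
  axis: RingAxis;
}

// Pick the most-pressing limit; a blocked account still forces 100.
function getRing(input: RingInput): Ring | null {
  if (input.blocked) return { percent: 100, axis: "blocked" };
  const candidates: Array<[RingAxis, number | undefined]> = [
    ["context", input.contextPercent],
    ["fiveHour", input.fiveHourPercent],
    ["weekly", input.weeklyPercent],
  ];
  let best: Ring | null = null;
  for (const [axis, percent] of candidates) {
    if (percent === undefined) continue;
    if (best === null || percent > best.percent) best = { percent, axis };
  }
  return best; // null when no axis has reported yet
}

// Assumption: amber covers [60, 85) and red starts at 85.
function ringColour(percent: number): "blue" | "amber" | "red" {
  if (percent >= 85) return "red";
  if (percent >= 60) return "amber";
  return "blue";
}
```

With ctx=80 and weekly=100 this sketch reports the weekly axis at 100, which is the bug shape the new tests cover.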
…= effective state

Introduces a flavor-agnostic AgentBudgetState shape under shared/ and refactors the Codex indicator to consume it via a Codex-specific adapter. This is the seed of the cross-flavor agent budget gauge umbrella (tiann#846); Claude / Cursor / Gemini adapters can drop in without touching the renderer.

Why
---
The previous ring conflated two questions in one number:
1. 'How much room for THIS task?' (context fill)
2. 'Am I about to be blocked?' (rate-limit / credits state)

The number's semantics silently flipped between them based on account state - context% in the normal case, 100 in the blocked case - so the same circle meant different things at different times. Worse, a Pro account with subscription windows at 100% but credits available read red 100 (technically true: weekly is capped) when the operationally correct signal was amber (credits cover the overage, user is not actually blocked).

Design
------
- AgentBudgetState.operationalAxisId picks the always-visible centre number (defaults to context for LLM agents). Stays consistent across all states; the number no longer changes meaning.
- AgentBudgetState.effective is the per-flavor verdict (green / amber / red / blocked) computed by the adapter using its specific blocking rules. The renderer paints the ring colour from this; user gets one honest 'are you about to be blocked' signal alongside the operational centre number.
- Adapter encodes the Codex-Pro covering rule: subscription window at cap AND credits > 0 -> amber, not red. The popover marks the credits axis as 'covering' with a blue accent so the user can see why.
- Popover renders pressure axes top-to-bottom with the dominant one carrying a colour-coded left accent + bold label, then a divider, then the informational metadata rows (token breakdown etc).

Generalisation seed
-------------------
- shared/src/agentBudget.ts: flavor-agnostic types (AgentBudgetState / Axis / EffectiveState / MetadataRow).
- web/src/components/AssistantChat/AgentBudgetIndicator.tsx: renderer that consumes AgentBudgetState. Knows nothing about codex credits or claude rate-limit headers - drop in a new adapter and the indicator works for that flavor.
- web/src/components/AssistantChat/codexBudgetAdapter.ts: toCodexBudgetState() - the only Codex-flavor module under web/. All 5h / weekly / credits / plan_type semantics live here.

Removes the previous getCodexUsageRing / getCodexUsageRows / getCodexUsageRingTitle / CodexUsageRing / CodexUsageRow helpers (replaced by the adapter) plus their test file.

11 new tests in codexBudgetAdapter.test.ts cover the operator-caught scenarios: weekly at cap + credits covering -> amber, blocked, unlimited credits, fresh session before context arrives, etc.

Co-authored-by: Cursor <cursoragent@cursor.com>
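
As an illustration of the covering rule described in this commit, a small sketch of how an adapter might derive the effective state. The input shape and function name are hypothetical; the "no credits at cap means blocked" branch and the 60 / 85 thresholds below the cap are assumptions, since the commit message only spells out the amber-when-covered case.

```typescript
type EffectiveState = "green" | "amber" | "red" | "blocked";

// Hypothetical adapter input, not the PR's CodexUsage schema.
interface CodexBudgetInput {
  windowPercents: number[]; // 5h / weekly utilisation, 0-100
  creditsBalance: number;
  unlimitedCredits: boolean;
}

function codexEffectiveState(input: CodexBudgetInput): EffectiveState {
  const worst = Math.max(0, ...input.windowPercents);
  const hasCredits = input.unlimitedCredits || input.creditsBalance > 0;
  if (worst >= 100) {
    // Covering rule from the commit: window at cap AND credits > 0 -> amber, not red.
    // Assumption: with no credits left, the account is treated as blocked.
    return hasCredits ? "amber" : "blocked";
  }
  // Assumption: below the cap, reuse the 60 / 85 split of the earlier ring colours.
  if (worst >= 85) return "red";
  if (worst >= 60) return "amber";
  return "green";
}
```

Keeping this verdict in the adapter is what lets the renderer stay flavor-agnostic: it only ever sees one of the four states.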
…ession

The SpawnHappySession RPC handler received importHistory from the hub (hub/src/sync/rpcGateway.ts forwards it) but dropped it before the spawnSession call - so remote web resumes with 'Import history' checked executed `hapi codex resume <thread>` without --hapi-import-history and silently skipped the history import.

Fix: destructure importHistory from params and pass it to spawnSession. Chain is now complete: web -> hub RPC -> machine handler -> spawnSession -> buildCliArgs (--hapi-import-history flag).

Fixes bot-review Major finding on tiann#847.

Co-authored-by: Cursor <cursoragent@cursor.com>
…r UX
Blocker: remove replayExistingEvents from local usage scanner.
importCodexSessionHistory() already sends all user/agent messages to
HAPI; setting replayExistingEvents:true on the local scanner caused
every imported message to be sent twice (once via importHistory, once
via the scanner's onEvent -> sendUserMessage/sendAgentMessage path).
The scanner's role in the local launcher is live-tail only.
Minor: AgentBudgetAxisId union - use (string & {}) instead of string
so IDE still completes the well-known ids ('context', 'fiveHour', etc.)
while accepting arbitrary flavor-specific strings at compile time.
Plain string collapsed the union and lost completions.
Minor: AgentBudgetIndicator popover - add pointerdown outside-click and
Escape keyboard handlers so the popover closes without requiring a
second button click.
Nit: formatCodexUsageReset - use explicit `<= 0` guard instead of
falsy `!resetAt` to make the epoch-exclusion intent clear.
Co-authored-by: Cursor <cursoragent@cursor.com>
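
The `(string & {})` trick from the AgentBudgetAxisId note can be shown in isolation. The ids `'context'` and `'fiveHour'` come from the commit message; `'weekly'`, `'credits'`, the example custom id, and the `describeAxis` helper are assumptions added for illustration.

```typescript
// Well-known ids; 'weekly' and 'credits' are assumed members of the list.
type WellKnownAxisId = "context" | "fiveHour" | "weekly" | "credits";

// `string & {}` still accepts any string, but it stops TypeScript from
// collapsing the union to plain `string`, so editors keep suggesting the literals.
type AgentBudgetAxisId = WellKnownAxisId | (string & {});

const WELL_KNOWN: ReadonlyArray<string> = ["context", "fiveHour", "weekly", "credits"];

// Hypothetical helper, only here to exercise both kinds of id.
function describeAxis(id: AgentBudgetAxisId): string {
  return WELL_KNOWN.indexOf(id) >= 0 ? `well-known: ${id}` : `custom: ${id}`;
}

const builtIn: AgentBudgetAxisId = "context"; // offered by IDE completion
const custom: AgentBudgetAxisId = "claude:weekly_max"; // hypothetical flavor-specific id, still type-checks
```

Both assignments compile; the difference from a plain `string` union shows up only in editor completions, not at runtime.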
…pace-root machines

The listCodexSessions RPC handler returned every transcript under CODEX_HOME without checking workspace roots. On machines started with --workspace-root, this let the web Codex session picker enumerate sessions from projects outside the allowed roots - a privacy/scoping leak analogous to the existing guards on spawn and directory browsing.

Fix: add an optional pathAllowed callback to registerCodexSessionHandlers and listCodexSessions. apiMachine.ts wires in an isWithinWorkspaceRoots check when normalizedWorkspaceRoots is set; sessions with a null path are also excluded when the filter is active. When no workspace roots are configured, pathAllowed is undefined and the full list is returned unchanged (same behavior as before for single-machine installs).

Fixes bot-review Major finding on tiann#847 (follow-up review).

Co-authored-by: Cursor <cursoragent@cursor.com>
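
A rough sketch of the optional-callback filter described above. The session shape and both function names are hypothetical, and the root check is a simplified prefix match on already-normalized POSIX-style paths (it does not resolve `..` or symlinks), so treat it as an illustration of the control flow only.

```typescript
// Hypothetical session summary; the real shape likely has more fields.
interface CodexSessionSummary {
  id: string;
  path: string | null;
}

type PathAllowed = (path: string) => boolean;

// With no callback the list is returned unchanged; with one, sessions
// outside the allowed roots and sessions with a null path are dropped.
function filterSessions(
  sessions: CodexSessionSummary[],
  pathAllowed?: PathAllowed,
): CodexSessionSummary[] {
  if (!pathAllowed) return sessions;
  return sessions.filter((s) => s.path !== null && pathAllowed(s.path));
}

// Assumption: roots and paths are already normalized absolute paths.
function makeWorkspaceRootCheck(roots: string[]): PathAllowed {
  return (candidate) =>
    roots.some((root) => {
      const prefix = root.endsWith("/") ? root : root + "/";
      return candidate === root || candidate.startsWith(prefix);
    });
}
```

Appending the separator before the prefix test avoids treating `/repo-ab` as being inside `/repo-a`.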
…esume

When the user picks a Codex session from the history picker the spawn call still sent the form's directory input as `directory`, so resuming a session from /repo-a while the input showed /repo-b launched Codex in the wrong workspace with the wrong files.

Fix: resolve `selectedCodexSession.path` from the sessions list and use that as `spawnDirectory` when a Codex session is selected. Falls back to `trimmedDirectory` for plain new sessions.

Also: relax `canCreate` guard to allow spawn when a Codex session is selected even if the directory input is empty (the path comes from the session, not the form).

Fixes bot-review Major finding on tiann#847.

Co-authored-by: Cursor <cursoragent@cursor.com>
When a session-updated SSE event carries a metadata patch (e.g. a Codex history import setting codexSessionId or title), patchSessionSummary now maps it through toSummaryMetadata so the sessions list cache reflects name/path/summary/agentSessionId changes immediately.

Previously, patchSessionSummary returned `true` (patch applied) while silently ignoring the metadata payload, suppressing the queueSessionListInvalidation() fallback and leaving stale titles/paths in the list until the next full refetch.

Co-authored-by: Cursor <cursoragent@cursor.com>
Two bot-flagged Majors:
1. replayTranscriptHistoryOnStart: add !opts.importHistory guard so that when history is imported via importCodexSessionHistory() the transcript is not replayed a second time if the session is later handed off to local mode (codexLocalLauncher).
2. Stale Codex session selection: add useEffect in NewSession that clears selectedCodexSessionId whenever the selected id is no longer present in the filtered sessions list (e.g. after toggling showOldCodexSessions off), preventing handleCreate() from silently resuming a hidden thread.

Co-authored-by: Cursor <cursoragent@cursor.com>
useCodexSessions was discarding nextCursor after the first request, capping the resume/import selector at one page (100 sessions). Replace the single fetch with a do-while pagination loop so all matching threads are surfaced regardless of how many the CLI scanner has indexed.

Co-authored-by: Cursor <cursoragent@cursor.com>
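
The do-while pagination loop mentioned in this commit might look roughly like the sketch below. The `Page` shape, the `fetchPage` callback, and the function name are assumptions; the real hook talks to the hub API and presumably handles errors and cancellation as well.

```typescript
// Hypothetical page shape: items plus an opaque cursor, null when exhausted.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

type FetchPage<T> = (cursor: string | null) => Promise<Page<T>>;

// Follow nextCursor until the server stops returning one.
async function fetchAllPages<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  do {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor !== null);
  return all;
}
```

A do-while fits here because the first request always has to happen, even though no cursor exists yet.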
…et indicator

Phase B of umbrella tiann#846 (cross-flavor agent budget gauges). Builds on the Phase A Codex adapter (PR tiann#847) by adding a Claude adapter that converts the SDK telemetry stream into the same AgentBudgetState shape, so the existing AgentBudgetIndicator renderer needs no flavor changes - the only flavor switch is in ComposerButtons.

Telemetry sources (all from @anthropic-ai/claude-code SDK message stream observed in cli/src/claude/claudeRemoteLauncher.ts):
- rate_limit_event: status / resetsAt / utilization / rateLimitType for session_5h, weekly_max, and any future opaque types
- assistant.message.usage: per-turn input + cache_read + cache_creation for the context-window gauge
- result.modelUsage[model]: contextWindow + maxOutputTokens (so we never need a hard-coded model->window table) and per-turn token totals
- result.total_cost_usd: cumulative session cost

New shared shape (shared/src/schemas.ts):
- ClaudeRateLimitSchema (record over opaque rateLimitType for forward compat - no enum churn on new variants)
- ClaudeUsageSchema with contextWindow / rateLimits / modelUsage / totalCostUSD / resolvedModel
- Metadata.claudeUsage: optional ClaudeUsageSchema

Wire path:
- claudeRemoteLauncher.onMessage: extractClaudeUsageInput short-circuits on non-telemetry messages (text deltas, tool calls) so we don't take the metadata lock + socket roundtrip per SDK message
- normalizeClaudeUsage merges patches into session.metadata.claudeUsage
- session.client.updateMetadata broadcasts via the existing versioned metadata path - no new SSE events, no schema bump
- toClaudeBudgetState (web) maps into AgentBudgetState; ComposerButtons routes by agentFlavor === 'claude'

Effective state rules (Claude-specific - simpler than Codex):
- blocked: any rate limit status === 'rejected'
- red: any axis pressure >= 90
- amber: any axis pressure >= 60
- green: otherwise
(No credit-cover fallback - Claude bills by subscription only.)

Tests:
- cli: 19 tests for extractClaudeUsageInput + normalizeClaudeUsage + ingestClaudeSDKMessage
- web: 15 tests for toClaudeBudgetState covering thresholds, dominance, rejected state, unknown rate-limit types, metadata rows
- shared: 5 tests for ClaudeUsageSchema + Metadata round-trip

Includes a duplicate of shared/src/agentBudget.ts that also lives in PR tiann#847; the duplicate is identical content so a merge after tiann#847 lands is a trivial accept-either resolution. Standalone, this PR can land before or after tiann#847 with the same outcome.

Co-authored-with-context: original codex indicator authored by @dsus4wang in tiann#537 (carried forward as PR tiann#847).

Co-authored-by: Cursor <cursoragent@cursor.com>
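
The Claude effective-state rules listed in this commit translate fairly directly into code. The sketch below is an illustration only: the function name and the flat input (axis pressures on a 0-100 scale plus a list of rate-limit statuses) are assumptions, not the signature of toClaudeBudgetState.

```typescript
type EffectiveState = "green" | "amber" | "red" | "blocked";

// Hypothetical flattened input: one pressure per axis (0-100) and the
// status string of every rate-limit entry seen so far.
function claudeEffectiveState(
  axisPressures: number[],
  rateLimitStatuses: string[],
): EffectiveState {
  // blocked: any rate limit status === 'rejected'
  if (rateLimitStatuses.some((s) => s === "rejected")) return "blocked";
  const worst = Math.max(0, ...axisPressures);
  if (worst >= 90) return "red"; // red: any axis pressure >= 90
  if (worst >= 60) return "amber"; // amber: any axis pressure >= 60
  return "green";
}
```

Checking the rejected status first means a blocked account reads as blocked even when every utilisation figure is low.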
…reset dates
The old heuristic (resetsAt * 1000 < 1e12) was inverted: current unix
seconds (~1.78e9) * 1000 = ~1.78e12 > 1e12, so the code took the 'treat
as ms' branch and fed ~1.78 billion ms to new Date(), yielding Jan 1970.
Fix: use resetsAt < 1e10 as the seconds-vs-ms threshold. Unix seconds stay
below 1e10 until the year 2286; unix ms have been above 1e12 since 2001.
Also change the reset display from absolute date ('Jan 21, 3:43 PM') to
relative time ('resets in 4h 32m') with the full absolute date surfaced as
a browser tooltip (title attribute) on hover. Adds detailTitle to
AgentBudgetAxis for this purpose and wires it in AgentBudgetIndicator.
Co-authored-by: Cursor <cursoragent@cursor.com>
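
To make the threshold fix concrete, here is a small sketch of the seconds-vs-ms normalisation and the relative formatting. Function names, the "resets now" fallback for past timestamps, and the exact output wording are assumptions; only the `resetsAt < 1e10` test and the 'resets in 4h 32m' style come from the commit message.

```typescript
// Values below 1e10 are treated as unix seconds, anything larger as unix ms.
function resetsAtToDate(resetsAt: number): Date {
  const ms = resetsAt < 1e10 ? resetsAt * 1000 : resetsAt;
  return new Date(ms);
}

// Relative label for the row; the absolute date would go in the tooltip.
function formatRelativeReset(resetAt: Date, now: Date): string {
  const totalMinutes = Math.floor((resetAt.getTime() - now.getTime()) / 60000);
  // Assumption: a reset time in the past is shown as "resets now".
  if (totalMinutes <= 0) return "resets now";
  const hours = Math.floor(totalMinutes / 60);
  const minutes = totalMinutes % 60;
  return hours > 0 ? `resets in ${hours}h ${minutes}m` : `resets in ${minutes}m`;
}
```

Passing `now` in explicitly keeps the formatter deterministic, which makes it easy to unit-test.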
SDK result.total_cost_usd is per-turn cost, not session-cumulative. Similarly, result.modelUsage[model].inputTokens/outputTokens/etc are per-turn counts. The previous implementation did a shallow merge (spread) which overwrote with the latest turn's values, causing 'Cost (session)' to actually show the last-turn cost.

Fix: accumulate totalCostUSD and all token counts across result messages. contextWindow and maxOutputTokens are structural metadata - take latest.

No hub restart needed (CLI-side change only; the hub just stores whatever claudeUsage arrives via update-metadata).

Co-authored-by: Cursor <cursoragent@cursor.com>
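
The difference between the old shallow merge and the accumulating merge can be sketched as follows. The type names, the reduced field set (only input and output tokens), and the function name are assumptions for illustration; the real normalizeClaudeUsage covers more counters.

```typescript
// Hypothetical, reduced per-model usage shape.
interface ModelUsage {
  inputTokens: number;
  outputTokens: number;
  contextWindow?: number;
  maxOutputTokens?: number;
}

interface ClaudeUsageTotals {
  totalCostUSD: number;
  modelUsage: Record<string, ModelUsage>;
}

// Per-turn values as they arrive on a result message.
interface ResultPatch {
  total_cost_usd?: number;
  modelUsage?: Record<string, ModelUsage>;
}

function accumulateResult(prev: ClaudeUsageTotals, patch: ResultPatch): ClaudeUsageTotals {
  const modelUsage: Record<string, ModelUsage> = { ...prev.modelUsage };
  for (const [model, turn] of Object.entries(patch.modelUsage ?? {})) {
    const before: ModelUsage | undefined = modelUsage[model];
    modelUsage[model] = {
      // Counters are per-turn, so they are summed across turns.
      inputTokens: (before?.inputTokens ?? 0) + turn.inputTokens,
      outputTokens: (before?.outputTokens ?? 0) + turn.outputTokens,
      // Structural metadata: take the latest value when present.
      contextWindow: turn.contextWindow ?? before?.contextWindow,
      maxOutputTokens: turn.maxOutputTokens ?? before?.maxOutputTokens,
    };
  }
  return { totalCostUSD: prev.totalCostUSD + (patch.total_cost_usd ?? 0), modelUsage };
}
```

A plain `{ ...prev, ...patch }` would have replaced the totals with the last turn's numbers, which is the bug this commit describes.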
…ed accumulation

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Force-pushed from dc01f15 to 88a740b.
Summary
Extends the cross-flavor `AgentBudgetState` budget gauge shape (introduced in tiann#847) with a Claude-specific adapter. This is a fork-local PR tracking the Claude work; intended to be promoted upstream once tiann#847 lands.

What this adds
- `cli/src/claude/utils/claudeUsage.ts` - normalises Claude SDK telemetry (`SDKResultMessage`, `SDKAssistantMessage`, rate-limit events) into a `ClaudeUsage` shape, accumulating `totalCostUSD` and token counts across turns
- `shared/src/schemas.ts` / `shared/src/types.ts` - `ClaudeUsage`, `ClaudeRateLimit`, `ClaudeModelUsage` types; `metadata.claudeUsage` field
- `cli/src/claude/claudeRemoteLauncher.ts` - hooks `updateClaudeUsageMetadata` to process SDK messages each turn
- `web/src/components/AssistantChat/claudeBudgetAdapter.ts` - maps `ClaudeUsage` → `AgentBudgetState`; derives effective state from rate-limit type, formats reset times as relative ("in 2h 14m") with absolute tooltip, adds cost caveat tooltip
- `web/src/components/AssistantChat/ComposerButtons.tsx` / `HappyComposer.tsx` / `SessionChat.tsx` - wires `claudeUsage` prop so the budget ring renders for Claude sessions alongside Codex

Design
Same single-ring metaphor as Codex: centre number = operational axis (context window %), ring colour = worst constraint across all axes (green → amber → red → blocked). Popover highlights which axis is dominant.

Claude-specific axes:
input_tokens / context_window_tokens

Depends on
heavygee/hapi#847 (`feat/codex-usage-indicator-rebased`) - must merge first; this branch is rebased on top of it.

Test plan
- `bun typecheck` - clean
- `bun run test` - 997 CLI + 318 hub pass

Made with Cursor