feat(claude): Claude rate-limit + context window budget indicator by heavygee · Pull Request #41 · heavygee/hapi

heavygee · 2026-06-12T21:04:46Z

Summary

Extends the cross-flavor AgentBudgetState budget gauge shape (introduced in tiann#847) with a Claude-specific adapter. This is a fork-local PR tracking the Claude work; intended to be promoted upstream once tiann#847 lands.

What this adds

- cli/src/claude/utils/claudeUsage.ts - normalises Claude SDK telemetry (SDKResultMessage, SDKAssistantMessage, rate-limit events) into a ClaudeUsage shape, accumulating totalCostUSD and token counts across turns
- shared/src/schemas.ts / shared/src/types.ts - ClaudeUsage, ClaudeRateLimit, ClaudeModelUsage types; metadata.claudeUsage field
- cli/src/claude/claudeRemoteLauncher.ts - hooks updateClaudeUsageMetadata to process SDK messages each turn
- web/src/components/AssistantChat/claudeBudgetAdapter.ts - maps ClaudeUsage → AgentBudgetState; derives effective state from rate-limit type, formats reset times as relative ("in 2h 14m") with absolute tooltip, adds cost caveat tooltip
- web/src/components/AssistantChat/ComposerButtons.tsx / HappyComposer.tsx / SessionChat.tsx - wires claudeUsage prop so the budget ring renders for Claude sessions alongside Codex

Design

Same single-ring metaphor as Codex: centre number = operational axis (context window %), ring colour = worst constraint across all axes (green → amber → red → blocked). Popover highlights which axis is dominant.

Claude-specific axes:

- Context window - input_tokens / context_window_tokens
- Rate limit - per-model input/output/request windows; colour driven by tightest window

Depends on

heavygee/hapi#847 (feat/codex-usage-indicator-rebased) - must merge first; this branch is rebased on top of it.

Test plan

- bun typecheck - clean
- bun run test - 997 CLI + 318 hub pass
- Manual: start a Claude session, open usage popover, verify context %, rate-limit rows, relative reset time, cost caveat tooltip

Made with Cursor

Ethan's indicator (tiann#537) was designed for time-window plans (plus / pro 5h+weekly). On Codex Pro accounts that exhaust the subscription windows AND any topped-up credits, the app-server emits rate_limits.primary=null + secondary=null + credits.has_credits=false + balance="0", and the indicator silently fell back to context-window-only - reading "80% context, plenty of room" while the account was actually blocked.

Extend the data path end-to-end:

- shared/schemas - add CodexUsageCreditsSchema (hasCredits / unlimited / balance) and optional rateLimitReachedType / planType / limitId on CodexUsageSchema. JSON-only, no SCHEMA_VERSION bump.
- cli/codexUsage - normalize credits + reached_type + plan_type + limit_id from the rate_limits root regardless of whether primary / secondary are present.
- web/codexUsageDisplay - add isCodexUsageBlocked() helper; force ring to 100% and color red when blocked; render a critical-severity "Credits" row with $balance + 'subscription / top-up exhausted' detail; render a critical-severity "Limit Reached" header when codex sets rate_limit_reached_type. Unlimited credit accounts read "Unlimited" and stay green.

Covered by 4 new cli tests (premium-credits shape from a real Codex Pro rollout, plus reached-type + unlimited-credits cases), 3 new web tests (blocked-state ring forcing, Limit Reached header, unlimited non-blocking), and 1 new shared schema test.

Co-authored-by: Cursor <cursoragent@cursor.com>

Codex sends 'balance' as a precision-preserving string ('250.0000000000', '0', '0.0000000000') with no declared unit. The previous render asserted USD with a $ prefix and dumped the string verbatim, producing the visually awful '$250.0000000000'.

Credits are an internal billing token per the OpenAI Codex rate card (https://help.openai.com/en/articles/20001106-codex-rate-card): GPT-5.5 consumes 125 credits per 1M input tokens / 750 per 1M output, and a $5 top-up grants 125 credits (~$0.04/credit, not the $1/credit the prior comment fabricated). Chatgpt.com's own UI even renders credits and any USD conversion separately ('246 credits, ~10-62 cloud messages'), so prefixing $ on the balance is a flat-out type error.

- formatCreditsBalance(): Number-parse + toLocaleString with 2dp cap for values >= 1 (4dp for sub-unit balances), trailing zeros trimmed. '250.0000000000' -> '250', '12.345' -> '12.35', '0.0000000000' -> '0'.
- Drop the $ prefix; the row label 'Credits' carries the unit.
- isCodexUsageBlocked / exhausted-detail check both now parse balance numerically instead of literal-matching '0' / '0.00', so future trailing-zero variants ('0.0000000000') cannot slip past.

Adds a 4-case parametrized test covering the real string shapes observed in the wild plus a decimal-rounding case.

Co-authored-by: Cursor <cursoragent@cursor.com>
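The formatting rule in the commit above can be sketched roughly as follows. This is an illustrative re-creation, not the PR's code: the 'en-US' locale, the handling of empty or unparseable input, and the helper name isBalanceExhausted are assumptions made for the example.

```typescript
// Hypothetical sketch of the balance formatter described in the commit message.
function formatCreditsBalance(raw: string): string {
  // Assumption: blank or unparseable input is returned untouched rather than shown as "NaN".
  if (raw.trim() === "") return raw;
  const value = Number(raw);
  if (!Number.isFinite(value)) return raw;
  // Cap at 2 decimal places for balances >= 1, 4 for sub-unit balances.
  const maximumFractionDigits = Math.abs(value) >= 1 ? 2 : 4;
  // minimumFractionDigits: 0 is what drops trailing zeros.
  // The 'en-US' locale is an assumption; the commit only says "toLocaleString".
  return value.toLocaleString("en-US", { minimumFractionDigits: 0, maximumFractionDigits });
}

// Hypothetical helper: numeric exhaustion check, so '0', '0.00' and
// '0.0000000000' are all treated the same way.
function isBalanceExhausted(raw: string): boolean {
  const value = Number(raw);
  return raw.trim() !== "" && Number.isFinite(value) && value <= 0;
}
```

Parsing before comparing is the point of the change: a literal match on '0' would miss any zero-valued string with a different number of trailing digits.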

Ethan's ring (PR tiann#537) preferred contextWindow.percent and only fell back to rate-limits when context was absent. Real consequence: weekly=100% but ctx=80% rendered as a green '80' ring, hiding a HARD subscription cap behind a SOFT context fill. Then when both windows AND credits exhausted, the blocked override jumped the ring to red 100 - so the same circle silently switched semantics from 'context fill' to 'usage exhaustion' mid-session. Operator caught it: 'the circle that WAS showing you context now shows you red 100 meaning no more usage'.

Make the ring mean one thing across all states: 'percent of the most-pressing limit you're about to hit'.

- New getCodexUsageRing(): max across context + 5h + weekly (blocked still forces 100). Returns { percent, axis } so callers can show which constraint is in front.
- getCodexUsageRingPercent() kept as a thin wrapper for any future callers that only need the number.
- getCodexUsageRingTitle(): axis-aware aria-label + title so hovering the ring tells you 'Weekly subscription window 100% used' instead of 'Codex usage' regardless of state.
- getCodexUsageRows() marks the dominant row (dominant: true). Popover paints a left-accent bar + bolds the matching label so opening it immediately answers 'why is the ring at 100?'.
- Ring colour gains an amber intermediate (60-85%) instead of jumping straight from blue to red at 85.

Replaces the 'prefers context' + 'falls back to rate-limits' tests with three new ones covering the bug shape (ctx 80 vs weekly 100), the inverse (context dominates), and dominant-row marking.

Co-authored-by: Cursor <cursoragent@cursor.com>
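A minimal sketch of the "max across axes" selection this commit describes. The input shape, the function name pickRing, and the "blocked" axis label are hypothetical; only the rule itself (largest percentage wins, blocked forces 100) comes from the commit message.

```typescript
// Hypothetical axis names and input shape, for illustration only.
type RingAxis = "context" | "fiveHour" | "weekly" | "blocked";

interface RingInput {
  contextPercent?: number;
  fiveHourPercent?: number;
  weeklyPercent?: number;
  blocked: boolean;
}

interface Ring {
  percent: number;
  axis: RingAxis;
}

function pickRing(input: RingInput): Ring | null {
  // A blocked account always reads 100, whatever the individual axes say.
  if (input.blocked) return { percent: 100, axis: "blocked" };

  const candidates: Array<[RingAxis, number | undefined]> = [
    ["context", input.contextPercent],
    ["fiveHour", input.fiveHourPercent],
    ["weekly", input.weeklyPercent],
  ];

  let best: Ring | null = null;
  for (const [axis, percent] of candidates) {
    if (percent === undefined) continue; // axis not reported yet
    // Strict ">" means the earlier axis wins a tie (an assumption of this sketch).
    if (best === null || percent > best.percent) best = { percent, axis };
  }
  return best; // null when no axis has reported any data
}
```

With the bug shape from the commit (context 80, weekly 100), this returns the weekly axis at 100 rather than the context axis at 80.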

…= effective state

Introduces a flavor-agnostic AgentBudgetState shape under shared/ and refactors the Codex indicator to consume it via a Codex-specific adapter. This is the seed of the cross-flavor agent budget gauge umbrella (tiann#846); Claude / Cursor / Gemini adapters can drop in without touching the renderer.

Why
---
The previous ring conflated two questions in one number:

1. 'How much room for THIS task?' (context fill)
2. 'Am I about to be blocked?' (rate-limit / credits state)

The number's semantics silently flipped between them based on account state - context% in the normal case, 100 in the blocked case - so the same circle meant different things at different times. Worse, a Pro account with subscription windows at 100% but credits available read red 100 (technically true: weekly is capped) when the operationally correct signal was amber (credits cover the overage, user is not actually blocked).

Design
------
- AgentBudgetState.operationalAxisId picks the always-visible centre number (defaults to context for LLM agents). Stays consistent across all states; the number no longer changes meaning.
- AgentBudgetState.effective is the per-flavor verdict (green / amber / red / blocked) computed by the adapter using its specific blocking rules. The renderer paints the ring colour from this; user gets one honest 'are you about to be blocked' signal alongside the operational centre number.
- Adapter encodes the Codex-Pro covering rule: subscription window at cap AND credits > 0 -> amber, not red. The popover marks the credits axis as 'covering' with a blue accent so the user can see why.
- Popover renders pressure axes top-to-bottom with the dominant one carrying a colour-coded left accent + bold label, then a divider, then the informational metadata rows (token breakdown etc).

Generalisation seed
-------------------
- shared/src/agentBudget.ts: flavor-agnostic types (AgentBudgetState / Axis / EffectiveState / MetadataRow).
- web/src/components/AssistantChat/AgentBudgetIndicator.tsx: renderer that consumes AgentBudgetState. Knows nothing about codex credits or claude rate-limit headers - drop in a new adapter and the indicator works for that flavor.
- web/src/components/AssistantChat/codexBudgetAdapter.ts: toCodexBudgetState() - the only Codex-flavor module under web/. All 5h / weekly / credits / plan_type semantics live here.

Removes the previous getCodexUsageRing / getCodexUsageRows / getCodexUsageRingTitle / CodexUsageRing / CodexUsageRow helpers (replaced by the adapter) plus their test file.

11 new tests in codexBudgetAdapter.test.ts cover the operator-caught scenarios: weekly at cap + credits covering -> amber, blocked, unlimited credits, fresh session before context arrives, etc.

Co-authored-by: Cursor <cursoragent@cursor.com>
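To make the covering rule concrete, here is a rough sketch of how an adapter might derive the effective state. Everything except the rule "window at cap AND credits > 0 -> amber" is an assumption: the CodexPressure shape and the function name are invented for the example, the 60 / 85 thresholds are borrowed from the earlier ring-colour commit, and treating "at cap with no credits" as blocked is a guess at the adapter's behaviour.

```typescript
type EffectiveState = "green" | "amber" | "red" | "blocked";

// Hypothetical input shape; the real adapter reads CodexUsage metadata.
interface CodexPressure {
  windowPercents: number[]; // 5h / weekly utilisation, 0-100
  creditsBalance: number;   // balance already parsed to a number
  unlimitedCredits: boolean;
}

function codexEffective(p: CodexPressure): EffectiveState {
  const worst = Math.max(0, ...p.windowPercents);
  const atCap = worst >= 100;
  const creditsCover = p.unlimitedCredits || p.creditsBalance > 0;

  if (atCap) {
    // Covering rule from the commit: credits absorb the overage, so the
    // user is not actually blocked and the ring reads amber, not red.
    // Assumption: with no credits left, an at-cap window means blocked.
    return creditsCover ? "amber" : "blocked";
  }
  // Assumed thresholds, taken from the 60-85% amber band mentioned earlier.
  if (worst >= 85) return "red";
  if (worst >= 60) return "amber";
  return "green";
}
```

Keeping this verdict in the adapter, separate from the centre number, is what lets the renderer stay flavor-agnostic: it only ever sees one of the four states.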

…ession

The SpawnHappySession RPC handler received importHistory from the hub (hub/src/sync/rpcGateway.ts forwards it) but dropped it before the spawnSession call - so remote web resumes with 'Import history' checked executed `hapi codex resume <thread>` without --hapi-import-history and silently skipped the history import.

Fix: destructure importHistory from params and pass it to spawnSession. Chain is now complete: web -> hub RPC -> machine handler -> spawnSession -> buildCliArgs (--hapi-import-history flag).

Fixes bot-review Major finding on tiann#847.

Co-authored-by: Cursor <cursoragent@cursor.com>

…r UX

Blocker: remove replayExistingEvents from local usage scanner. importCodexSessionHistory() already sends all user/agent messages to HAPI; setting replayExistingEvents:true on the local scanner caused every imported message to be sent twice (once via importHistory, once via the scanner's onEvent -> sendUserMessage/sendAgentMessage path). The scanner's role in the local launcher is live-tail only.

Minor: AgentBudgetAxisId union - use (string & {}) instead of string so IDE still completes the well-known ids ('context', 'fiveHour', etc.) while accepting arbitrary flavor-specific strings at compile time. Plain string collapsed the union and lost completions.

Minor: AgentBudgetIndicator popover - add pointerdown outside-click and Escape keyboard handlers so the popover closes without requiring a second button click.

Nit: formatCodexUsageReset - use explicit `<= 0` guard instead of falsy `!resetAt` to make the epoch-exclusion intent clear.

Co-authored-by: Cursor <cursoragent@cursor.com>
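The `(string & {})` idiom mentioned above is a general TypeScript pattern, sketched here outside the PR's code. The ids 'weekly' and 'credits' and the describeAxis function are placeholders for illustration; only 'context' and 'fiveHour' are named in the commit.

```typescript
// Well-known ids that editors should offer as completions.
// 'weekly' and 'credits' are placeholder ids for this sketch.
type WellKnownAxisId = "context" | "fiveHour" | "weekly" | "credits";

// `string & {}` still accepts every string, but the compiler does not
// collapse the union to plain `string`, so the literals stay visible to
// editor tooling. `WellKnownAxisId | string` would simplify to `string`.
type AxisId = WellKnownAxisId | (string & {});

function describeAxis(id: AxisId): string {
  switch (id) {
    case "context":
      return "Context window";
    case "fiveHour":
      return "5h window";
    case "weekly":
      return "Weekly window";
    case "credits":
      return "Credits";
    default:
      // Any flavor-specific id lands here and is still accepted at compile time.
      return `Custom axis: ${id}`;
  }
}
```

The trade-off is that misspelled well-known ids are no longer compile errors, since any string is valid; the gain is purely in editor completions.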

…pace-root machines

The listCodexSessions RPC handler returned every transcript under CODEX_HOME without checking workspace roots. On machines started with --workspace-root, this let the web Codex session picker enumerate sessions from projects outside the allowed roots - a privacy/scoping leak analogous to the existing guards on spawn and directory browsing.

Fix: add an optional pathAllowed callback to registerCodexSessionHandlers and listCodexSessions. apiMachine.ts wires in an isWithinWorkspaceRoots check when normalizedWorkspaceRoots is set; sessions with a null path are also excluded when the filter is active. When no workspace roots are configured, pathAllowed is undefined and the full list is returned unchanged (same behavior as before for single-machine installs).

Fixes bot-review Major finding on tiann#847 (follow-up review).

Co-authored-by: Cursor <cursoragent@cursor.com>
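A rough sketch of the optional-callback filter described above. The session shape, filterSessions, and makeWorkspaceRootCheck are hypothetical stand-ins; the real isWithinWorkspaceRoots presumably normalises paths, whereas this example assumes already-normalised POSIX-style paths and uses a simple prefix check.

```typescript
// Hypothetical session shape for this sketch.
interface CodexSessionSummary {
  id: string;
  path: string | null;
}

type PathAllowed = (path: string) => boolean;

function filterSessions(
  sessions: CodexSessionSummary[],
  pathAllowed?: PathAllowed,
): CodexSessionSummary[] {
  // No workspace roots configured: return the full list unchanged.
  if (!pathAllowed) return sessions;
  // Filter active: sessions without a path cannot be proven in-scope, so drop them.
  return sessions.filter((s) => s.path !== null && pathAllowed(s.path));
}

// Simplified stand-in for a workspace-root check (assumes normalised POSIX paths).
function makeWorkspaceRootCheck(roots: string[]): PathAllowed {
  const prefixes = roots.map((root) => (root.endsWith("/") ? root : `${root}/`));
  return (candidate) =>
    prefixes.some(
      // Match the root itself, or anything below it. The trailing slash
      // keeps "/work/ab" from matching the root "/work/a".
      (prefix) => candidate === prefix.slice(0, -1) || candidate.startsWith(prefix),
    );
}
```

Making the callback optional keeps single-machine installs on the old code path, while multi-root machines opt in by passing a check.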

…esume

When the user picks a Codex session from the history picker the spawn call still sent the form's directory input as `directory`, so resuming a session from /repo-a while the input showed /repo-b launched Codex in the wrong workspace with the wrong files.

Fix: resolve `selectedCodexSession.path` from the sessions list and use that as `spawnDirectory` when a Codex session is selected. Falls back to `trimmedDirectory` for plain new sessions.

Also: relax `canCreate` guard to allow spawn when a Codex session is selected even if the directory input is empty (the path comes from the session, not the form).

Fixes bot-review Major finding on tiann#847.

Co-authored-by: Cursor <cursoragent@cursor.com>

When a session-updated SSE event carries a metadata patch (e.g. a Codex history import setting codexSessionId or title), patchSessionSummary now maps it through toSummaryMetadata so the sessions list cache reflects name/path/summary/agentSessionId changes immediately.

Previously, patchSessionSummary returned `true` (patch applied) while silently ignoring the metadata payload, suppressing the queueSessionListInvalidation() fallback and leaving stale titles/paths in the list until the next full refetch.

Co-authored-by: Cursor <cursoragent@cursor.com>

Two bot-flagged Majors:

1. replayTranscriptHistoryOnStart: add !opts.importHistory guard so that when history is imported via importCodexSessionHistory() the transcript is not replayed a second time if the session is later handed off to local mode (codexLocalLauncher).
2. Stale Codex session selection: add useEffect in NewSession that clears selectedCodexSessionId whenever the selected id is no longer present in the filtered sessions list (e.g. after toggling showOldCodexSessions off), preventing handleCreate() from silently resuming a hidden thread.

Co-authored-by: Cursor <cursoragent@cursor.com>

useCodexSessions was discarding nextCursor after the first request, capping the resume/import selector at one page (100 sessions). Replace the single fetch with a do-while pagination loop so all matching threads are surfaced regardless of how many the CLI scanner has indexed.

Co-authored-by: Cursor <cursoragent@cursor.com>
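The do-while cursor loop described above might look roughly like this. The Page shape, the fetchPage callback, and the maxPages safety cap are assumptions for the sketch; the commit only states that nextCursor is now followed until exhausted.

```typescript
// Hypothetical page shape: a batch of items plus an opaque cursor.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

async function fetchAllPages<T>(
  fetchPage: (cursor: string | null) => Promise<Page<T>>,
  maxPages = 1000, // safety cap added for this sketch, not taken from the PR
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  let pages = 0;
  do {
    // The first request always runs (cursor === null), which is why
    // do-while fits better here than a plain while loop.
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.items);
    cursor = page.nextCursor;
    pages += 1;
  } while (cursor !== null && pages < maxPages);
  return all;
}
```

A caller would pass a function that wraps the actual session-list request; the loop itself does not need to know about page size.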

@dsus4wang

…et indicator

Phase B of umbrella tiann#846 (cross-flavor agent budget gauges). Builds on the Phase A Codex adapter (PR tiann#847) by adding a Claude adapter that converts the SDK telemetry stream into the same AgentBudgetState shape, so the existing AgentBudgetIndicator renderer needs no flavor changes - the only flavor switch is in ComposerButtons.

Telemetry sources (all from @anthropic-ai/claude-code SDK message stream observed in cli/src/claude/claudeRemoteLauncher.ts):

- rate_limit_event: status / resetsAt / utilization / rateLimitType for session_5h, weekly_max, and any future opaque types
- assistant.message.usage: per-turn input + cache_read + cache_creation for the context-window gauge
- result.modelUsage[model]: contextWindow + maxOutputTokens (so we never need a hard-coded model->window table) and per-turn token totals
- result.total_cost_usd: cumulative session cost

New shared shape (shared/src/schemas.ts):

- ClaudeRateLimitSchema (record over opaque rateLimitType for forward compat - no enum churn on new variants)
- ClaudeUsageSchema with contextWindow / rateLimits / modelUsage / totalCostUSD / resolvedModel
- Metadata.claudeUsage: optional ClaudeUsageSchema

Wire path:

- claudeRemoteLauncher.onMessage: extractClaudeUsageInput short-circuits on non-telemetry messages (text deltas, tool calls) so we don't take the metadata lock + socket roundtrip per SDK message
- normalizeClaudeUsage merges patches into session.metadata.claudeUsage
- session.client.updateMetadata broadcasts via the existing versioned metadata path - no new SSE events, no schema bump
- toClaudeBudgetState (web) maps into AgentBudgetState; ComposerButtons routes by agentFlavor === 'claude'

Effective state rules (Claude-specific - simpler than Codex):

- blocked: any rate limit status === 'rejected'
- red: any axis pressure >= 90
- amber: any axis pressure >= 60
- green: otherwise

(No credit-cover fallback - Claude bills by subscription only.)

Tests:

- cli: 19 tests for extractClaudeUsageInput + normalizeClaudeUsage + ingestClaudeSDKMessage
- web: 15 tests for toClaudeBudgetState covering thresholds, dominance, rejected state, unknown rate-limit types, metadata rows
- shared: 5 tests for ClaudeUsageSchema + Metadata round-trip

Includes a duplicate of shared/src/agentBudget.ts that also lives in PR tiann#847; the duplicate is identical content so a merge after tiann#847 lands is a trivial accept-either resolution. Standalone, this PR can land before or after tiann#847 with the same outcome.

Co-authored-with-context: original codex indicator authored by @dsus4wang in tiann#537 (carried forward as PR tiann#847).
Co-authored-by: Cursor <cursoragent@cursor.com>
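The effective-state rules and the context-window gauge from this commit can be sketched as below. The function names and the TurnUsage interface are hypothetical; the usage field names follow the Anthropic API's usage object as I understand it, and whether the adapter sums exactly these three fields is an assumption based on the "input + cache_read + cache_creation" wording above.

```typescript
type EffectiveState = "green" | "amber" | "red" | "blocked";

// Assumed per-turn usage fields (modelled on the Anthropic usage object).
interface TurnUsage {
  input_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Context fill as a percentage of the model's context window.
function contextPercent(usage: TurnUsage, contextWindow: number): number {
  if (contextWindow <= 0) return 0; // window not known yet
  const used =
    usage.input_tokens +
    (usage.cache_read_input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0);
  return (used / contextWindow) * 100;
}

// Thresholds as listed in the commit: rejected -> blocked, >= 90 red, >= 60 amber.
function claudeEffective(axisPercents: number[], rateLimitStatuses: string[]): EffectiveState {
  if (rateLimitStatuses.includes("rejected")) return "blocked";
  const worst = Math.max(0, ...axisPercents);
  if (worst >= 90) return "red";
  if (worst >= 60) return "amber";
  return "green";
}
```

Because the colour is driven by the worst axis, a session at 30% context but 95% of a rate-limit window still reads red, while the centre number keeps showing the context figure.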

…reset dates

The old heuristic (resetsAt * 1000 < 1e12) was inverted: current unix seconds (~1.78e9) * 1000 = ~1.78e12 > 1e12, so the code took the 'treat as ms' branch and fed ~1.78 billion ms to new Date(), yielding Jan 1970.

Fix: use resetsAt < 1e10 as the seconds-vs-ms threshold. Unix seconds for the next several decades stay below 2e9 << 1e10; unix ms stay above 1e12.

Also change the reset display from absolute date ('Jan 21, 3:43 PM') to relative time ('resets in 4h 32m') with the full absolute date surfaced as a browser tooltip (title attribute) on hover. Adds detailTitle to AgentBudgetAxis for this purpose and wires it in AgentBudgetIndicator.

Co-authored-by: Cursor <cursoragent@cursor.com>
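The corrected threshold and the relative formatting can be illustrated with a short sketch. The function names, the "resets now" wording for past timestamps, and the choice not to roll hours up into days are assumptions of this example.

```typescript
// Values below 1e10 are treated as unix seconds, anything larger as milliseconds.
// Current unix seconds are ~1.78e9 and current unix ms are ~1.78e12, so the
// 1e10 boundary sits comfortably between the two.
function toEpochMs(resetsAt: number): number {
  return resetsAt < 1e10 ? resetsAt * 1000 : resetsAt;
}

// nowMs is passed in explicitly so the function is deterministic and testable.
function formatRelativeReset(resetsAt: number, nowMs: number): string {
  const remainingMs = toEpochMs(resetsAt) - nowMs;
  if (remainingMs <= 0) return "resets now"; // assumed wording for past timestamps
  const totalMinutes = Math.floor(remainingMs / 60_000);
  const hours = Math.floor(totalMinutes / 60);
  const minutes = totalMinutes % 60;
  // Hours are not rolled up into days in this sketch.
  return hours > 0 ? `resets in ${hours}h ${minutes}m` : `resets in ${minutes}m`;
}
```

For the old bug, the same input of roughly 1.78e9 would have been passed to new Date() as milliseconds, which is about 20 days after the epoch and matches the 'Jan 21' date seen in the display.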

SDK result.total_cost_usd is per-turn cost, not session-cumulative. Similarly, result.modelUsage[model].inputTokens/outputTokens/etc are per-turn counts. The previous implementation did a shallow merge (spread) which overwrote with the latest turn's values, causing 'Cost (session)' to actually show the last-turn cost.

Fix: accumulate totalCostUSD and all token counts across result messages. contextWindow and maxOutputTokens are structural metadata - take latest.

No hub restart needed (CLI-side change only; the hub just stores whatever claudeUsage arrives via update-metadata).

Co-authored-by: Cursor <cursoragent@cursor.com>
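The difference between the shallow merge and the accumulating merge can be sketched as follows. The ModelUsage and UsageTotals shapes are trimmed-down, hypothetical versions of the real schema, and accumulateResult is an illustrative name.

```typescript
// Hypothetical, reduced shapes for this sketch.
interface ModelUsage {
  inputTokens: number;
  outputTokens: number;
  contextWindow?: number;
  maxOutputTokens?: number;
}

interface UsageTotals {
  totalCostUSD: number;
  modelUsage: Record<string, ModelUsage>;
}

// Fold one per-turn result into the running session totals.
function accumulateResult(prev: UsageTotals, turn: UsageTotals): UsageTotals {
  const modelUsage: Record<string, ModelUsage> = { ...prev.modelUsage };
  for (const [model, usage] of Object.entries(turn.modelUsage)) {
    const before: ModelUsage | undefined = modelUsage[model];
    modelUsage[model] = {
      // Counters are per-turn, so they are summed.
      inputTokens: (before?.inputTokens ?? 0) + usage.inputTokens,
      outputTokens: (before?.outputTokens ?? 0) + usage.outputTokens,
      // Structural metadata: prefer the latest value, keep the old one if absent.
      contextWindow: usage.contextWindow ?? before?.contextWindow,
      maxOutputTokens: usage.maxOutputTokens ?? before?.maxOutputTokens,
    };
  }
  return { totalCostUSD: prev.totalCostUSD + turn.totalCostUSD, modelUsage };
}
```

A plain `{ ...prev, ...turn }` spread would instead keep only the latest turn's cost and counters, which is the bug this commit fixes.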

…ed accumulation Co-authored-by: Cursor <cursoragent@cursor.com>

chatgpt-codex-connector · 2026-06-12T21:04:50Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Co-authored-by: Cursor <cursoragent@cursor.com>

dsus4wang and others added 18 commits June 8, 2026 19:50

Add Codex session list/resume flow to web new-session UI

8f5f164

restore Codex session history

3f4f33d

add Codex usage indicator

2f32998

feat(claude-indicator): tooltip on Cost row clarifies connection-scop…

4adfe79

…ed accumulation Co-authored-by: Cursor <cursoragent@cursor.com>

fix(claude-indicator): round cost to 2dp ($10.11 not $10.1073)

b9b139f

Co-authored-by: Cursor <cursoragent@cursor.com>

heavygee force-pushed the feat/codex-usage-indicator-rebased branch 3 times, most recently from dc01f15 to 88a740b Compare June 18, 2026 16:33

heavygee wants to merge 19 commits into feat/codex-usage-indicator-rebased from feat/claude-budget-adapter


2 participants
