fix: address bugs found in a multi-agent codebase audit#481
Conversation
Five parallel review agents each swept a different area (self-update, TUI keybindings, agent/session handling, sandbox/shell safety, and auth/credential storage) and surfaced nine concrete defects, all fixed here with regression tests: - sandbox: grant_scope path comparisons were case-sensitive, letting a persisted deny grant be bypassed by spelling a Windows path with different case - securefile/oauth: createSecretFile's Windows ERROR_ACCESS_DENIED handling was only applied to the oauth token store, not the shared credstore backend; both also masked the real lock-contention error behind a misleading "file does not exist" timeout - cli: firstUsableProvider ignored OAuth logins when picking a fallback provider, forcing unnecessary re-onboarding for an already-authenticated user - sessions: exec prompt summarization truncated on a raw byte offset, which could split a multi-byte UTF-8 rune and embed invalid UTF-8 into a resumed session's prompt - tui: configurable keybindings had no collision detection, so a remapped chord could silently make a hardcoded shortcut (or another configured binding) permanently unreachable - tools: the Windows shell-syntax pre-flight check wasn't quote-aware and could block harmless commands whose quoted arguments merely contained operator-like text - update: checksum verification didn't cross-check the archive filename it verified against the one being extracted; extractZip didn't reject symlink-mode entries like extractTarGz already did; replaceBinary's restore-on-failure rename had no retry, risking a permanently missing binary on a transient Windows file lock
WalkthroughThis PR fixes nine bugs from a multi-agent codebase audit: OAuth-aware provider fallback, Windows lock-contention handling in oauth/securefile, case-insensitive sandbox scope matching, UTF-8-safe payload truncation, quote-aware shell command detection, keybinding collision sanitization, and update/self-replace checksum, zip-entry, and rename safety checks. Each fix includes regression tests. ChangesMulti-area audit bug fixes
Estimated code review effort: 3 (Moderate) | ~30 minutes Possibly related PRs
Suggested reviewers: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/cli/setup.go (1)
335-351: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick winDon't treat store errors as “no OAuth logins.”
oauthLoggedInProviders()returns an empty set on anyNewStore/Statuserror, so a temporary backend failure makes bothsetupRequired()andfirstUsableProvider()fall back as if the user had no OAuth login. Logging that error at debug/warn would distinguish “empty store” from “store unavailable” and avoid silently re-triggering onboarding.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/cli/setup.go` around lines 335 - 351, The oauthLoggedInProviders() helper is swallowing oauth.NewStore and store.Status failures by returning an empty map, which makes setupRequired() and firstUsableProvider() misread backend outages as “no OAuth logins.” Update oauthLoggedInProviders() to log the error with context (debug or warn) before returning, using the oauth.NewStore and Status call sites as the key locations, so callers can distinguish an empty store from an unavailable one.
🧹 Nitpick comments (4)
internal/sessions/exec_session.go (1)
224-241: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDuplicate of existing rune-boundary truncation helper.
truncateUTF8reimplementscutPromptRuneBoundaryininternal/sessions/replay.go, which does the exact same byte-truncation-with-rune-backoff logic. Reuse the existing helper instead of adding a near-identical one in the same package.♻️ Proposed consolidation
- if len(text) > 500 { - return truncateUTF8(text, 500) - } - return text -} - -// truncateUTF8 returns the longest prefix of s that is at most n bytes, -// backing off to the nearest rune boundary so a multi-byte character isn't -// split — a split rune here would embed invalid UTF-8 into the exec prompt. -func truncateUTF8(s string, n int) string { - if len(s) <= n { - return s - } - for n > 0 && !utf8.RuneStart(s[n]) { - n-- - } - return s[:n] -} + if len(text) > 500 { + return cutPromptRuneBoundary(text, 500) + } + return text +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/sessions/exec_session.go` around lines 224 - 241, The new truncateUTF8 helper duplicates the existing rune-boundary truncation logic already provided by cutPromptRuneBoundary in the same package. Replace the local truncation logic in exec_session.go with a call to cutPromptRuneBoundary so there is only one implementation of byte truncation with rune backoff. Keep the behavior identical for long and short strings, and remove the redundant truncateUTF8 helper if it is no longer needed.internal/tools/shell_runtime.go (1)
80-104: 🎯 Functional Correctness | 🔵 Trivial | 💤 Low valueBackslash-escaped quotes aren't handled — narrow edge case.
A span like
echo "foo \" ls -la"treats the backslash-escaped\"as a real closing quote, sols -laafter it would be unmasked and could still be (mis)matched. This mirrors typical POSIX escaping, not cmd.exe's""-doubling convention, so real-world impact on Windows commands is low, but worth a mental note if this masking logic gets reused elsewhere.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tools/shell_runtime.go` around lines 80 - 104, stripQuotedSpans currently treats every quote byte as closing, so escaped quotes inside a quoted span can prematurely end masking. Update stripQuotedSpans in shell_runtime.go to recognize backslash-escaped quotes while inside single/double quotes and keep them masked instead of toggling quote state; preserve the existing behavior for normal quote boundaries and the current length-preserving space replacement.internal/cli/setup_fallback_test.go (1)
72-96: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winStrengthen the regression test with a before/after assertion.
The test only checks the state after seeding the OAuth token. Add an assertion beforehand (e.g.
okis false, or a different provider is chosen) to prove the token — not some unrelated path — is what makesxaiselectable; otherwise this could keep "passing" even if the OAuth-login check regresses.🧪 Proposed strengthening
providers := []config.ProviderProfile{ {Name: "xai", CatalogID: "xai", APIKeyEnv: "XAI_API_KEY"}, // no inline key/env, but logged in via OAuth } + if _, ok := firstUsableProvider(providers); ok { + t.Fatalf("want no usable provider before OAuth login is seeded") + } got, ok := firstUsableProvider(providers) if !ok || got.Name != "xai" { t.Fatalf("want OAuth-logged-in provider (xai), got %q ok=%v", got.Name, ok) }Note: move this check above
store.Save, not after, to test the pre-seed state.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/cli/setup_fallback_test.go` around lines 72 - 96, Strengthen TestFirstUsableProviderRecognizesOAuthLogin by adding a pre-seed assertion before oauth.Store.Save so the test proves the OAuth token changes the outcome. Verify firstUsableProvider with the same providers slice returns no usable provider, or a different result, before seeding the token, then keep the existing post-seed assertion that xai becomes selectable. Use the existing symbols firstUsableProvider, oauth.NewStore, and store.Save to anchor the before/after check.internal/update/replace_windows.go (1)
11-13: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winRetry window may be too short for typical AV/indexer lock durations.
10 attempts × 100ms caps the restore retry at ~1s total. Given the stated motivation (antivirus/indexer scanning holding a transient lock), real-world lock durations can plausibly exceed 1s, which would mean the retry rarely helps in the exact scenario it was built for. Consider a longer total window (e.g., more attempts or a slightly longer/backoff delay) since the cost of retrying is only paid on the already-slow error path.
Also applies to: 39-50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/update/replace_windows.go` around lines 11 - 13, The restore rename retry window in replace_windows.go is too short for transient AV/indexer locks. Update the retry constants used by the restore rename path (restoreRenameRetryAttempts and restoreRenameRetryDelay) to allow a longer total wait, ideally with a modest backoff or more attempts, and make sure the retry loop that uses these constants in the restore/rename logic covers the intended transient-lock scenario.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@internal/cli/setup.go`:
- Around line 335-351: The oauthLoggedInProviders() helper is swallowing
oauth.NewStore and store.Status failures by returning an empty map, which makes
setupRequired() and firstUsableProvider() misread backend outages as “no OAuth
logins.” Update oauthLoggedInProviders() to log the error with context (debug or
warn) before returning, using the oauth.NewStore and Status call sites as the
key locations, so callers can distinguish an empty store from an unavailable
one.
---
Nitpick comments:
In `@internal/cli/setup_fallback_test.go`:
- Around line 72-96: Strengthen TestFirstUsableProviderRecognizesOAuthLogin by
adding a pre-seed assertion before oauth.Store.Save so the test proves the OAuth
token changes the outcome. Verify firstUsableProvider with the same providers
slice returns no usable provider, or a different result, before seeding the
token, then keep the existing post-seed assertion that xai becomes selectable.
Use the existing symbols firstUsableProvider, oauth.NewStore, and store.Save to
anchor the before/after check.
In `@internal/sessions/exec_session.go`:
- Around line 224-241: The new truncateUTF8 helper duplicates the existing
rune-boundary truncation logic already provided by cutPromptRuneBoundary in the
same package. Replace the local truncation logic in exec_session.go with a call
to cutPromptRuneBoundary so there is only one implementation of byte truncation
with rune backoff. Keep the behavior identical for long and short strings, and
remove the redundant truncateUTF8 helper if it is no longer needed.
In `@internal/tools/shell_runtime.go`:
- Around line 80-104: stripQuotedSpans currently treats every quote byte as
closing, so escaped quotes inside a quoted span can prematurely end masking.
Update stripQuotedSpans in shell_runtime.go to recognize backslash-escaped
quotes while inside single/double quotes and keep them masked instead of
toggling quote state; preserve the existing behavior for normal quote boundaries
and the current length-preserving space replacement.
In `@internal/update/replace_windows.go`:
- Around line 11-13: The restore rename retry window in replace_windows.go is
too short for transient AV/indexer locks. Update the retry constants used by the
restore rename path (restoreRenameRetryAttempts and restoreRenameRetryDelay) to
allow a longer total wait, ideally with a modest backoff or more attempts, and
make sure the retry loop that uses these constants in the restore/rename logic
covers the intended transient-lock scenario.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a5013253-c52d-49b3-a102-974530d4d57d
📒 Files selected for processing (19)
internal/cli/setup.gointernal/cli/setup_fallback_test.gointernal/oauth/encrypt.gointernal/sandbox/grant_scope.gointernal/sandbox/grant_scope_test.gointernal/securefile/securefile.gointernal/sessions/exec_session.gointernal/sessions/store_test.gointernal/tools/bash_tool_test.gointernal/tools/shell_runtime.gointernal/tui/binding_test.gointernal/tui/keybindings.gointernal/tui/model.gointernal/update/apply.gointernal/update/apply_test.gointernal/update/extract.gointernal/update/extract_test.gointernal/update/replace_windows.gointernal/update/replace_windows_test.go
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Approve — all 9 audit fixes verified real and correctly fixed; two notes + a merge-order
I put this through a 9-way independent audit (one reviewer per claimed bug) plus an adversarial verify pass on each, ground-truthing every fix against the pre-fix code on main. All 9 are real defects with correct, net-improving fixes — nothing fabricated, and the skeptic pass agreed keep on all nine. Regression tests present for 8/9 (the securefile one rides the existing concurrency test). Genuinely strong work, @PierrunoYT — thanks.
Highlights I confirmed by tracing the actual code:
- grant-scope case bypass (the important one): real. Pre-fix
grantCoversused raw==/HasPrefix; on case-insensitive Windows a persisted specific-path deny could be slipped by re-casing the path (C:\proj\secret.txtvsc:\…) into a broader allow. The fix routes through the existingfilepath.Rel-basedpathWithinRoot(folds case viaEqualFoldon Windows, byte-identical on Unix) — reuses the codebase's established helper rather than inventing logic. Correct. - securefile lock: real — #445's Windows
ERROR_ACCESS_DENIEDretry was only in the oauth store, not the shared credstore backend. The fix faithfully mirrors #445 (doesn't revert it) and stops masking contention behind a misleading "file does not exist" timeout.
Two non-blocking things worth folding in:
- Reclassify two of the "critical" update fixes.
update-checksum-filenameandupdate-zip-symlinkare real code gaps and the fixes are correct, but they're defense-in-depth, not live vulns — both fail safe onmaintoday (noos.Symlinkis ever materialized,safeExtractPathalready blocks name traversal, extraction is SHA256-gated, and the checksum-filename path force-aborts viaassertSafeChecksumFileName). Worth landing — just soften the "critical/supply-chain" framing so they aren't read as fixing an exploitable hole. - The keybinding-collision fix has a coverage gap (follow-up, not a blocker).
reservedBindingsprotectsctrl+c/esc/enter/shift+tab/ctrl+f/?but omits the bare-chord hardcoded handlers —tab,backspace,up/down,pgup/pgdown. Those are config-settable and matched withMod==0, so binding e.g.toggleDetailed="tab"would still silently shadow Tab navigation (="up"kills history recall) — the exact class this fix closes, uncaught and unwarned. The fix still strictly improves onmainand covers the critical keys, so ship it; just file a follow-up to extend the reserved set.
Merge order — land this after the euxaristia shell cluster (#476 → #468 → #465). #481's shell-preflight-quotes hunk makes detection quote-aware by running the patterns against masked := stripQuotedSpans(trimmed) instead of raw trimmed. Open #476/#468 are actively reworking those very detectors in shell_runtime.go. If #481 rebases last, the masked wrapper cleanly covers whatever final detector set the cluster produces — with the explicit rule that any new detector pattern must match against masked, not trimmed, or quoted content silently gets re-flagged. #481's other 8 fixes are in disjoint files and can land anytime. (If the cluster stalls, land #481 now and make the cluster own the shell_runtime.go reconciliation with that same masked requirement.)
Approving on the merits — CI's green on all 7 checks and every fix holds up.
Summary
Fixes #480. Five parallel review passes swept different areas of the codebase (self-update, TUI keybindings, agent/session handling, sandbox/shell safety, and auth/credential storage) and surfaced nine concrete defects, all fixed here with regression tests.
grant_scopepath comparisons were case-sensitive, letting a persisted deny grant be bypassed by spelling a Windows path with different casecreateSecretFile's WindowsERROR_ACCESS_DENIEDhandling was only applied to the oauth token store, not the shared credstore backend; both also masked the real lock-contention error behind a misleading "file does not exist" timeoutfirstUsableProviderignored OAuth logins when picking a fallback provider, forcing unnecessary re-onboarding for an already-authenticated userextractZipdidn't reject symlink-mode entries likeextractTarGzalready did;replaceBinary's restore-on-failure rename had no retry, risking a permanently missing binary on a transient Windows file lockTest plan
go build ./...andgo vet ./...clean across the reposandbox,oauth,securefile,credstore,cli,sessions,tui,tools,update)tuitest failure (TestProviderWizardAdvancesProviderAPIKeyAndModelSteps) confirmed to fail identically on unmodifiedmainSummary by CodeRabbit