fix(alertd): report unrunnable doctor checks as broken #595
Merged
A failed `podman ps` means the check could not run, so it says nothing about the system. That is a broken check, not a warning (which implies a degraded system). Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
caddy exiting non-zero or emitting unparseable output means the check couldn't determine a version, so it says nothing about the system. That's a broken check, not a skip (which implies a precondition like caddy not being installed). Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
A non-zero `df` exit means inode usage couldn't be read, so the check couldn't run and says nothing about the system: broken, not skip (skip stays for df not being on PATH). Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
When kopia exits cleanly but emits output we can't parse, the check couldn't run and says nothing about backups: broken, not skip. The non-zero-exit arms stay skip because on Linux they legitimately catch sudo/elevation denial, a privilege precondition. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
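The caddy and kopia commits apply different rules to the same kinds of tool outcome, which a small sketch can make concrete. Everything named here (`Status`, `ToolRun`, `classify_caddy`, `classify_kopia`) is hypothetical and not taken from the alertd source. The `Parsed` arm returns `Ok` only as a placeholder for whatever the real check does with a parsed value, and treating a missing kopia binary as `skip` is an assumption the commit messages do not state.

```rust
/// Hypothetical status set; only `skip` and `broken` matter for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Skip,
    Broken,
}

/// Hypothetical summary of what happened when a check ran its external tool.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToolRun {
    /// The binary is not installed or not on PATH.
    NotOnPath,
    /// The tool ran but exited with a non-zero status.
    NonZeroExit,
    /// The tool exited cleanly but its output could not be parsed.
    Unparseable,
    /// The tool exited cleanly and its output was parsed.
    Parsed,
}

/// caddy: a missing binary is a precondition (skip); any run that yields
/// no answer is a check that could not run (broken).
fn classify_caddy(run: ToolRun) -> Status {
    match run {
        ToolRun::NotOnPath => Status::Skip,
        ToolRun::NonZeroExit | ToolRun::Unparseable => Status::Broken,
        ToolRun::Parsed => Status::Ok, // placeholder for the real evaluation
    }
}

/// kopia: a non-zero exit stays skip because it can mean sudo/elevation
/// denial, a privilege precondition; only unparseable output is broken.
fn classify_kopia(run: ToolRun) -> Status {
    match run {
        // Assumption: a missing kopia binary is also a precondition skip.
        ToolRun::NotOnPath | ToolRun::NonZeroExit => Status::Skip,
        ToolRun::Unparseable => Status::Broken,
        ToolRun::Parsed => Status::Ok, // placeholder for the real evaluation
    }
}
```

The only arm that differs between the two functions is `NonZeroExit`, which is the distinction the kopia commit message calls out.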
A row that comes back but won't decode is a schema/check mismatch, not a system fault — mirror how query_error_check treats 42xxx schema errors and report broken rather than failed. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
A version() row that won't decode as text is a check mismatch, not a database fault — mirror query_error_check's handling of 42xxx schema errors and report broken rather than failed. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
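As a rough illustration of the rule the two row-decode commits describe, the sketch below maps query problems to a status. `QueryProblem` and `classify` are hypothetical names, not the real `query_error_check` signature, and mapping every SQLSTATE outside class `42` to `failed` is an assumption made to keep the example small.

```rust
/// Hypothetical status set; only the two statuses this rule chooses between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Failed,
    Broken,
}

/// Hypothetical stand-in for the problems a query-based check can hit.
enum QueryProblem<'a> {
    /// The server returned an error carrying this five-character SQLSTATE.
    Sqlstate(&'a str),
    /// A row came back but could not be decoded into the expected type.
    RowDecode,
}

fn classify(problem: &QueryProblem<'_>) -> Status {
    match problem {
        // Class 42 (the 42xxx codes): the check's query does not match the
        // schema, so the check itself is at fault.
        QueryProblem::Sqlstate(code) if code.starts_with("42") => Status::Broken,
        // Assumption: any other SQLSTATE is reported as a system fault.
        QueryProblem::Sqlstate(_) => Status::Failed,
        // A row that will not decode is a check mismatch, same as 42xxx.
        QueryProblem::RowDecode => Status::Broken,
    }
}
```

Under this sketch a decode failure and a `42xxx` error land in the same arm of the outcome, which is the "mirror" the commit messages ask for.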
Enumeration failures (who/quser ran but errored) previously collapsed into the same skip as the platform-unsupported fallback, hiding a check that couldn't run behind a precondition skip. Split CollectOutcome into Failed (broken: the tool ran but we got no answer) and Unsupported (skip: no enumeration on this platform). Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
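One way the split might look, as a minimal sketch: `Failed` and `Unsupported` come from the commit message, while the `Collected` variant, the payload types, the `Status` enum, and `status_for` are assumptions added so the example compiles on its own.

```rust
/// Hypothetical status set for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Skip,
    Broken,
}

enum CollectOutcome {
    /// Hypothetical success variant: the enumerated entries.
    Collected(Vec<String>),
    /// The tool (who/quser) ran but errored, so there is no answer.
    /// The String payload is an assumption.
    Failed(String),
    /// No enumeration is available on this platform.
    Unsupported,
}

fn status_for(outcome: &CollectOutcome) -> Status {
    match outcome {
        // Placeholder: the real check would evaluate the collected entries.
        CollectOutcome::Collected(_) => Status::Ok,
        // The check could not run, so it says nothing about the system.
        CollectOutcome::Failed(_) => Status::Broken,
        // A platform precondition is not met.
        CollectOutcome::Unsupported => Status::Skip,
    }
}
```

Keeping the two cases as separate variants means the compiler forces every caller to decide between `broken` and `skip` instead of folding both into one fallback.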
🤖 Several doctor checks reported "I couldn't actually run this check" as a `warning`, `fail`, or `skip`, which either dressed up a blind check as a degraded system or hid it behind a precondition skip. A check that couldn't run says nothing about the system under test; that's what the `broken` status is for (the same status `query_error_check` already uses for `42xxx` schema errors).

Each commit reclassifies one check:
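- `podman ps` failing was a `warning`; now `broken`.
- caddy exiting non-zero or emitting unparseable output was a `skip`; now `broken`. `skip` stays for caddy not being installed.
- `df` exiting non-zero was a `skip`; now `broken`. `skip` stays for `df` not on PATH.
- kopia exiting cleanly but emitting unparseable output was a `skip`; now `broken`. The non-zero-exit arms stay `skip` because on Linux they legitimately catch sudo/elevation denial, a privilege precondition.
- A returned row that won't decode (including a `version()` row that won't decode as text) was a `fail`; now `broken`.
- Enumeration failures (`who`/`quser` ran but errored) collapsed into the same `skip` as the platform-unsupported fallback. `CollectOutcome` is split into `Failed` (`broken`) and `Unsupported` (`skip`).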