feat(backups): surface repo maintenance + alert on failed runs #285

Merged: passcod merged 3 commits into main from backup-maintenance-surfacing on Jun 26, 2026

passcod (Member) commented Jun 26, 2026

🤖 The group backup page tracked maintenance runs in the DB and returned them from the stats endpoint as recent_maintenance, but never rendered them — and nothing alerted when a maintenance run failed (only the absence-of-success staleness check existed).

This adds both halves.

UI — "Repo maintenance" panel

A new panel on the group backup page with:

  • An at-a-glance health summary: Healthy / Last run failed / Running, plus when maintenance last completed successfully (or that it never has).
  • A table of recent kopia maintenance cycles (started, kind, outcome, finished, reclaimed bytes), mirroring the recent-runs panel. Failed runs expand to their error.

This directly answers "has maintenance/expiry run, and did it succeed?", which previously had no UI indication at all.
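As a rough illustration of how the summary chip could be derived from `recent_maintenance`, here is a minimal sketch. The record shape (`startedAt`, `finishedAt`, `outcome`), the `summarize` helper, and the "No runs yet" state are assumptions made for this example, not the names used in the PR. It also assumes an in-flight run takes precedence over an earlier failure in the chip, which the description does not specify.

```typescript
// Hypothetical shape of one entry in `recent_maintenance`; real field names may differ.
interface MaintenanceRun {
  startedAt: Date;
  finishedAt: Date | null; // null while the run is still in flight
  outcome: "success" | "failure" | null; // null while the run is still in flight
}

type Health = "Healthy" | "Last run failed" | "Running" | "No runs yet";

interface Summary {
  health: Health;
  lastSuccess: Date | null; // null means maintenance has never completed successfully
}

function summarize(runs: MaintenanceRun[]): Summary {
  // Newest first by start time, without mutating the caller's array.
  const sorted = [...runs].sort(
    (a, b) => b.startedAt.getTime() - a.startedAt.getTime(),
  );
  const lastSuccess =
    sorted.find((r) => r.outcome === "success")?.finishedAt ?? null;

  let health: Health;
  if (sorted.length === 0) {
    health = "No runs yet"; // assumption: the PR does not describe this case
  } else if (sorted[0].outcome === null) {
    health = "Running";
  } else if (sorted[0].outcome === "failure") {
    health = "Last run failed";
  } else {
    health = "Healthy";
  }
  return { health, lastSuccess };
}
```

Keeping `lastSuccess` separate from `health` lets the panel show "Last run failed" while still reporting when maintenance last completed successfully.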

Alerting — backup-maintenance-error

A new group-scoped issue raised when the most recently finished maintenance run failed, cleared by a later successful run. Error severity, so it opens an incident and pages — consistent with the existing backup-maintenance-stale check. In-flight runs (no outcome yet) are ignored, so a started-but-unfinished run never clears an open failure.
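The raise/clear rule above can be sketched as a pure function over the run history. This is an illustration under assumptions: the `MaintenanceRun` shape and the name `maintenanceErrorActive` are hypothetical, and the real check may be implemented quite differently (for example as a database query).

```typescript
// Hypothetical run shape; the real records may use different field names.
interface MaintenanceRun {
  finishedAt: Date | null; // null while in flight
  outcome: "success" | "failure" | null; // null while in flight
}

// True when the most recently *finished* run failed.
function maintenanceErrorActive(runs: MaintenanceRun[]): boolean {
  let latestTime = -Infinity;
  let latestOutcome: "success" | "failure" | null = null;
  for (const run of runs) {
    // In-flight runs have no outcome yet, so they can neither raise nor clear.
    if (run.outcome === null || run.finishedAt === null) continue;
    const t = run.finishedAt.getTime();
    if (t > latestTime) {
      latestTime = t;
      latestOutcome = run.outcome;
    }
  }
  return latestOutcome === "failure";
}
```

Because only finished runs are considered, a failure followed by a still-running cycle keeps the issue open, and only a later successful run clears it.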

This complements backup-maintenance-stale (no successful maintenance for 8 days), which catches a different failure mode: maintenance not running at all vs. running but erroring every time.
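To make the difference between the two checks concrete, here is a minimal sketch of an absence-of-success staleness test with the 8-day threshold. The `MaintenanceRun` shape and `maintenanceStale` are hypothetical names for illustration, and treating "never succeeded" as stale is an assumption rather than something the PR states.

```typescript
// Hypothetical run shape; the real records may use different field names.
interface MaintenanceRun {
  finishedAt: Date | null;
  outcome: "success" | "failure" | null;
}

// 8 days, the threshold stated for backup-maintenance-stale.
const STALE_AFTER_MS = 8 * 24 * 60 * 60 * 1000;

function maintenanceStale(runs: MaintenanceRun[], now: Date): boolean {
  let lastSuccess: number | null = null;
  for (const run of runs) {
    if (run.outcome !== "success" || run.finishedAt === null) continue;
    const t = run.finishedAt.getTime();
    if (lastSuccess === null || t > lastSuccess) lastSuccess = t;
  }
  // Assumption: a repo that has never had a successful run counts as stale.
  if (lastSuccess === null) return true;
  return now.getTime() - lastSuccess > STALE_AFTER_MS;
}

// Last success 3 days ago, then two failed runs: not stale yet,
// so only a failure-based check would flag this repository.
const now = new Date("2026-06-26T00:00:00Z");
const failingDaily: MaintenanceRun[] = [
  { finishedAt: new Date("2026-06-23T00:00:00Z"), outcome: "success" },
  { finishedAt: new Date("2026-06-24T00:00:00Z"), outcome: "failure" },
  { finishedAt: new Date("2026-06-25T00:00:00Z"), outcome: "failure" },
];
console.log(maintenanceStale(failingDaily, now)); // false
```

In this scenario the staleness check would stay quiet for several more days, which is the gap a failure-based issue closes.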

Per the discussion, the existing 8-day staleness threshold is left untouched.

Linear: TAM-6877

passcod and others added 3 commits June 27, 2026 05:55
Adds a "Repo maintenance" panel to the group backup page: an
at-a-glance health summary (last successful maintenance / failed /
running) plus a table of recent kopia maintenance cycles, mirroring the
recent-runs panel. recent_maintenance was already returned by the stats
endpoint but never rendered.

Adds a group-level backup-maintenance-error incident (Error, paging)
raised when the most recently *finished* maintenance run failed, cleared
by a later successful run. This complements the existing
backup-maintenance-stale absence-of-success check, which catches a
different failure mode (maintenance not running at all).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

getByText("success"/"running") also matched the "Last successful
maintenance" caption / the "Running" summary chip. Use exact matches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Snapshot expiry (kopia snapshot expire --delete) runs on every maintenance
cycle, not just full; full additionally reclaims the freed space.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

passcod enabled auto-merge June 26, 2026 18:44
passcod added this pull request to the merge queue Jun 26, 2026
Merged via the queue into main with commit 2a0dc79 Jun 26, 2026
7 checks passed
passcod deleted the backup-maintenance-surfacing branch June 26, 2026 18:58