Gap
k8s/bases/infrastructure/vault-backup/cronjob.yaml (CronJob vault-snapshot, nightly 03:30) does not take a snapshot: it authenticates, runs bao status, and exits (a seal-state health check). But:
docs/dr/openbao-raft-ha-migration.md step 1 says "rely on the existing vault-backup CronJob's latest snapshot" (output that doesn't exist).
The OpenBao data PVCs sit on hcloud block storage, which has no CSI snapshot support, so Velero falls back to file-system backup of a live, open raft database: crash-consistent at best, not a supported OpenBao restore path.
Rotated DB credentials and tenant-pushed app secrets are not reproducible from vault-seed (the runbook itself notes this), so a real snapshot is the only clean restore for them.
Proposal (post raft cutover, #1907)
Once the 3-node Raft cutover completes, bao operator raft snapshot save is available (consistent, online, supported restore via raft snapshot restore). Rework the CronJob to:
bao operator raft snapshot save /backup/openbao-$(date).snap against openbao-active (needs a token/policy with sys/storage/raft/snapshot read; the existing vault-snapshot k8s-auth role can be extended in vault-config),
persist it where Velero's nightly run picks it up (small PVC with N-day retention), or push directly to R2,
[...] openbao-active so it stops failing spuriously on sealed standbys).
Sequencing: blocked on the raft cutover completing (openbao-0 currently still runs the legacy file backend, where no raft snapshot API exists).
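A minimal sketch of what the reworked CronJob's script might look like for the snapshot and retention steps, to make the proposal concrete. It is not the existing job. Everything beyond the bao operator raft snapshot save command, the /backup path, and the sys/storage/raft/snapshot read capability is an assumption: the function names, the RETENTION_DAYS default of 7, the timestamp format, and the .partial suffix are hypothetical. It also assumes the bao CLI is already authenticated and pointed at openbao-active through its environment. How the policy text gets applied (for example through vault-config) and the R2 push are out of scope here.

```shell
#!/bin/sh
# Sketch only: assumes the bao CLI is on PATH, already authenticated, and
# pointed at openbao-active through its environment.
set -eu

BACKUP_DIR="${BACKUP_DIR:-/backup}"      # path taken from the proposal
RETENTION_DAYS="${RETENTION_DAYS:-7}"    # hypothetical value for "N-day retention"

# Print policy text granting the single capability the proposal names.
snapshot_policy() {
  cat <<'EOF'
path "sys/storage/raft/snapshot" {
  capabilities = ["read"]
}
EOF
}

# Build a file name with a space-free UTC timestamp (assumed format).
snapshot_path() {
  printf '%s/openbao-%s.snap\n' "$BACKUP_DIR" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Delete finished snapshots older than RETENTION_DAYS full days
# (find's "-mtime +N" semantics). Files ending in .partial are not matched.
prune_old_snapshots() {
  find "$BACKUP_DIR" -maxdepth 1 -type f -name 'openbao-*.snap' \
    -mtime +"$RETENTION_DAYS" -exec rm -f {} \;
}

# Save to a temporary name first so a failed run never leaves a file
# that looks like a complete snapshot, then rename and prune.
take_snapshot() {
  target="$(snapshot_path)"
  bao operator raft snapshot save "${target}.partial"
  mv "${target}.partial" "$target"
  prune_old_snapshots
}

# Only talk to OpenBao when explicitly asked to.
if [ "${1:-}" = "run" ]; then
  take_snapshot
fi
```

In a CronJob container this would be invoked with the run argument. Because of set -e, a failed save stops the script before the rename, so only completed snapshots carry the .snap suffix that a backup of the PVC would treat as restorable.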