From 7c12c95a4f705fb26ceb86a90d1c3a2e10d3cf73 Mon Sep 17 00:00:00 2001
From: Nikolai Emil Damm
Date: Sat, 20 Jun 2026 09:25:13 +0200
Subject: [PATCH] fix(longhorn): exempt longhorn-system from generated LimitRange/quota

The add-ns-quota Kyverno policy generates a LimitRange into every
namespace that stamps a default memory limit (512Mi) onto any container
that omits one. longhorn-manager creates instance-manager pods without a
memory limit by design; the injected 512Mi cap OOM-kills them during
volume rebuilds (idle RSS already sits at ~70% of the cap).

When an instance-manager dies, longhorn-manager deletes+recreates the
pod, which faults every replica engine on that node
(DetachedUnexpectedly) and cascades into CNPG primary failover and
Postgres timeline divergence across coroot-db / umami-db / wedding-db,
wedging the infrastructure and apps Flux Kustomizations cluster-wide
(observed 2026-06-20).

Exempt longhorn-system from both the LimitRange rule (drops the
OOM-inducing limit) and the ResourceQuota rule (pods that no longer
receive the LimitRange-supplied requests would otherwise be rejected by
the requests.memory quota), treating it as platform infra like
flux-system. Longhorn's components are sized by their own VPAs/operator.

Co-Authored-By: Claude Opus 4.8
---
 .../cluster-policies/kustomization.yaml | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/k8s/bases/infrastructure/cluster-policies/kustomization.yaml b/k8s/bases/infrastructure/cluster-policies/kustomization.yaml
index ae95aecdc..4040dbb27 100644
--- a/k8s/bases/infrastructure/cluster-policies/kustomization.yaml
+++ b/k8s/bases/infrastructure/cluster-policies/kustomization.yaml
@@ -52,6 +52,20 @@ patches:
               # platform-owned namespaces; its sizing is VPA-owned, not
               # quota-bounded. The LimitRange (rule 1) still applies.
               - observability
+              # Longhorn storage data plane. longhorn-manager creates the
+              # instance-manager pods WITHOUT a memory limit by design: during
+              # a volume rebuild they burst well past the generic LimitRange
+              # default (memory 512Mi), get OOMKilled, and longhorn-manager
+              # then deletes+recreates the IM pod -- which faults EVERY replica
+              # engine on that node (DetachedUnexpectedly), cascading into
+              # CNPG primary failover and Postgres timeline divergence
+              # cluster-wide (observed 2026-06-20). It is excluded from BOTH
+              # rules: rule 1 drops the OOM-inducing limit, and this rule
+              # (ResourceQuota) must drop too -- otherwise pods that no longer
+              # receive the LimitRange-supplied requests get rejected by the
+              # requests.memory quota. Longhorn sizing is operator-/VPA-owned
+              # (manager, csi, ui carry VPAs), not quota-bounded.
+              - longhorn-system
       - op: add
         path: /spec/rules/0/generate/generateExisting
         value: true
@@ -79,6 +93,11 @@ patches:
               - kube-public
               - kube-node-lease
               - flux-system
+              # See rule 0: Longhorn instance-managers must run without the
+              # default memory limit -- an OOM during a rebuild recreates the
+              # IM pod and faults every volume on the node, taking the storage
+              # data plane (and every CNPG database on it) down.
+              - longhorn-system
       - op: add
         path: /spec/rules/1/generate/generateExisting
         value: true