Cilium 1.20: evaluate datapath performance knobs alongside the upgrade (bandwidthManager+BBR, bpf.masquerade; netkit later) #2029

@devantler

🤖 Generated by the Daily AI Assistant

Context: why these are deferred to the 1.20 upgrade

A 2026-06-11 performance survey of the platform concluded that the Cilium datapath is not the measured bottleneck today: CPU throttling is ~0% across workloads, and user-facing slowness traced to scheduling/capacity (autoscaler thrash, memory-bound control planes), not packet processing. Meanwhile the cilium HelmRelease sets rollOutCiliumPods: true, so every values change rolls every agent, on a cluster whose worst incidents were all datapath regressions (nodeEncryption black-hole → #1975, strict-mode drops → #1944, SPIRE wedges → #1809/#1818, firewall 4250 → #1859).

The Cilium 1.20 upgrade is already planned (it is the gate for replacing auth-proxy with the Gateway API ExternalAuth filter, GEP-1494 / cilium/cilium#45739). That upgrade rolls all agents anyway, which makes it the right moment to pay the roll cost for datapath knobs once instead of per change.

Current baseline (k8s/bases/infrastructure/controllers/cilium/helm-release.yaml + hetzner patch): tunnel routing (vxlan), kubeProxyReplacement: true, WireGuard pod-to-pod with strict egress mode (nodeEncryption: false, deliberate), SPIRE mutual auth enforced, Hubble + relay (2) + UI (KEDA 0/1), Gateway API + ALPN, agents request-only (no CPU limits, per Cilium guidance), monitor aggregation at chart default.

Recommended: adopt with/after 1.20, one knob per PR

1. bandwidthManager: { enabled: true, bbr: true }

  • Cilium's own performance-tuning recommendation: fq/EDT pacing + BBR congestion control improves tail latency and throughput fairness on egress (gateway responses toward Cloudflare).
  • Kernel requirement (≥ 5.18 for BBR) comfortably met by the 6.x Linux kernel that Talos ships.
  • Caveat: replaces the node qdisc, so soak in local/CI with WireGuard enabled before prod.
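
As a rough sketch of what this knob's PR could contain, the fragment below shows only the two Helm values named above. It assumes they are merged into the cilium HelmRelease values (for a Flux HelmRelease that would be under `spec.values`); the exact placement in helm-release.yaml is an assumption, not something confirmed here.

```yaml
# Sketch only: Cilium Helm values for knob 1.
# Assumed to be merged into the existing cilium HelmRelease values
# (placement under spec.values is an assumption about the file layout).
bandwidthManager:
  enabled: true   # fq/EDT pacing on egress; replaces the node qdisc
  bbr: true       # BBR congestion control; needs kernel >= 5.18
```

Keeping the PR to these lines alone makes the resulting agent roll attributable to this knob if drops appear afterwards.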

2. bpf.masquerade: true

  • Moves pod→external SNAT from iptables to eBPF, removing per-packet iptables traversal on all egress (Cloudflare-bound traffic, registry pulls, webhooks).
  • Requires kube-proxy replacement (already on).
  • Caveat: verify the WireGuard interplay in local/CI first; changes the NAT path for all egress.
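
A minimal sketch of the corresponding values change follows, under the same assumption about where the cilium values live. `kubeProxyReplacement: true` is repeated only to show the prerequisite that the baseline already sets; it would not be a new line in the PR.

```yaml
# Sketch only: Cilium Helm values for knob 2.
kubeProxyReplacement: true   # already set in the baseline; shown as the prerequisite
bpf:
  masquerade: true           # move pod-to-external SNAT from iptables to eBPF
```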

Rollout discipline for both: one knob per PR; local/CI soak first; low-traffic window; after each node rolls, check cilium status --verbose and watch Hubble for unexpected drops. rollOutCiliumPods: true means each PR is a full one-node-at-a-time agent roll.

Revisit later (not at 1.20)

  • netkit device mode: the largest upstream datapath win (replaces veth), but still bleeding-edge for this cluster's risk tolerance; re-evaluate once 1.20.x has matured and after the two knobs above have soaked.

Evaluated and rejected (rationale recorded so it isn't re-litigated)

| Knob | Verdict | Why |
| --- | --- | --- |
| routingMode: native / autoDirectNodeRoutes | ❌ | Needs L2 adjacency or route programming that Hetzner's private network doesn't naturally provide; high effort/risk, modest gain at this scale. |
| DSR | ❌ | Requires switching the tunnel to Geneve, and most user traffic is L7-proxied through Envoy/Gateway anyway, so DSR's benefit barely applies. |
| CiliumEndpointSlice | ⏸️ | Cuts apiserver watch load (fits the memory-bound CPs), but the benefit scales with endpoint count (small at a few hundred pods), and it changes ipcache propagation timing on a cluster sensitized to exactly that. |
| Disabling Hubble | ❌ | Would free the 2 relay pods + agent flow processing, but Hubble is the only visibility into Cilium policy verdicts/drops (Coroot cannot see CNI drops), and drop-debugging is this cluster's recurring incident mode. |
| Monitor aggregation tuning | ✅ already fine | Chart default (medium) is the sane setting. |
| Agent resource changes | ✅ already done | Requests right-sized in #1713; limits deliberately absent per Cilium guidance. |

🤖 Generated with Claude Code
