docs/book/src/examples/cni-readiness.md
+36 -37 (36 additions & 37 deletions)
@@ -9,7 +9,7 @@ This guide demonstrates how to use the Node Readiness Controller to prevent pods

 The high-level steps are:
 1. Node is bootstrapped with a [startup taint](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) `readiness.k8s.io/NetworkReady=pending:NoSchedule` immediately upon joining.
-2. A sidecar is patched to the cni-agent to monitor the CNI's health and report it to the API server as node-condition (`network.k8s.io/CalicoReady`).
+2. A reporter DaemonSet is deployed to monitor the CNI's health and report it to the API server as node-condition (`projectcalico.org/CalicoReady`).
 3. Node Readiness Controller will untaint the node only when the CNI reports it is ready.

 ## Step-by-Step Guide
@@ -20,43 +20,42 @@ This example uses **Calico**, but the pattern applies to any CNI.

 ### 1. Deploy the Readiness Condition Reporter

-We need to bridge Calico's internal health status to a Kubernetes Node Condition. We will add a **sidecar container** to the Calico DaemonSet.
+We need to bridge Calico's internal health status to a Kubernetes Node Condition. We will deploy a **reporter DaemonSet** that runs on every node.

-This sidecar checks Calico's local health endpoint (`http://localhost:9099/readiness`) and updates a node condition `network.k8s.io/CalicoReady`.
+This reporter checks Calico's local health endpoint (`http://localhost:9099/readiness`) and updates a node condition `projectcalico.org/CalicoReady`.

-**Patch your Calico DaemonSet:**
+Using a separate DaemonSet instead of a sidecar ensures that readiness reporting works even if the CNI pod is crashlooping or failing to start containers.
-> Note: In this example, the CNI pod health is monitored by a side-car, so watcher's lifecycle is same as the pod lifecycle.
-If the Calico pod is crashlooping, the sidecar will not run and cannot report readiness. For robust 'continuous' readiness reporting, the watcher should be 'external' to the pod.
-
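The reporter's job described above (probe the local health endpoint, then publish the result as a node condition) can be sketched roughly as follows. This is a minimal illustration, not the reporter shipped with this change: the function names, the `reason` and `message` strings, and the two-second timeout are all hypothetical. Only the endpoint URL and the condition type come from the guide.

```python
import json
import urllib.request
from datetime import datetime, timezone

HEALTH_URL = "http://localhost:9099/readiness"    # endpoint named in the guide
CONDITION_TYPE = "projectcalico.org/CalicoReady"  # condition named in the guide


def probe(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return True only if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection failures and non-2xx answers (raised as HTTPError,
        # a subclass of OSError) are both treated as "not ready".
        return False


def build_condition(healthy: bool, now: datetime) -> dict:
    """Build a node condition entry reflecting the probe result."""
    ts = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "type": CONDITION_TYPE,
        "status": "True" if healthy else "False",
        # Hypothetical reason/message values, chosen for illustration only.
        "reason": "CalicoHealthy" if healthy else "CalicoUnhealthy",
        "message": "readiness probe " + ("succeeded" if healthy else "failed"),
        "lastHeartbeatTime": ts,
    }


def build_status_patch(healthy: bool, now: datetime) -> str:
    """Serialize a patch body that carries the single condition."""
    return json.dumps({"status": {"conditions": [build_condition(healthy, now)]}})
```

A real reporter would presumably call `probe()` in a loop and send the resulting body as a patch to the node's `status` subresource, which is why the next step grants permission to update node status. That API call is omitted here.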
 ### 2. Grant Permissions (RBAC)

-The sidecar needs permission to update the Node object's status.
+The reporter needs permission to update the Node object's status.

 ```yaml
 # calico-rbac-node-status-patch-role.yaml
@@ -78,15 +77,15 @@ roleRef:
   kind: ClusterRole
   name: node-status-patch-role
 subjects:
-# Bind to CNI's ServiceAccount
+# Bind to CNI Reporter's ServiceAccount
 - kind: ServiceAccount
-  name: calico-node
+  name: cni-reporter
   namespace: kube-system
 ```

 ### 3. Create the Node Readiness Rule

-Now define the rule that enforces the requirement. This tells the controller: *"Keep the `readiness.k8s.io/NetworkReady` taint on the node until `network.k8s.io/CalicoReady` is True."*
+Now define the rule that enforces the requirement. This tells the controller: *"Keep the `readiness.k8s.io/NetworkReady` taint on the node until `projectcalico.org/CalicoReady` is True."*
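
Before the manifest itself, the decision quoted above can be modeled in a few lines. This is only an illustrative model of the rule's intent, assuming conditions are compared by exact `type` and `status` strings; it is not the controller's actual implementation, and the function names are hypothetical.

```python
from typing import Iterable, Mapping

# Mirrors the requirement stated in the guide: the condition must be "True".
REQUIRED = [{"type": "projectcalico.org/CalicoReady", "requiredStatus": "True"}]


def requirements_met(
    node_conditions: Iterable[Mapping[str, str]],
    required: Iterable[Mapping[str, str]] = REQUIRED,
) -> bool:
    """True when every required condition is present with the required status."""
    observed = {c["type"]: c["status"] for c in node_conditions}
    return all(observed.get(r["type"]) == r["requiredStatus"] for r in required)


def taint_should_remain(
    node_conditions: Iterable[Mapping[str, str]],
    required: Iterable[Mapping[str, str]] = REQUIRED,
) -> bool:
    """The taint stays until all requirements are met."""
    return not requirements_met(node_conditions, required)
```

In this model a node with no reported condition is treated the same as one reporting `False`: the taint stays in place in both cases.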

 ```yaml
 # network-readiness-rule.yaml
@@ -97,7 +96,7 @@ metadata:
 spec:
   # The condition(s) to monitor
   conditions:
-    - type: "network.k8s.io/CalicoReady"
+    - type: "projectcalico.org/CalicoReady"
       requiredStatus: "True"

   # The taint to manage
@@ -139,8 +138,8 @@ To test this, add a new node to the cluster.
+This example demonstrates how to use the Node Readiness Controller to ensure nodes are only marked ready for workloads after the CNI (Calico) has fully initialized.
+
+### How it works:
+1. Nodes join with a `readiness.k8s.io/NetworkReady=pending:NoSchedule` taint.
+2. A lightweight DaemonSet (`cni-reporter-ds.yaml`)
+monitors Calico's health endpoint (`localhost:9099/readiness`) and updates a
+node condition `projectcalico.org/CalicoReady`.
+3. The `NodeReadinessRule` (`network-readiness-rule.yaml`) instructs the controller to remove the startup taint once the `projectcalico.org/CalicoReady` condition becomes `True`.
+4. The reporter is deployed with `hostNetwork: true` to reach Calico's local health endpoint.
+5. The reporter needs a dedicated ServiceAccount (`cni-reporter`) with permissions to patch node status.