What happened:
After upgrading external-dns (from a very old version) with the azure-private-dns provider and --source=pod, we observed a large number of Private DNS zones ending up with stale CNAME records with empty targets, which then caused a permanent error loop once the same pods were later assigned a PodIP.
When a pod carries the external-dns.alpha.kubernetes.io/internal-hostname annotation but is still in Pending state (not yet scheduled, no PodIP assigned), external-dns creates a CNAME record with an empty target ("") instead of skipping the pod.
In source/pod.go, addInternalHostnameAnnotationEndpoints calls endpoint.SuitableType(pod.Status.PodIP) without first checking whether PodIP is empty. Since "" is not a valid IP address, SuitableType returns CNAME, and an endpoint with an empty CNAME target is created.
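The failure mode can be illustrated with a minimal, self-contained sketch. This assumes SuitableType dispatches on net.ParseIP (A for IPv4, AAAA for IPv6, CNAME for everything else), which matches the observed behavior; the function below is a stand-in, not the actual external-dns code:

```go
package main

import (
	"fmt"
	"net"
)

// suitableType mirrors the assumed logic of endpoint.SuitableType:
// strings that parse as IPv4 map to A, IPv6 to AAAA, and everything
// else (including the empty string) falls through to CNAME.
func suitableType(target string) string {
	if ip := net.ParseIP(target); ip != nil {
		if ip.To4() != nil {
			return "A"
		}
		return "AAAA"
	}
	return "CNAME"
}

func main() {
	fmt.Println(suitableType("10.0.0.10")) // A
	fmt.Println(suitableType(""))          // CNAME — the empty-PodIP case
}
```

Because "" is not a valid IP, the empty PodIP of a Pending pod silently becomes a CNAME endpoint with an empty target.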
Once the pod gets scheduled and receives an IP, external-dns tries to create an A record at the same DNS name. The provider rejects this because A and CNAME records cannot coexist at the same name — Azure returns 409 Conflict with code CannotCreateRecordDueToCNameNamingRestriction. The error repeats on every reconciliation cycle (every 3 minutes) and the A record is never created until the stale CNAME record is manually deleted from the DNS zone.
The same bug exists in two other code paths in source/pod.go that also pass pod.Status.PodIP to SuitableType without an empty-IP guard:
- addKopsDNSControllerEndpoints (kops-dns-controller compatibility mode)
- addPodSourceDomainEndpoints (--pod-source-domain)
What you expected to happen:
Pods with an empty PodIP should be skipped when generating endpoints, consistent with the existing guard in hostsFromTemplate (source/pod.go, lines 261-264):
if address.IP == "" {
    log.Debugf("skipping pod %q. PodIP is empty with phase %q", pod.Name, pod.Status.Phase)
    continue
}
No CNAME record with an empty target should ever be created.
DNS records — actual vs. expected:
For a pod foo with annotation external-dns.alpha.kubernetes.io/internal-hostname: foo.example.internal:
Actual (buggy) behavior: while the pod is Pending, external-dns writes:
foo.example.internal. CNAME ""
cname-foo.example.internal. TXT "heritage=external-dns,external-dns/owner=example-cluster,external-dns/resource=pod/..."
Once the pod reaches Running with PodIP 10.0.0.10, external-dns tries to add foo.example.internal. A 10.0.0.10 but the empty-target CNAME at the same name blocks it, and the A record is never created.
Expected behavior — nothing is written while the pod is Pending. Once the pod reaches Running with PodIP 10.0.0.10:
foo.example.internal. A 10.0.0.10
a-foo.example.internal. TXT "heritage=external-dns,external-dns/owner=example-cluster,external-dns/resource=pod/..."
No CNAME record is ever created.
How to reproduce it (as minimally and precisely as possible):
- Configure external-dns with --source=pod and an Azure Private DNS provider
- Create a pod with the annotation external-dns.alpha.kubernetes.io/internal-hostname: foo.example.com
- Ensure the pod stays in Pending state (e.g. unschedulable due to resource requests or a node selector)
- Observe external-dns creating a CNAME record with an empty target for foo.example.com
- Let the pod become Running (normal scheduling once resources free up)
- Observe the permanent 409 Conflict error loop on every reconciliation cycle
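For reference, a minimal pod manifest that stays Pending; the nodeSelector value is a placeholder assumed to match no node in the cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: foo
  annotations:
    external-dns.alpha.kubernetes.io/internal-hostname: foo.example.com
spec:
  nodeSelector:
    kubernetes.io/hostname: does-not-exist  # keeps the pod Pending
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
```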
We see occurrences of this issue on ephemeral environments where pods with the internal-hostname annotation can spawn but stay in Pending for a few minutes while new nodes are created.
Another way to reproduce is via tests:
Adding the following table entries to TestPodSource in source/pod_test.go (each asserts that an empty endpoint list is produced) reproduces the bug on master — all three fail with expected 0 endpoints, got 1:
{
"pending pod with empty PodIP and internal-hostname annotation should not create CNAME",
"",
"",
false,
"",
[]*endpoint.Endpoint{},
false,
nil,
[]*corev1.Pod{
{
ObjectMeta: metav1.ObjectMeta{
Name: "pending-pod",
Namespace: "kube-system",
Annotations: map[string]string{
annotations.InternalHostnameKey: "foo.example.com",
},
},
Spec: corev1.PodSpec{
HostNetwork: false,
},
Status: corev1.PodStatus{
Phase: corev1.PodPending,
PodIP: "",
},
},
},
},
{
"pending pod with empty PodIP and pod-source-domain should not create CNAME",
"",
"",
false,
"example.org",
[]*endpoint.Endpoint{},
false,
nil,
[]*corev1.Pod{
{
ObjectMeta: metav1.ObjectMeta{
Name: "pending-pod",
Namespace: "kube-system",
},
Spec: corev1.PodSpec{HostNetwork: false},
Status: corev1.PodStatus{
Phase: corev1.PodPending,
PodIP: "",
},
},
},
},
{
"pending pod with empty PodIP and kops-dns-controller annotation should not create CNAME",
"",
"kops-dns-controller",
false,
"",
[]*endpoint.Endpoint{},
false,
nil,
[]*corev1.Pod{
{
ObjectMeta: metav1.ObjectMeta{
Name: "pending-pod",
Namespace: "kube-system",
Annotations: map[string]string{
kopsDNSControllerInternalHostnameAnnotationKey: "foo.example.com",
},
},
Spec: corev1.PodSpec{HostNetwork: false},
Status: corev1.PodStatus{
Phase: corev1.PodPending,
PodIP: "",
},
},
},
},
Run with:
go test ./source/ -run "TestPodSource/pending_pod" -v
Output on current master:
--- FAIL: TestPodSource (0.34s)
--- FAIL: TestPodSource/pending_pod_with_empty_PodIP_and_internal-hostname_annotation_should_not_create_CNAME (0.13s)
pod_test.go:861: expected 0 endpoints, got 1
--- FAIL: TestPodSource/pending_pod_with_empty_PodIP_and_pod-source-domain_should_not_create_CNAME (0.10s)
pod_test.go:861: expected 0 endpoints, got 1
--- FAIL: TestPodSource/pending_pod_with_empty_PodIP_and_kops-dns-controller_annotation_should_not_create_CNAME (0.10s)
pod_test.go:861: expected 0 endpoints, got 1
Each "got 1" endpoint is a CNAME with an empty target, confirming the bug.
External-dns deployment (live object from the API server, trimmed for brevity):
apiVersion: v1
kind: Pod
metadata:
name: external-dns-xxxxxxxxxx-xxxxx
namespace: external-dns
spec:
containers:
- args:
- --log-level=info
- --log-format=json
- --interval=3m
- --source=service
- --source=pod
- --policy=sync
- --registry=txt
- --txt-owner-id=example-cluster
- --domain-filter=example.internal
- --managed-record-types=A
- --provider=azure-private-dns
image: registry.k8s.io/external-dns/external-dns:v0.20.0 # note that we are using latest version available through chart, but bug still present in latest version
name: external-dns
status:
phase: Running
podIP: 10.0.0.1
Logs (excerpt):
{"level":"info","msg":"Updating CNAME record named 'foo-1.primary' to '' for Azure Private DNS zone 'example.internal'."}
{"level":"info","msg":"Updating TXT record named 'cname-foo-1.primary' to '\"heritage=external-dns,...\"' for Azure Private DNS zone 'example.internal'."}
{"level":"info","msg":"Updating A record named 'foo-1.primary' to '10.0.0.10' for Azure Private DNS zone 'example.internal'.","time":"2026-04-16T16:42:39Z"}
{"level":"error","msg":"Failed to update A record named 'foo-1.primary' to '10.0.0.10' for Azure Private DNS zone 'example.internal': PUT https://management.azure.com/.../privateDnsZones/example.internal/A/foo-1.primary\nRESPONSE 409: 409 Conflict\nERROR CODE: Conflict\n{\n \"code\": \"Conflict\",\n \"message\": \"The record could not be created because a CNAME record with the same name already exists in this zone.\",\n \"details\": [\n {\n \"code\": \"CannotCreateRecordDueToCNameNamingRestriction\",\n \"message\": \"The record could not be created because a CNAME record with the same name already exists in this zone.\"\n }\n ]\n}\n","time":"2026-04-16T16:42:39Z"}
The same Updating A / Failed to update A record pattern then repeats for ~150 distinct record names on every reconciliation cycle.
Note: --managed-record-types=A is our current workaround; it prevents external-dns from managing CNAMEs at all, so no new empty-target CNAMEs get created. The stale CNAMEs created before the workaround still exist in the zone and continue to conflict with A record creation until they are manually deleted.
Anything else we need to know?:
Possible fix: add the same empty-PodIP guard that hostsFromTemplate already uses, before each of the three SuitableType(pod.Status.PodIP) call sites in source/pod.go:
if pod.Status.PodIP == "" {
    log.Debugf("skipping pod %q: PodIP is empty with phase %q", pod.Name, pod.Status.Phase)
    continue // or return, depending on the enclosing control flow
}
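To make the intent concrete, here is a self-contained sketch of the proposed guard. The types and the simplified suitableType below are stand-ins for the real Kubernetes and external-dns types, not the actual code in source/pod.go:

```go
package main

import (
	"fmt"
	"net"
)

// Minimal stand-ins for the Kubernetes pod fields involved.
type podStatus struct {
	Phase string
	PodIP string
}
type pod struct {
	Name   string
	Status podStatus
}

// suitableType is the assumed behavior of endpoint.SuitableType:
// valid IPs become A records, everything else falls back to CNAME.
func suitableType(target string) string {
	if net.ParseIP(target) != nil {
		return "A"
	}
	return "CNAME"
}

// endpointsFor sketches the proposed fix: pods with an empty PodIP are
// skipped before suitableType is ever consulted, mirroring the existing
// guard in hostsFromTemplate.
func endpointsFor(pods []pod, hostname string) []string {
	var eps []string
	for _, p := range pods {
		if p.Status.PodIP == "" {
			fmt.Printf("skipping pod %q: PodIP is empty with phase %q\n", p.Name, p.Status.Phase)
			continue
		}
		eps = append(eps, fmt.Sprintf("%s %s %s", hostname, suitableType(p.Status.PodIP), p.Status.PodIP))
	}
	return eps
}

func main() {
	pods := []pod{
		{Name: "pending-pod", Status: podStatus{Phase: "Pending", PodIP: ""}},
		{Name: "running-pod", Status: podStatus{Phase: "Running", PodIP: "10.0.0.10"}},
	}
	for _, e := range endpointsFor(pods, "foo.example.internal.") {
		fmt.Println(e)
	}
}
```

With the guard in place, the Pending pod produces no endpoint at all, so no empty-target CNAME ever reaches the provider.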
I have a PR adding this check and am waiting a bit to confirm the issue is valid.
Current workaround: --managed-record-types=A stops external-dns from creating or updating CNAME records, which prevents new empty-target CNAMEs from being created. Stale CNAMEs from before the workaround must still be manually deleted from the zone.
Also, full disclosure: I used AI to help confirm this bug and draft this issue. I did not find a specific guideline forbidding it, and I hope I have injected enough of my human brain that it is intelligible.
See also: #5277 (thematically related — SuitableType returning CNAME inappropriately).
Environment:
- External-DNS version (use external-dns --version): v0.20.0 (a quick read of the code suggests the bug is still present in the latest version)
- DNS provider: azure-private-dns
- Kubernetes: AKS 1.33
- Source: pod
- Scale: affects every long-Pending pod; in our case 150+ pods, but it depends on the environment
Checklist