How Do You Debug DNS Issues in Kubernetes?

Level: intermediate · Tags: dns, devops, sre, backend developer, CKA, CKAD
TL;DR

Debug Kubernetes DNS by checking CoreDNS Pod health, verifying resolv.conf configuration, testing lookups from within Pods using nslookup or dig, inspecting CoreDNS logs, and validating that the kube-dns Service and endpoints exist. Common issues include CoreDNS crashes, misconfigured network policies blocking DNS, and ndots settings causing slow lookups.

Detailed Answer

DNS issues are among the most common problems in Kubernetes clusters. A systematic debugging approach can resolve most issues quickly.

Step 1: Check CoreDNS Health

# Are CoreDNS Pods running?
kubectl get pods -n kube-system -l k8s-app=kube-dns
# NAME                       READY   STATUS    RESTARTS   AGE
# coredns-5d78c9869d-abc12   1/1     Running   0          5d
# coredns-5d78c9869d-def34   1/1     Running   0          5d

# Check for crashes or restarts
kubectl describe pod -n kube-system -l k8s-app=kube-dns

# View CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

Common CoreDNS issues:

  • CrashLoopBackOff: Usually a Corefile syntax error
  • OOMKilled: Increase memory limits
  • High restart count: Check for forwarding loops

Step 2: Verify the kube-dns Service

# Check the Service exists and has endpoints
kubectl get service kube-dns -n kube-system
# NAME       TYPE        CLUSTER-IP   PORT(S)
# kube-dns   ClusterIP   10.96.0.10   53/UDP,53/TCP,9153/TCP

kubectl get endpoints kube-dns -n kube-system
# NAME       ENDPOINTS
# kube-dns   10.244.0.5:53,10.244.0.6:53  ← Must have endpoints

If endpoints are empty, CoreDNS Pods are not Running/Ready.

Step 3: Test DNS from a Pod

# Launch a debug Pod
kubectl run dns-debug --rm -it --image=busybox:1.36 -- sh

# Test cluster DNS
nslookup kubernetes.default.svc.cluster.local
# Server:    10.96.0.10
# Address:   10.96.0.10:53
# Name:      kubernetes.default.svc.cluster.local
# Address:   10.96.0.1

# Test a Service in your namespace
nslookup api-service.default.svc.cluster.local

# Test external DNS
nslookup google.com

# Use dig for more detail
kubectl run dns-debug --rm -it --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 -- dig api-service.default.svc.cluster.local

Step 4: Check Pod resolv.conf

kubectl exec my-pod -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5

Verify:

  • nameserver matches the kube-dns Service ClusterIP
  • search includes your namespace
  • ndots is set (default 5)
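A quick way to check these is to compare the Pod's nameserver against the kube-dns ClusterIP directly (a sketch — my-pod is a placeholder for your Pod's name):

```shell
# Nameserver the Pod actually uses
kubectl exec my-pod -- awk '/^nameserver/ {print $2}' /etc/resolv.conf

# ClusterIP it should match (with dnsPolicy: ClusterFirst)
kubectl get service kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'

# A Pod with dnsPolicy: Default uses the node's resolv.conf
# and bypasses cluster DNS entirely
kubectl get pod my-pod -o jsonpath='{.spec.dnsPolicy}'
```

If the two IPs differ, or dnsPolicy is not ClusterFirst, the Pod is not using cluster DNS at all.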

Common Issues and Solutions

Issue: CoreDNS Forwarding Loop

Symptom: CoreDNS repeatedly crashes with "Loop detected" in logs.

Cause: CoreDNS forwards to itself via the node's resolv.conf.

Fix: Configure forward to use explicit upstream DNS:

forward . 8.8.8.8 8.8.4.4 {
    max_concurrent 1000
}
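The forward block lives in the Corefile. One way to apply the change, assuming the standard coredns ConfigMap and Deployment names:

```shell
# Edit the Corefile: replace `forward . /etc/resolv.conf`
# with explicit upstreams such as `forward . 8.8.8.8 8.8.4.4`
kubectl edit configmap coredns -n kube-system

# CoreDNS picks up Corefile changes automatically if the `reload`
# plugin is enabled; otherwise restart the Deployment
kubectl rollout restart deployment coredns -n kube-system
```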

Issue: Network Policy Blocking DNS

Symptom: Pods cannot resolve any DNS names after applying network policies.

Fix: Allow egress to CoreDNS:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
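Note that the namespaceSelector above relies on the kubernetes.io/metadata.name label, which is set automatically on namespaces in Kubernetes 1.21+. After applying the policy, confirm resolution works again from an affected namespace (sketch):

```shell
# Should succeed once egress to kube-system on port 53 is allowed
kubectl run np-test --rm -it --image=busybox:1.36 -- \
  nslookup kubernetes.default.svc.cluster.local
```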

Issue: Slow External DNS Resolution

Symptom: External names like api.stripe.com take several seconds to resolve.

Cause: ndots:5 causes multiple failed lookups with search domains before the real lookup.

Fix: Use FQDNs with trailing dots or reduce ndots:

spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
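The expansion behavior can be simulated in plain shell to see exactly why the extra lookups happen. This is illustrative only — the search domains shown assume a Pod in the default namespace, as in the resolv.conf example above:

```shell
# Simulate the search-domain expansion the resolver performs under ndots:5.
# "api.stripe.com" has 2 dots (< 5), so every search domain is tried first.
name="api.stripe.com"
search_domains="default.svc.cluster.local svc.cluster.local cluster.local"

dots=$(printf '%s' "$name" | tr -cd '.' | wc -c)
if [ "$dots" -lt 5 ]; then
  for d in $search_domains; do
    echo "NXDOMAIN: $name.$d"   # each search-domain attempt fails
  done
fi
echo "answer:   $name."          # the absolute query finally succeeds
```

With ndots:2, the name's 2 dots meet the threshold, so the resolver queries it as-is first and skips the wasted round-trips.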

Issue: Service Not Resolving

Symptom: nslookup api-service returns NXDOMAIN.

Debugging steps:

# 1. Does the Service exist?
kubectl get service api-service
# If not found, that's the problem

# 2. Try the fully qualified name
nslookup api-service.default.svc.cluster.local
# If this works but the short name doesn't, check search domains

# 3. Is the Service in a different namespace?
nslookup api-service.backend.svc.cluster.local

# 4. Does the Service have endpoints?
kubectl get endpoints api-service
# If empty, no Pods match the Service selector
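When the endpoints are empty, comparing the Service selector with actual Pod labels usually pinpoints the mismatch (a sketch — app=api is a hypothetical label; substitute the selector printed by the first command):

```shell
# What the Service selects
kubectl get service api-service -o jsonpath='{.spec.selector}'

# Which Pods actually carry those labels (app=api is a placeholder)
kubectl get pods -l app=api --show-labels

# Also check readiness: only Ready Pods become endpoints
kubectl get pods -l app=api
```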

Issue: CoreDNS Out of Memory

Symptom: CoreDNS Pods are OOMKilled.

Fix: Increase memory limits:

kubectl edit deployment coredns -n kube-system
# Increase resources.limits.memory

Or deploy NodeLocal DNSCache to reduce load on CoreDNS:

kubectl apply -f nodelocaldns.yaml
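The upstream NodeLocal DNSCache manifest contains placeholder variables that must be substituted before applying. A sketch following the upstream documentation — verify the kube-dns IP on your cluster, and note that 169.254.20.10 is the conventional link-local address for the node-local cache:

```shell
# Values for a typical cluster; confirm the first one with:
#   kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'
kubedns=10.96.0.10
domain=cluster.local
localdns=169.254.20.10

# Substitute the placeholders in the downloaded manifest (iptables mode)
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; \
        s/__PILLAR__DNS__DOMAIN__/$domain/g; \
        s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml

kubectl apply -f nodelocaldns.yaml
```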

DNS Debugging Toolkit

# Quick DNS test
kubectl run dns-test --rm -it --image=busybox:1.36 -- nslookup kubernetes.default

# Full DNS debugging toolkit
kubectl run dns-debug --rm -it --image=nicolaka/netshoot -- bash
# Then use: dig, nslookup, host, drill

# Check CoreDNS metrics
kubectl port-forward -n kube-system svc/kube-dns 9153:9153 &
curl -s http://localhost:9153/metrics | grep coredns_dns_responses_total

Why Interviewers Ask This

Interviewers ask this because DNS issues are one of the most common sources of connectivity problems in Kubernetes, and debugging them efficiently is a critical operational skill.

Common Follow-Up Questions

What is the most common cause of DNS failures in Kubernetes?
CoreDNS Pods being unhealthy (CrashLoopBackOff, OOMKilled) or network policies blocking UDP port 53 to the kube-system namespace.
Why do external DNS lookups sometimes feel slow?
The ndots:5 setting causes names like google.com to be tried against each cluster search domain first (several failed lookups, one per search domain, before the real one). Reducing ndots or using FQDNs with a trailing dot (google.com.) fixes this.
How do you test DNS from outside a Pod?
Run a temporary debug Pod: kubectl run dns-test --rm -it --image=busybox -- nslookup api-service

Key Takeaways

  • Always check CoreDNS Pod health first — most DNS issues stem from unhealthy CoreDNS.
  • Use nslookup or dig from within a Pod to test DNS resolution from the Pod's perspective.
  • Network policies blocking UDP 53 to kube-system are a common cause of DNS failures.
