How Do You Debug DNS Issues in Kubernetes?
Debug Kubernetes DNS by checking CoreDNS Pod health, verifying resolv.conf configuration, testing lookups from within Pods using nslookup or dig, inspecting CoreDNS logs, and validating that the kube-dns Service and endpoints exist. Common issues include CoreDNS crashes, misconfigured network policies blocking DNS, and ndots settings causing slow lookups.
Detailed Answer
DNS issues are among the most common problems in Kubernetes clusters. A systematic debugging approach can resolve most issues quickly.
Step 1: Check CoreDNS Health
# Are CoreDNS Pods running?
kubectl get pods -n kube-system -l k8s-app=kube-dns
# NAME READY STATUS RESTARTS AGE
# coredns-5d78c9869d-abc12 1/1 Running 0 5d
# coredns-5d78c9869d-def34 1/1 Running 0 5d
# Check for crashes or restarts
kubectl describe pod -n kube-system -l k8s-app=kube-dns
# View CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
Common CoreDNS issues:
- CrashLoopBackOff: Usually a Corefile syntax error
- OOMKilled: Increase memory limits
- High restart count: Check for forwarding loops
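When a cluster runs many CoreDNS replicas, scanning the Pod list by eye gets tedious. The sketch below filters the output of the command above for Pods that need attention. The helper name flag_unhealthy_pods is hypothetical, and it assumes the default column order (NAME, READY, STATUS, RESTARTS, AGE) with headers suppressed.

```shell
# Usage (assumes kubectl access to the cluster):
#   kubectl get pods -n kube-system -l k8s-app=kube-dns --no-headers | flag_unhealthy_pods
#
# Reads Pod rows on stdin and prints the name of every Pod that is not fully
# Ready, is not in the Running state, or has restarted at least once.
flag_unhealthy_pods() {
  awk '{
    split($2, ready, "/")   # READY column, e.g. "1/1"
    if (ready[1] != ready[2] || $3 != "Running" || $4 + 0 > 0) print $1
  }'
}
```

Any Pod name it prints is a candidate for kubectl describe and kubectl logs as shown above.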
Step 2: Verify the kube-dns Service
# Check the Service exists and has endpoints
kubectl get service kube-dns -n kube-system
# NAME TYPE CLUSTER-IP PORT(S)
# kube-dns ClusterIP 10.96.0.10 53/UDP,53/TCP,9153/TCP
kubectl get endpoints kube-dns -n kube-system
# NAME ENDPOINTS
# kube-dns 10.244.0.5:53,10.244.0.6:53 ← Must have endpoints
If the endpoints list is empty, no CoreDNS Pod is Running and Ready, so go back to Step 1.
Step 3: Test DNS from a Pod
# Launch a debug Pod
kubectl run dns-debug --rm -it --image=busybox:1.36 -- sh
# Test cluster DNS
nslookup kubernetes.default.svc.cluster.local
# Server: 10.96.0.10
# Address: 10.96.0.10:53
# Name: kubernetes.default.svc.cluster.local
# Address: 10.96.0.1
# Test a Service in your namespace
nslookup api-service.default.svc.cluster.local
# Test external DNS
nslookup google.com
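# Use dig for more detail (run from your workstation, not from inside the debug Pod;
# a different Pod name avoids a clash if dns-debug is still running)
kubectl run dns-dig --rm -it --restart=Never --image=tutum/dnsutils -- dig api-service.default.svc.cluster.local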
Step 4: Check Pod resolv.conf
kubectl exec my-pod -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
Verify:
- nameserver matches the kube-dns Service ClusterIP
- search includes your namespace
- ndots is set (default 5)
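The three checks above can be scripted so they are easy to repeat across Pods. This is a minimal sketch, not a complete validator: check_resolv_conf is a hypothetical helper that takes the expected nameserver IP and the Pod's namespace as arguments and reads the resolv.conf content on stdin.

```shell
# Usage (assumes kubectl access; my-pod and default are placeholders):
#   dns_ip="$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')"
#   kubectl exec my-pod -- cat /etc/resolv.conf | check_resolv_conf "$dns_ip" default
#
# Prints one line per failed check and returns non-zero if any check fails.
check_resolv_conf() {
  awk -v ns="$1" -v dom="$2.svc." '
    # nameserver must equal the expected ClusterIP
    $1 == "nameserver" && $2 == ns { ns_ok = 1 }
    # at least one search entry must start with "<namespace>.svc."
    $1 == "search"  { for (i = 2; i <= NF; i++) if (index($i, dom) == 1) search_ok = 1 }
    # an ndots option must be present
    $1 == "options" { for (i = 2; i <= NF; i++) if ($i ~ /^ndots:[0-9]+$/) ndots_ok = 1 }
    END {
      if (!ns_ok)     print "nameserver does not match " ns
      if (!search_ok) print "search list has no entry for namespace"
      if (!ndots_ok)  print "ndots option is not set"
      exit !(ns_ok && search_ok && ndots_ok)
    }'
}
```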
Common Issues and Solutions
Issue: CoreDNS Forwarding Loop
Symptom: CoreDNS repeatedly crashes with "Loop detected" in logs.
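Cause: The node's resolv.conf points at a loopback address (for example 127.0.0.53 from systemd-resolved), so CoreDNS forwards queries back to itself.
Fix: Configure forward in the Corefile to use explicit upstream DNS servers (substitute your own resolvers for the public ones shown here):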
forward . 8.8.8.8 8.8.4.4 {
    max_concurrent 1000
}
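Before changing the Corefile, it can help to confirm that the node's resolver configuration really contains a loopback nameserver. The sketch below does that check. find_loopback_nameservers is a hypothetical helper, and the file to feed it is an assumption: it should be whichever resolv.conf the kubelet hands to Pods on that node, often /etc/resolv.conf.

```shell
# Usage (run on the node, or against a copy of its resolv.conf):
#   find_loopback_nameservers < /etc/resolv.conf
#
# Prints every loopback nameserver (127.0.0.0/8 or ::1) found on stdin.
# Returns 0 if at least one was found, 1 otherwise.
find_loopback_nameservers() {
  awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2; found = 1 }
       END { exit !found }'
}
```

If it prints an address, forwarding to that file's nameservers sends queries back into the node's local resolver, which matches the loop described above.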
Issue: Network Policy Blocking DNS
Symptom: Pods cannot resolve any DNS names after applying network policies.
Fix: Allow egress to CoreDNS:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
Issue: Slow External DNS Resolution
Symptom: External names like api.stripe.com take several seconds to resolve.
Cause: With ndots:5, any name with fewer than five dots is tried against every search domain first, so api.stripe.com triggers several failed lookups before the real one.
Fix: Use FQDNs with trailing dots or reduce ndots:
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
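To see why lowering ndots helps, it is useful to list the names a resolver would try for a given query. The sketch below models glibc-style search-list behavior; other resolvers, such as musl in Alpine images, may order or skip candidates differently. expand_query is a hypothetical helper that takes the name, the ndots value, and the search domains as arguments.

```shell
# expand_query NAME NDOTS SEARCH_DOMAIN...
# Prints, in order, the fully qualified names a glibc-style resolver would try.
expand_query() {
  awk 'BEGIN {
    name = ARGV[1]; ndots = ARGV[2] + 0
    # A trailing dot marks the name as fully qualified: no search list is applied.
    if (name ~ /\.$/) { print name; exit }
    dots = gsub(/\./, ".", name)          # count the dots without changing the name
    # Enough dots: the name is tried as-is before the search list.
    if (dots >= ndots) print name "."
    for (i = 3; i < ARGC; i++) print name "." ARGV[i] "."
    # Too few dots: the name as-is is tried only after the search list.
    if (dots < ndots) print name "."
  }' "$@"
}

# Example with the search list from Step 4:
#   expand_query api.stripe.com 5 default.svc.cluster.local svc.cluster.local cluster.local
# lists three cluster-suffixed candidates before api.stripe.com. itself, while
#   expand_query api.stripe.com 2 default.svc.cluster.local svc.cluster.local cluster.local
# lists api.stripe.com. first.
```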
Issue: Service Not Resolving
Symptom: nslookup api-service returns NXDOMAIN.
Debugging steps:
# 1. Does the Service exist?
kubectl get service api-service
# If not found, that's the problem
# 2. Try the fully qualified name
nslookup api-service.default.svc.cluster.local
# If this works but the short name doesn't, check search domains
# 3. Is the Service in a different namespace?
nslookup api-service.backend.svc.cluster.local
# 4. Does the Service have endpoints?
kubectl get endpoints api-service
# If empty, no Pods match the Service selector
Issue: CoreDNS Out of Memory
Symptom: CoreDNS Pods are OOMKilled.
Fix: Increase memory limits:
kubectl edit deployment coredns -n kube-system
# Increase resources.limits.memory
Or deploy NodeLocal DNSCache to reduce load on CoreDNS:
kubectl apply -f nodelocaldns.yaml
DNS Debugging Toolkit
# Quick DNS test
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup kubernetes
# Full DNS debugging toolkit
kubectl run dns-debug --rm -it --image=nicolaka/netshoot -- bash
# Then use: dig, nslookup, host, drill
# Check CoreDNS metrics
kubectl port-forward -n kube-system svc/kube-dns 9153:9153 &
curl -s http://localhost:9153/metrics | grep coredns_dns_responses_total
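The raw metrics output is easier to read when the response counters are totalled per response code. The sketch below does that for the metric queried above. summarize_rcodes is a hypothetical helper, and it assumes each sample line carries an rcode label and ends with the counter value, which may vary between CoreDNS versions.

```shell
# Usage (with the port-forward above still running):
#   curl -s http://localhost:9153/metrics | summarize_rcodes
#
# Sums coredns_dns_responses_total samples by rcode and prints "RCODE TOTAL" lines.
summarize_rcodes() {
  awk 'index($0, "coredns_dns_responses_total{") == 1 {
    if (match($0, /rcode="[^"]*"/)) {
      rcode = substr($0, RSTART + 7, RLENGTH - 8)   # text between the quotes
      total[rcode] += $NF                           # last field is the counter value
    }
  }
  END { for (r in total) printf "%s %d\n", r, total[r] }' | sort
}
```

A large NXDOMAIN count is not necessarily a fault, since ndots:5 search-list expansion produces many failed lookups by design. A growing SERVFAIL count is more likely to point at CoreDNS or its upstream resolvers.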
Why Interviewers Ask This
Interviewers ask this because DNS issues are one of the most common sources of connectivity problems in Kubernetes, and debugging them efficiently is a critical operational skill.
Key Takeaways
- Check CoreDNS Pod health first: if CoreDNS is down or crash-looping, cluster DNS lookups fail for every Pod.
- Use nslookup or dig from within a Pod to test DNS resolution from the Pod's perspective.
- Network policies that block egress on port 53 (UDP and TCP) to kube-system are a common cause of DNS failures.