How Does Service Discovery Work in Kubernetes?
Kubernetes provides two built-in service discovery mechanisms: DNS-based discovery via CoreDNS (the primary method) and environment variable injection. CoreDNS creates DNS records for every Service, so Pods can find other Services by name without hard-coding IP addresses.
Service Discovery in Kubernetes
When you have dozens or hundreds of microservices in a cluster, each needs a way to find and communicate with the others. Kubernetes provides two built-in mechanisms for this: DNS-based discovery and environment variable injection.
DNS-Based Discovery (Primary Method)
CoreDNS is deployed as a cluster add-on and is the default DNS server in Kubernetes. Every Service created in the cluster automatically gets DNS records.
A Records for Services
Every ClusterIP Service gets an A record (and an AAAA record on IPv6 or dual-stack clusters) that maps its name to the ClusterIP:
<service-name>.<namespace>.svc.cluster.local -> <ClusterIP>
Example:
# From any Pod in the cluster
nslookup payment-service.production.svc.cluster.local
Name: payment-service.production.svc.cluster.local
Address: 10.96.88.200
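The naming pattern can be sketched as a small helper. This is only an illustration, not part of any Kubernetes client library: the function name `service_fqdn` is hypothetical, and `cluster.local` is assumed as the cluster domain (it is the default, but clusters can be configured with a different one).

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name registered for a Service (hypothetical helper)."""
    # Pattern from the text: <service-name>.<namespace>.svc.<cluster-domain>
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("payment-service", "production"))
```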
Short Names and Search Domains
The Pod's /etc/resolv.conf is configured with search domains, so you do not always need the full FQDN:
# /etc/resolv.conf inside a Pod in the 'production' namespace
nameserver 10.96.0.10
search production.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
This means:
| Query | Where It Works |
|---|---|
| payment-service | Works within the same namespace |
| payment-service.production | Works from any namespace |
| payment-service.production.svc.cluster.local | Full FQDN, works everywhere |
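To see why the short names resolve, it can help to list the names a resolver would try. The sketch below approximates glibc-style search-list handling under stated assumptions: the function name `candidate_queries` is hypothetical, the search list is copied from the resolv.conf shown above, and other resolver implementations (for example musl) may behave differently.

```python
def candidate_queries(name: str, search: list[str], ndots: int = 5) -> list[str]:
    """Return the names a glibc-style stub resolver would try, in order."""
    # A trailing dot marks the name as fully qualified: the search list is skipped.
    if name.endswith("."):
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search]
    # With at least `ndots` dots the name is tried as-is first, otherwise last.
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]


# Search list of a Pod in the 'production' namespace (from the listing above).
SEARCH = ["production.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for query in candidate_queries("payment-service.production", SEARCH):
    print(query)
```

In this model, `payment-service` matches on the first search domain only when the Pod is in the same namespace, while `payment-service.production` matches on the second search domain (`svc.cluster.local`), which every Pod has regardless of namespace.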
SRV Records
Kubernetes also creates SRV records for named ports, following this pattern:
_<port-name>._<protocol>.<service>.<namespace>.svc.cluster.local
Example for a Service with a port named http:
nslookup -type=SRV _http._tcp.payment-service.production.svc.cluster.local
_http._tcp.payment-service.production.svc.cluster.local SRV 0 100 80 payment-service.production.svc.cluster.local.
SRV records are useful for discovering both the hostname and port number dynamically.
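As a rough illustration of how an SRV answer carries both values, the sketch below builds the SRV query name and parses a record line in the format shown above. The names `srv_name`, `parse_srv`, and `SrvRecord` are hypothetical, and the parser assumes the simplified `<name> SRV <priority> <weight> <port> <target>` layout from this article; real `nslookup` or `dig` output can be formatted differently.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_name(port_name: str, protocol: str, service: str, namespace: str,
             cluster_domain: str = "cluster.local") -> str:
    """Build the SRV query name for a named Service port."""
    return f"_{port_name}._{protocol}.{service}.{namespace}.svc.{cluster_domain}"


def parse_srv(line: str) -> SrvRecord:
    """Parse '<name> SRV <priority> <weight> <port> <target>' (assumed format)."""
    fields = line.split()
    idx = fields.index("SRV")
    priority, weight, port, target = fields[idx + 1: idx + 5]
    # Strip the trailing dot that marks the target as fully qualified.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


name = srv_name("http", "tcp", "payment-service", "production")
record = parse_srv(f"{name} SRV 0 100 80 payment-service.production.svc.cluster.local.")
print(record.target, record.port)
```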
Headless Service DNS
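For headless Services (clusterIP: None), DNS returns one A record per backing Pod instead of a single ClusterIP. Per-Pod hostnames such as cassandra-0.cassandra appear when the Pods belong to a StatefulSet governed by the headless Service: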
nslookup cassandra.default.svc.cluster.local
Address 1: 10.244.1.5 cassandra-0.cassandra.default.svc.cluster.local
Address 2: 10.244.2.8 cassandra-1.cassandra.default.svc.cluster.local
Address 3: 10.244.3.12 cassandra-2.cassandra.default.svc.cluster.local
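From application code, the per-Pod addresses of a headless Service can be collected with the standard resolver. The following is a minimal sketch using Python's `socket.getaddrinfo`; the function name `resolve_all` is hypothetical, only IPv4 is queried, and the Cassandra name in the comment only resolves from inside a cluster that actually runs that Service.

```python
import socket


def resolve_all(hostname: str) -> list[str]:
    """Return every IPv4 address the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Inside a Pod one might call (assumed Service name from the example above):
#   resolve_all("cassandra.default.svc.cluster.local")
# An IP literal is used here so the sketch also runs outside a cluster.
print(resolve_all("127.0.0.1"))
```

For a regular ClusterIP Service the same call would be expected to return a single address, the ClusterIP.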
ExternalName DNS
ExternalName Services return a CNAME record:
nslookup external-db.production.svc.cluster.local
external-db.production.svc.cluster.local CNAME mydb.us-east-1.rds.amazonaws.com
Environment Variable Discovery (Legacy Method)
When a Pod starts, Kubernetes injects environment variables for every Service that exists in the same namespace at that point in time:
# Environment variables for a service named "redis-master" on port 6379
REDIS_MASTER_SERVICE_HOST=10.96.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.96.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.96.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.96.0.11
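The naming scheme in this listing can be reproduced with a few string operations: the Service name is upper-cased and dashes become underscores. The sketch below only mirrors the listing above for a single-port Service; `service_env_vars` is a hypothetical helper, not the kubelet's actual code.

```python
def service_env_vars(name: str, cluster_ip: str, port: int,
                     protocol: str = "TCP") -> dict[str, str]:
    """Reproduce the variables shown above for a single-port Service."""
    prefix = name.upper().replace("-", "_")       # redis-master -> REDIS_MASTER
    proto = protocol.lower()
    url = f"{proto}://{cluster_ip}:{port}"
    port_prefix = f"{prefix}_PORT_{port}_{protocol.upper()}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": url,
        port_prefix: url,
        f"{port_prefix}_PROTO": proto,
        f"{port_prefix}_PORT": str(port),
        f"{port_prefix}_ADDR": cluster_ip,
    }


for key, value in service_env_vars("redis-master", "10.96.0.11", 6379).items():
    print(f"{key}={value}")
```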
Limitations of Environment Variables
- Order dependency -- The Service must exist before the Pod is created. If the Pod starts first, the variables are not injected.
- Static -- The values are set at Pod creation and never updated. If the Service's ClusterIP changes (e.g., after deletion and re-creation), the Pod has stale data.
- Namespace-scoped -- Only Services in the same namespace are injected.
- Environment pollution -- In clusters with many Services, the number of injected variables can become very large.
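You can disable injection of these per-Service variables for a Pod by setting enableServiceLinks to false (the variables for the built-in kubernetes API Service are still injected):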
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  enableServiceLinks: false
  containers:
    - name: app
      image: myapp:1.0
CoreDNS Configuration
CoreDNS configuration is stored in a ConfigMap:
kubectl get configmap coredns -n kube-system -o yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
Key settings:
| Directive | Purpose |
|---|---|
| kubernetes | Enables the Kubernetes plugin for Service/Pod DNS resolution |
| forward | Forwards non-cluster queries to upstream DNS |
| cache 30 | Caches responses for up to 30 seconds |
| loadbalance | Randomizes the order of A records in responses |
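When reviewing a Corefile, a quick way to see which plugins are enabled is to list the directives that sit directly inside a server block. The sketch below is a deliberately simplified, hypothetical parser (`top_level_plugins` is not a CoreDNS tool): it assumes one directive per line and braces placed as in the Corefile above.

```python
def top_level_plugins(corefile: str) -> list[str]:
    """List plugin names that appear directly inside a server block."""
    plugins = []
    depth = 0
    for raw in corefile.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line == "}":
            depth -= 1
            continue
        if depth == 1:
            plugins.append(line.split()[0])  # first token is the plugin name
        if line.endswith("{"):
            depth += 1                       # entering a server or plugin block
    return plugins


SAMPLE = """
.:53 {
    errors
    health {
        lameduck 5s
    }
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
    }
    forward . /etc/resolv.conf
    cache 30
}
"""

print(top_level_plugins(SAMPLE))
```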
Debugging DNS Issues
Step 1: Verify CoreDNS is Running
kubectl get pods -n kube-system -l k8s-app=kube-dns
Step 2: Test from a Debug Pod
apiVersion: v1
kind: Pod
metadata:
  name: dns-debug
spec:
  containers:
    - name: debug
      image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
      command: ["sleep", "3600"]
kubectl exec dns-debug -- nslookup payment-service.production.svc.cluster.local
kubectl exec dns-debug -- nslookup kubernetes.default.svc.cluster.local
Step 3: Check the ndots Setting
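The default ndots:5 means that any name with fewer than 5 dots is tried against each search domain before it is queried as-is. An external name such as api.example.com therefore produces several failed lookups before the real one succeeds. For performance-sensitive applications, you can lower the value in the Pod spec: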
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
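To confirm which settings are actually in effect, the contents of the Pod's /etc/resolv.conf can be inspected. A minimal parsing sketch follows; `parse_resolv_conf` is a hypothetical helper that handles only the nameserver, search, and options keywords, and the sample text repeats the values used earlier in this article.

```python
def parse_resolv_conf(text: str) -> dict:
    """Extract nameservers, search domains, and options from resolv.conf text."""
    config = {"nameservers": [], "search": [], "options": {}}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver":
            config["nameservers"].extend(values)
        elif keyword == "search":
            config["search"] = values
        elif keyword == "options":
            for opt in values:
                key, _, value = opt.partition(":")  # e.g. "ndots:5"
                config["options"][key] = value
    return config


SAMPLE = """\
nameserver 10.96.0.10
search production.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""

print(parse_resolv_conf(SAMPLE)["options"].get("ndots"))
```

Inside a Pod, the same function could be fed the output of reading /etc/resolv.conf to check whether a dnsConfig override took effect.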
Step 4: Check CoreDNS Logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100
DNS vs. Environment Variables: When to Use Each
| Aspect | DNS | Environment Variables |
|---|---|---|
| Dynamic updates | Yes | No (static at Pod start) |
| Cross-namespace | Yes | No |
| Service creation order | Does not matter | Service must exist first |
| Port discovery | SRV records | *_SERVICE_PORT vars |
| Recommendation | Primary method | Legacy, avoid if possible |
Summary
Kubernetes service discovery is built on CoreDNS, which automatically registers DNS records for every Service. Pods can discover other Services by name without hard-coding IPs. Environment variable injection provides a legacy alternative but is static and limited. Understanding how DNS search domains, SRV records, and CoreDNS configuration work is essential for debugging connectivity issues in production clusters.
Why Interviewers Ask This
Service discovery is fundamental to microservice architecture. Interviewers want to see that candidates understand how Pods find each other and can troubleshoot DNS-related issues in production.
Key Takeaways
- DNS-based discovery via CoreDNS is the primary and recommended service discovery mechanism.
- ClusterIP and headless Services get A records, named ports get SRV records, and ExternalName Services get CNAME records, all registered automatically.
- Environment variables provide a legacy alternative but are static and order-dependent.