How Does Service Discovery Work in Kubernetes?

Kubernetes provides two built-in service discovery mechanisms: DNS-based discovery via CoreDNS (the primary method) and environment variable injection. CoreDNS creates DNS records for every Service, so Pods can find other Services by name instead of hard-coding IP addresses.

Service Discovery in Kubernetes

When you have dozens or hundreds of microservices in a cluster, each needs a way to find and communicate with the others. Kubernetes provides two built-in mechanisms for this: DNS-based discovery and environment variable injection.

DNS-Based Discovery (Primary Method)

CoreDNS is deployed as a cluster add-on and is the default DNS server in Kubernetes. Every Service created in the cluster automatically gets DNS records.

A Records for Services

Every ClusterIP Service gets an A record (and an AAAA record if it has an IPv6 address):

<service-name>.<namespace>.svc.cluster.local  ->  <ClusterIP>

Example:

# From any Pod in the cluster
nslookup payment-service.production.svc.cluster.local

Name:    payment-service.production.svc.cluster.local
Address: 10.96.88.200
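
Because the record name follows a fixed pattern, client code can build it from the Service name and namespace. A minimal sketch, assuming the default cluster domain cluster.local; the helper name service_fqdn is illustrative and not part of any Kubernetes client library:

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Service.

    cluster_domain defaults to cluster.local; clusters can be configured
    with a different domain, so treat the default as an assumption.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("payment-service", "production"))
```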

Short Names and Search Domains

The Pod's /etc/resolv.conf is configured with search domains, so you do not always need the full FQDN:

# /etc/resolv.conf inside a Pod in the 'production' namespace
nameserver 10.96.0.10
search production.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

This means:

| Query | Where it works |
|---|---|
| payment-service | Works within the same namespace |
| payment-service.production | Works from any namespace |
| payment-service.production.svc.cluster.local | Full FQDN, works everywhere |
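
The short names work because the resolver appends each search domain in turn. The sketch below is a rough model of that walk, assuming glibc-style behavior (other resolvers, such as musl, differ in the details); the function name candidate_queries is illustrative only:

```python
def candidate_queries(name: str, search: list[str], ndots: int = 5) -> list[str]:
    """Return the names a resolver would try, in order, for a lookup.

    Rough model of glibc-style search handling (an assumption):
    - a trailing dot marks the name as absolute, so the search list is skipped;
    - a name with at least `ndots` dots is tried as-is first;
    - otherwise every search domain is tried before the name as-is.
    """
    if name.endswith("."):
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]


# Search list taken from the resolv.conf example for the 'production' namespace.
SEARCH = ["production.svc.cluster.local", "svc.cluster.local", "cluster.local"]

# The first candidate misses; the second is the Service's real FQDN.
for query in candidate_queries("payment-service.production", SEARCH):
    print(query)
```

For a bare payment-service, the first candidate is already the FQDN in the caller's own namespace, which is why the shortest form only works there.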

SRV Records

Kubernetes also creates SRV records for named ports, following this pattern:

_<port-name>._<protocol>.<service>.<namespace>.svc.cluster.local

Example for a Service with a port named http:

nslookup -type=SRV _http._tcp.payment-service.production.svc.cluster.local

_http._tcp.payment-service.production.svc.cluster.local  SRV  0 100 80 payment-service.production.svc.cluster.local.

SRV records are useful for discovering both the hostname and port number dynamically.
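
As a sketch of how a client might use this, the code below builds the SRV query name and splits the record data from the example above into its fields. The helper names are hypothetical, and the actual DNS query is left out because the Python standard library has no SRV lookup:

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(port_name: str, protocol: str, service: str, namespace: str,
                   cluster_domain: str = "cluster.local") -> str:
    """Build the SRV name for a named Service port (cluster.local is assumed)."""
    return f"_{port_name}._{protocol}.{service}.{namespace}.svc.{cluster_domain}"


def parse_srv_rdata(rdata: str) -> SrvRecord:
    """Parse 'priority weight port target' as printed in the SRV answer."""
    priority, weight, port, target = rdata.split()
    # Drop the trailing dot that marks the target as an absolute name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


record = parse_srv_rdata("0 100 80 payment-service.production.svc.cluster.local.")
print(srv_query_name("http", "tcp", "payment-service", "production"))
print(f"{record.target}:{record.port}")
```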

Headless Service DNS

For headless Services (clusterIP: None), DNS returns one A record per ready backing Pod instead of a single ClusterIP:

nslookup cassandra.default.svc.cluster.local

Address 1: 10.244.1.5 cassandra-0.cassandra.default.svc.cluster.local
Address 2: 10.244.2.8 cassandra-1.cassandra.default.svc.cluster.local
Address 3: 10.244.3.12 cassandra-2.cassandra.default.svc.cluster.local
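
A client that wants to connect to individual Pods can collect every address the lookup returns. A minimal sketch using the standard library's socket.getaddrinfo; the function name resolve_all is illustrative, and the result depends on the resolver of the machine it runs on:

```python
import socket


def resolve_all(hostname: str, port: int = 0) -> list[str]:
    """Return every distinct IP address the system resolver reports.

    For a headless Service this would be one address per backing Pod;
    for a ClusterIP Service it would be the single virtual IP.
    """
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name does not resolve (for example, outside the cluster).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Inside the cluster, resolve_all("cassandra.default.svc.cluster.local") would be expected to return the Pod IPs shown above; outside the cluster it returns an empty list.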

ExternalName DNS

ExternalName Services return a CNAME record:

nslookup external-db.production.svc.cluster.local

external-db.production.svc.cluster.local  CNAME  mydb.us-east-1.rds.amazonaws.com

Environment Variable Discovery (Legacy Method)

When a Pod starts, Kubernetes injects environment variables for every Service that exists in the same namespace at that point in time:

# Environment variables for a service named "redis-master" on port 6379
REDIS_MASTER_SERVICE_HOST=10.96.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.96.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.96.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.96.0.11
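
The variable names are derived from the Service name by upper-casing it and replacing dashes with underscores. The sketch below reproduces the listing above for a single-port Service; it models the naming scheme only, is not the kubelet's implementation, and the function name service_env is hypothetical:

```python
def service_env(name: str, cluster_ip: str, port: int, protocol: str = "tcp") -> dict[str, str]:
    """Model the variables injected for a single-port Service."""
    prefix = name.upper().replace("-", "_")  # redis-master -> REDIS_MASTER
    url = f"{protocol}://{cluster_ip}:{port}"
    port_prefix = f"{prefix}_PORT_{port}_{protocol.upper()}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": url,
        port_prefix: url,
        f"{port_prefix}_PROTO": protocol,
        f"{port_prefix}_PORT": str(port),
        f"{port_prefix}_ADDR": cluster_ip,
    }


for key, value in service_env("redis-master", "10.96.0.11", 6379).items():
    print(f"{key}={value}")
```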

Limitations of Environment Variables

Order dependency -- The Service must exist before the Pod is created. If the Pod starts first, the variables are not injected.
Static -- The values are set at Pod creation and never updated. If the Service's ClusterIP changes (e.g., after deletion and re-creation), the Pod has stale data.
Namespace-scoped -- Only Services in the same namespace are injected.
Environment pollution -- In clusters with many Services, the number of injected variables can become very large.
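
Code that still reads the injected variables should tolerate their absence, since a Pod created before the Service never receives them. One possible pattern, sketched under the assumption of the default cluster.local domain, falls back to the DNS name; service_address and its parameters are hypothetical names:

```python
import os
from typing import Mapping


def service_address(service: str, namespace: str, default_port: int,
                    environ: Mapping[str, str] = os.environ) -> tuple[str, int]:
    """Return (host, port) from the injected variables, else the DNS name."""
    prefix = service.upper().replace("-", "_")
    host = environ.get(f"{prefix}_SERVICE_HOST")
    port = environ.get(f"{prefix}_SERVICE_PORT")
    if host and port:
        # Values captured at Pod creation; they may be stale if the
        # Service was deleted and re-created since then.
        return host, int(port)
    # Variables missing (Pod created first, other namespace, or injection disabled).
    return f"{service}.{namespace}.svc.cluster.local", default_port


# With no injected variables, the DNS name is used.
print(service_address("redis-master", "production", 6379, environ={}))
```

Using the DNS name unconditionally avoids the stale-value problem altogether, which is why DNS is the primary method.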

You can disable environment variable injection for a Pod:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  enableServiceLinks: false
  containers:
    - name: app
      image: myapp:1.0

CoreDNS Configuration

CoreDNS configuration is stored in a ConfigMap:

kubectl get configmap coredns -n kube-system -o yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }

Key settings:

| Directive | Purpose |
|---|---|
| kubernetes | Enables the Kubernetes plugin for Service/Pod DNS resolution |
| forward | Forwards non-cluster queries to upstream DNS |
| cache 30 | Caches responses for 30 seconds |
| loadbalance | Randomizes the order of A records in responses |
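
When reviewing a Corefile, it can help to list which plugins a server block enables. The following is a rough sketch that tracks brace depth line by line; it is not CoreDNS's own parser, assumes one directive per line as in the ConfigMap above, and top_level_plugins is a hypothetical helper name:

```python
COREFILE = """
.:53 {
    errors
    health {
      lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
      ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
      max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
"""


def top_level_plugins(corefile: str) -> list[str]:
    """List the directives that sit directly inside a server block."""
    plugins = []
    depth = 0
    for raw in corefile.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Depth 1 means "inside the server block, outside any plugin block".
        if depth == 1 and line != "}":
            plugins.append(line.split()[0])
        depth += line.count("{") - line.count("}")
    return plugins


print(top_level_plugins(COREFILE))
```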

Debugging DNS Issues

Step 1: Verify CoreDNS is Running

kubectl get pods -n kube-system -l k8s-app=kube-dns

Step 2: Test from a Debug Pod

apiVersion: v1
kind: Pod
metadata:
  name: dns-debug
spec:
  containers:
    - name: debug
      image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
      command: ["sleep", "3600"]

kubectl exec dns-debug -- nslookup payment-service.production.svc.cluster.local
kubectl exec dns-debug -- nslookup kubernetes.default.svc.cluster.local

Step 3: Check the ndots Setting

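The default ndots:5 means that any name with fewer than 5 dots is tried against each search domain before it is queried as-is. For an external name such as api.example.com, that adds several failed lookups before the one that succeeds. A trailing dot (api.example.com.) skips the search list for a single name. For performance-sensitive applications, you can also lower ndots for the whole Pod: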

spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
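
To estimate what a given ndots value costs, count how many search-domain lookups precede the query for the name as written. A simplified sketch, assuming the resolver tries every search domain first whenever the name has fewer dots than ndots; the function name and the example hostname are illustrative:

```python
def search_lookups_before_absolute(name: str, search_domains: int, ndots: int) -> int:
    """Count search-list lookups tried before the name is queried as-is."""
    if name.endswith("."):
        return 0  # a trailing dot marks the name as absolute
    return search_domains if name.count(".") < ndots else 0


# A Pod's resolv.conf in the example above lists three search domains.
for ndots in (5, 2):
    extra = search_lookups_before_absolute("api.example.com", 3, ndots)
    print(f"ndots={ndots}: {extra} extra lookups")
```

With ndots set to 2, a name like payment-service.production (one dot) still goes through the search list, so the short forms keep working.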

Step 4: Check CoreDNS Logs

kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100

DNS vs. Environment Variables: When to Use Each

| Aspect | DNS | Environment Variables |
|---|---|---|
| Dynamic updates | Yes | No (static at Pod start) |
| Cross-namespace | Yes | No |
| Service creation order | Does not matter | Service must exist first |
| Port discovery | SRV records | *_SERVICE_PORT vars |
| Recommendation | Primary method | Legacy, avoid if possible |

Summary

Kubernetes service discovery is built on CoreDNS, which automatically registers DNS records for every Service. Pods can discover other Services by name without hard-coding IPs. Environment variable injection provides a legacy alternative but is static and limited. Understanding how DNS search domains, SRV records, and CoreDNS configuration work is essential for debugging connectivity issues in production clusters.