What Is Topology-Aware Routing for Kubernetes Services?

advanced | services | sre | platform engineer | CKA
TL;DR

Topology-aware routing (formerly Topology-Aware Hints) directs Service traffic to endpoints in the same zone as the client Pod, reducing cross-zone network costs and latency in multi-zone clusters.

Detailed Answer

In multi-zone Kubernetes clusters, Service traffic is distributed randomly across all endpoints by default — even endpoints in different availability zones. Topology-aware routing optimizes this by preferring endpoints in the same zone as the requesting Pod, reducing cross-zone latency and data transfer costs.

Why Cross-Zone Traffic Is a Problem

Cloud providers charge for cross-zone data transfer:

  • AWS: ~$0.01/GB per direction between AZs
  • GCP: ~$0.01/GB between zones in the same region
  • Azure: Metered inter-zone traffic

For a service handling 1TB/day of internal traffic spread evenly across 3 zones, roughly two-thirds of that traffic crosses a zone boundary. That is about 20TB/month, or ~$200/month at $0.01/GB, and about twice that where both directions are billed.
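The estimate can be reproduced with a few lines of shell. This is a rough sketch that assumes traffic is spread uniformly, so a fraction (zones - 1) / zones of it leaves the client's zone, and that transfer is billed once at a flat per-GB rate. The function name `cross_zone_cost` and the 30-day month are illustrative assumptions.

```shell
# Rough monthly cross-zone transfer cost, assuming uniform traffic spread.
# Arguments: daily traffic in GB, number of zones, price per GB,
# days per month (optional, defaults to 30).
cross_zone_cost() {
  awk -v daily="$1" -v zones="$2" -v price="$3" -v days="${4:-30}" 'BEGIN {
    cross_fraction = (zones - 1) / zones   # share of traffic leaving the zone
    printf "%.2f\n", daily * cross_fraction * days * price
  }'
}

# 1 TB/day (1000 GB), 3 zones, $0.01/GB
cross_zone_cost 1000 3 0.01   # prints 200.00
```

Passing 0.02 as the price models a provider that bills both directions.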

Enabling Topology-Aware Routing

apiVersion: v1
kind: Service
metadata:
  name: api-server
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: api
  ports:
    - port: 8080
      targetPort: 8080

When topology-mode: Auto is set, the EndpointSlice controller adds hints to each endpoint indicating which zone's traffic it should receive.
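For a Service that already exists, the same annotation can be applied in place with `kubectl annotate`. A minimal sketch follows. The wrapper name `enable_topology_routing` and the fallback to the `default` namespace are illustrative choices, not part of any tool.

```shell
# Apply the topology-mode annotation to an existing Service.
# Arguments: Service name, namespace (optional, defaults to "default").
enable_topology_routing() {
  kubectl annotate service "$1" -n "${2:-default}" \
    service.kubernetes.io/topology-mode=Auto --overwrite
}

# Example (requires cluster access):
# enable_topology_routing api-server
```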

How It Works Under the Hood

  1. The EndpointSlice controller evaluates endpoint distribution across zones
  2. If distribution is sufficiently balanced, it assigns zone hints to endpoints
  3. kube-proxy reads these hints and programs iptables/IPVS rules that route only to endpoints hinted for its own zone
  4. Traffic from zone-a clients goes to zone-a endpoints

# Inspect endpoint hints
kubectl get endpointslices -l kubernetes.io/service-name=api-server -o yaml

The EndpointSlice shows hints like:

endpoints:
  - addresses: ["10.1.1.5"]
    zone: "us-east-1a"
    hints:
      forZones:
        - name: "us-east-1a"
  - addresses: ["10.1.2.8"]
    zone: "us-east-1b"
    hints:
      forZones:
        - name: "us-east-1b"
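
How a node's kube-proxy consumes these hints can be modeled in a few lines of shell. This is a simplified sketch, not kube-proxy's actual code: the function name `select_endpoints` and the two-column input format (address, hinted zone) are made up for illustration. It keeps only endpoints hinted for the node's zone, and falls back to all endpoints if any endpoint lacks a hint or none is hinted for that zone.

```shell
# Reads "ADDRESS [HINT_ZONE]" lines on stdin and prints the addresses that a
# kube-proxy running in zone $1 would route to under this simplified model.
select_endpoints() {
  awk -v zone="$1" '
    NF == 0 { next }                        # skip blank lines
    {
      all[++total] = $1
      if (NF < 2) missing_hint = 1          # endpoint without a hint
      else if ($2 == zone) zonal[++n] = $1  # endpoint hinted for this zone
    }
    END {
      if (missing_hint || n == 0) {         # fallback: ignore hints entirely
        for (i = 1; i <= total; i++) print all[i]
      } else {
        for (i = 1; i <= n; i++) print zonal[i]
      }
    }'
}

# The two endpoints from the EndpointSlice above, seen from us-east-1a
printf '%s\n' '10.1.1.5 us-east-1a' '10.1.2.8 us-east-1b' | select_endpoints us-east-1a
```

Calling it with a zone that has no hinted endpoints, such as a hypothetical `us-east-1c`, prints both addresses, which mirrors the cluster-wide fallback.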

Safeguards and Fallback

Topology-aware routing includes automatic safeguards:

| Condition | Behavior |
|-----------|----------|
| Endpoints evenly distributed | Hints are applied, zone-local routing is active |
| Endpoints severely imbalanced | Hints are removed, traffic is distributed globally |
| Fewer endpoints than zones | Hints are not generated |
| An endpoint becomes not-ready | Hints are recalculated to redistribute load |

This prevents a zone with only 1 endpoint from being overwhelmed while zones with 10 endpoints are idle.

Verifying Topology-Aware Routing

# Check if hints are populated
kubectl get endpointslice -l kubernetes.io/service-name=api-server \
  -o jsonpath='{range .items[*].endpoints[*]}{.addresses} -> {.hints.forZones}{"\n"}{end}'

# Check kube-proxy logs for topology-aware behavior
kubectl logs -n kube-system -l k8s-app=kube-proxy | grep -i topology

# Verify zone labels on nodes
kubectl get nodes --label-columns=topology.kubernetes.io/zone
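
When hints are missing, a quick per-zone endpoint count often shows why. The sketch below builds on the `zone` field shown in the EndpointSlice example. The helper names `count_by_zone` and `endpoints_per_zone` are illustrative, and the count includes endpoints regardless of readiness.

```shell
# Turn a stream of zone names (one per line) into "zone: count" lines.
count_by_zone() {
  awk 'NF' | sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Count the endpoints of a Service per zone; requires cluster access.
endpoints_per_zone() {
  kubectl get endpointslices -l "kubernetes.io/service-name=$1" \
    -o jsonpath='{range .items[*].endpoints[*]}{.zone}{"\n"}{end}' | count_by_zone
}

# Example:
# endpoints_per_zone api-server
```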

Requirements

For topology-aware routing to work:

  1. Nodes must have zone labels: topology.kubernetes.io/zone (cloud providers add this automatically)
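  2. Endpoints must be balanced: Roughly proportional to the allocatable CPU of the nodes in each zone
  3. kube-proxy must be v1.23+: Older versions do not enable topology hints by default. The service.kubernetes.io/topology-mode annotation requires v1.27+; earlier versions use service.kubernetes.io/topology-aware-hints: auto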
  4. EndpointSlices must be enabled: This is the default since 1.21

Deployment Pattern for Zone Balance

Ensure your Deployments produce balanced endpoints across zones by using topology spread constraints:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 6
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: api
      containers:
        - name: api
          image: api-server:1.0
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
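
With `maxSkew: 1` and `DoNotSchedule`, the scheduler will not place a Pod where doing so would push the gap between the most and least populated zone above one. The helper below computes that gap from per-zone Pod counts, one count per line. It is a sketch: `zone_skew` is an illustrative name, and a zone with zero matching Pods must be passed explicitly as `0`, since it would otherwise be invisible to the calculation.

```shell
# Reads per-zone Pod counts (one integer per line) and prints max - min.
zone_skew() {
  awk '
    NF == 0 { next }                          # skip blank lines
    !seen { min = max = $1 + 0; seen = 1 }    # initialize from first count
    $1 + 0 < min { min = $1 + 0 }
    $1 + 0 > max { max = $1 + 0 }
    END { if (seen) print max - min }'
}

# 6 replicas spread evenly over 3 zones
printf '2\n2\n2\n' | zone_skew   # prints 0
```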

Topology-Aware Routing vs. Service Mesh

| Feature | Topology-Aware Routing | Service Mesh (Istio) |
|---------|------------------------|----------------------|
| Setup complexity | Single annotation | Full mesh installation |
| Traffic control | Zone preference only | Zone, locality, failover |
| Weighted failover | No (all-or-nothing) | Yes (configurable priorities) |
| Observability | None built-in | Full traffic metrics |

For simple zone-local routing, the built-in topology-aware routing is sufficient. For advanced locality-aware load balancing with failover policies, a service mesh provides more control.

Why Interviewers Ask This

Cross-zone traffic is a significant cost driver in cloud environments. This question tests your understanding of how to optimize network traffic patterns for both cost and performance.

Common Follow-Up Questions

How do you enable topology-aware routing?
Add the annotation service.kubernetes.io/topology-mode: Auto to the Service. The EndpointSlice controller then populates topology hints on each endpoint.

When does topology-aware routing not work?
Hints are withheld when endpoints are unevenly distributed across zones, when a zone has too few endpoints to handle its share of traffic, or when the Service has fewer endpoints than zones.

What replaced the deprecated topologyKeys field on Services?
The topologyKeys field was removed in 1.22. Topology-aware hints (annotation-based) replaced it, and the feature was renamed to topology-aware routing in 1.27.

Key Takeaways

  • Topology-aware routing keeps traffic within the same zone, reducing cross-zone data transfer costs.
  • It requires evenly distributed endpoints across zones to function correctly.
  • Enable it with a simple annotation — no service mesh or CNI changes required.
