How Does Pod Topology Spread Work in Kubernetes?

advanced | pods | devops | sre | platform engineer | CKA
TL;DR

Pod topology spread constraints distribute Pods evenly across topology domains (nodes, zones, regions) based on a maxSkew value, giving you finer control over Pod distribution than anti-affinity alone.

Detailed Answer

Topology spread constraints graduated to stable in Kubernetes 1.19 as a general-purpose mechanism to distribute Pods evenly across configurable topology domains. They solve a problem that pod anti-affinity handles only partially: ensuring balanced distribution rather than just separation.

The Problem with Anti-Affinity Alone

Pod anti-affinity with topologyKey: topology.kubernetes.io/zone prevents two Pods from landing in the same zone, but it does not balance them. With 6 replicas and 3 zones, anti-affinity alone could place 4 Pods in zone-a and 1 each in zone-b and zone-c. Topology spread constraints enforce a maximum imbalance.
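The skew arithmetic behind this example can be sketched in a few lines of Python. This is an illustrative calculation, not scheduler code — the real scheduler also counts eligible domains that currently hold zero matching Pods:

```python
from collections import Counter

def skew(assignments):
    """Skew = highest Pod count in any domain minus the lowest.

    `assignments` maps pod name -> zone. Only zones that appear in
    the mapping are considered here, which is a simplification of
    the scheduler's notion of eligible domains.
    """
    counts = Counter(assignments.values())
    return max(counts.values()) - min(counts.values())

# A layout anti-affinity alone could allow for 6 replicas / 3 zones:
unbalanced = {f"web-{i}": z for i, z in enumerate(
    ["zone-a", "zone-a", "zone-a", "zone-a", "zone-b", "zone-c"])}
print(skew(unbalanced))  # 4 - 1 = 3, violates maxSkew: 1

# The layout a spread constraint with maxSkew: 1 enforces:
balanced = {f"web-{i}": z for i, z in enumerate(
    ["zone-a", "zone-b", "zone-c"] * 2)}
print(skew(balanced))  # 2 - 2 = 0, satisfies maxSkew: 1
```

A constraint with maxSkew: 1 rejects any placement that would push this difference above 1.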

Basic Topology Spread Constraint

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: nginx
          image: nginx:1.27
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"

This ensures that across all zones, the difference in web Pod count is at most 1. With 3 zones and 6 replicas, each zone gets exactly 2 Pods.

Key Fields Explained

| Field | Purpose |
|-------|---------|
| maxSkew | Maximum difference in Pod count between any two topology domains |
| topologyKey | Node label that defines topology domains (zone, hostname, region) |
| whenUnsatisfiable | DoNotSchedule (hard) or ScheduleAnyway (soft) |
| labelSelector | Selects which Pods to count for skew calculation |
| matchLabelKeys | (v1.27+) Uses Pod label values to scope the constraint per rollout |
| minDomains | (v1.25+) Minimum number of domains required before enforcing the constraint |
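As an illustration of minDomains — a sketch assuming a three-zone cluster; note that minDomains is only honored together with whenUnsatisfiable: DoNotSchedule:

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    minDomains: 3            # treat the spread as unsatisfied until 3 zones hold Pods
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
```

While fewer than minDomains domains have matching Pods, the global minimum is treated as zero, which pushes new Pods toward empty zones first.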

Multiple Constraints

You can apply multiple topology spread constraints simultaneously — for example, spread across both zones and nodes:

spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: web
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app: web

The first constraint is hard (zone-level balance is mandatory), while the second is soft (node-level balance is best-effort).

Cluster-Level Defaults

You can configure default topology spread constraints for the whole cluster via the kube-scheduler --config file (the kubescheduler.config.k8s.io/v1 API shown below is available from Kubernetes 1.25; older releases use the v1beta3 API):

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - pluginConfig:
      - name: PodTopologySpread
        args:
          defaultConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
          defaultingType: List

This applies to all Pods that do not define their own constraints, providing a safety net across the cluster.

matchLabelKeys for Rolling Updates

During a rolling update, old and new ReplicaSets co-exist. Without matchLabelKeys, the constraint counts both old and new Pods together, which can cause imbalance. Setting matchLabelKeys: ["pod-template-hash"] scopes counting to only the Pods from the same ReplicaSet:

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
    matchLabelKeys:
      - pod-template-hash

Debugging Topology Spread

When Pods are stuck in Pending, check scheduler events:

kubectl describe pod <pod-name> | grep -A 10 Events
# Look for: "didn't match pod topology spread constraints"

# Check current distribution
kubectl get pods -l app=web -o wide --sort-by='.spec.nodeName'

# Map nodes to zones to interpret the spread
kubectl get nodes -L topology.kubernetes.io/zone

When to Use Topology Spread vs. Anti-Affinity

| Scenario | Best Tool |
|----------|-----------|
| No two replicas on the same node | Pod anti-affinity |
| Even distribution across zones | Topology spread constraints |
| Co-locate with specific Pods | Pod affinity |
| Combination of balance and separation | Both constraints together |

Why Interviewers Ask This

This question tests your ability to design highly available workloads that remain balanced across failure domains, which is critical for production systems.

Common Follow-Up Questions

What does maxSkew control?
maxSkew defines the maximum allowed difference in Pod count between any two topology domains. A maxSkew of 1 means domains can differ by at most one Pod.
What happens when whenUnsatisfiable is set to DoNotSchedule vs ScheduleAnyway?
DoNotSchedule prevents placement if the constraint cannot be met. ScheduleAnyway places the Pod but the scheduler still tries to minimize skew.
Can you combine topology spread constraints with node affinity?
Yes — node affinity narrows the candidate nodes first, then topology spread distributes Pods evenly within that filtered set.
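A minimal sketch of that combination (the instance-type key and value here are hypothetical examples): node affinity filters the candidate nodes, then the spread constraint balances replicas within that filtered set.

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values: ["m5.large"]   # hypothetical instance type
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: web
```

From v1.25 the constraint also has a nodeAffinityPolicy field; its default (Honor) means nodes excluded by the affinity are not counted when computing skew.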

Key Takeaways

  • Topology spread constraints provide more granular distribution control than pod anti-affinity.
  • The maxSkew parameter defines the maximum imbalance allowed between topology domains.
  • Combine topology spread with node affinity and pod anti-affinity for sophisticated scheduling strategies.
