How Do Pod Affinity and Anti-Affinity Work?
Pod affinity schedules Pods near other Pods that match a label selector, while pod anti-affinity ensures Pods are spread apart. Both operate within a topology domain (node, zone, rack) and support required (hard) and preferred (soft) rules. Anti-affinity is commonly used to spread replicas across failure domains.
Detailed Answer
Pod Affinity: Co-locating Pods
Pod affinity attracts a Pod to nodes that already run Pods matching a specific label selector, within a defined topology domain. This is useful for placing related services close together to reduce network latency.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: web-app:latest
```
This ensures every web-frontend Pod runs on a node that also has a Pod labeled app=cache. The topologyKey: kubernetes.io/hostname scopes the affinity to the individual node level. Because the rule is required (hard), web-frontend Pods will stay Pending if no node runs a matching cache Pod.
Pod Anti-Affinity: Spreading Pods Apart
Pod anti-affinity ensures Pods are not co-located with other Pods matching a selector. This is the standard pattern for high availability.
Spread Replicas Across Nodes
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - redis
            topologyKey: kubernetes.io/hostname
      containers:
      - name: redis
        image: redis:7
```
This guarantees that no two Redis Pods run on the same node. If there are only 2 nodes and 3 replicas, the third replica stays Pending.
Spread Replicas Across Zones
```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - api-gateway
      topologyKey: topology.kubernetes.io/zone
```
This ensures each api-gateway replica is in a different availability zone, surviving a single zone failure.
Soft Anti-Affinity (Preferred)
When strict spreading is not possible (e.g., more replicas than zones), use preferred anti-affinity:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 10
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - worker
              topologyKey: kubernetes.io/hostname
      containers:
      - name: worker
        image: worker:latest
```
The scheduler will try to spread workers across nodes but will allow multiple Pods per node if necessary.
Understanding topologyKey
The topologyKey is a node label that defines the scope of the affinity/anti-affinity rule:
| topologyKey | Scope | Use Case |
|---|---|---|
| kubernetes.io/hostname | Per node | Spread across individual nodes |
| topology.kubernetes.io/zone | Per AZ | Survive AZ failure |
| topology.kubernetes.io/region | Per region | Survive regional failure |
| kubernetes.io/os | Per OS | Separate Linux/Windows |
| Custom label (e.g., rack) | Per rack | Spread across racks |
```bash
# View topology labels on nodes
kubectl get nodes -L topology.kubernetes.io/zone,kubernetes.io/hostname
```
Combining Affinity and Anti-Affinity
A common pattern: co-locate frontend with cache (affinity) while spreading frontend replicas across zones (anti-affinity):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - memcached
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - frontend
            topologyKey: topology.kubernetes.io/zone
      containers:
      - name: frontend
        image: frontend:v2
```
Namespace Considerations
By default, pod affinity/anti-affinity only considers Pods in the same namespace as the Pod being scheduled. To match Pods in other namespaces:
```yaml
podAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: app
        operator: In
        values:
        - database
    topologyKey: kubernetes.io/hostname
    namespaces:
    - database-namespace
    # Or use namespaceSelector to match by namespace labels:
    # namespaceSelector:
    #   matchLabels:
    #     team: backend
```
Performance Considerations
Pod affinity and anti-affinity rules require the scheduler to evaluate all existing Pods that match the label selector in the relevant namespaces. In clusters with thousands of Pods, this can significantly slow scheduling. Best practices:
- Keep label selectors narrow to reduce the number of Pods evaluated.
- Prefer preferredDuringSchedulingIgnoredDuringExecution over requiredDuringSchedulingIgnoredDuringExecution when possible.
- Consider using topology spread constraints instead of anti-affinity for even distribution, as they are more efficient for the scheduler.
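The topology spread constraints alternative can be sketched as follows. This is an illustrative Pod template fragment (the app: worker label and zone key are assumptions for the example, not taken from a specific Deployment above):

```yaml
spec:
  # Spreads matching Pods evenly across zones without the pairwise
  # Pod-to-Pod evaluation that affinity/anti-affinity rules require.
  topologySpreadConstraints:
  - maxSkew: 1                               # max allowed Pod-count difference between any two zones
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule         # hard rule; use ScheduleAnyway for a soft preference
    labelSelector:
      matchLabels:
        app: worker
```

Unlike anti-affinity, which only expresses "not on the same domain," maxSkew lets you tolerate more replicas than domains while still keeping the distribution balanced.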
Debugging Scheduling Failures
```bash
# Check why a Pod is not being scheduled
kubectl describe pod frontend-abc123 | grep -A 10 Events

# Common messages:
# "didn't match pod affinity rules"
# "didn't match pod anti-affinity rules"
# "node(s) didn't match pod topology spread constraints"
```
Why Interviewers Ask This
Interviewers test whether you can design highly available deployments that spread replicas across zones and co-locate related services for performance.
Key Takeaways
- Pod affinity co-locates related Pods; anti-affinity separates them
- The topologyKey defines the failure domain scope (node, zone, region)
- Required anti-affinity with zone topologyKey is the standard HA pattern