Taints and Tolerations vs Node Affinity
Key Differences in Kubernetes
Taints repel Pods from nodes — a node with a taint rejects all Pods that do not tolerate it. Node affinity attracts Pods to nodes — a Pod with node affinity is scheduled on nodes that match its label selector. Taints work from the node's perspective (push), while affinity works from the Pod's perspective (pull). Use taints to reserve nodes for specific workloads; use affinity to guide Pods toward preferred nodes.
Side-by-Side Comparison
| Dimension | Taints and Tolerations | Node Affinity |
|---|---|---|
| Direction | Node pushes away Pods that lack the matching toleration | Pod pulls itself toward nodes with matching labels |
| Default Behavior | Rejects all Pods unless they have a toleration | Does nothing unless the Pod specifies affinity rules |
| Configured On | Taints on nodes, tolerations on Pods | Labels on nodes, affinity rules on Pods |
| Enforcement | NoSchedule (hard), PreferNoSchedule (soft), NoExecute (evict) | requiredDuringScheduling (hard), preferredDuringScheduling (soft) |
| Scope | Broad — affects all Pods that lack the toleration | Specific — only affects Pods that declare affinity |
| Use Case | Reserve nodes for GPU workloads, dedicated tenants, or system components | Place Pods on specific hardware (SSD, GPU), zones, or regions |
Detailed Breakdown
Taints and Tolerations
A taint is applied to a node:
```shell
# Add a taint to a node
kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule
```
This means: no Pod can be scheduled on gpu-node-1 unless it tolerates dedicated=gpu.
A toleration is added to a Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  containers:
  - name: training
    image: ml-trainer:1.0.0
    resources:
      limits:
        nvidia.com/gpu: 1
```
This Pod tolerates the dedicated=gpu:NoSchedule taint and can be scheduled on gpu-node-1. But note: the toleration only allows scheduling on tainted nodes — it does not force the Pod onto those nodes. Without node affinity, this Pod could still land on a non-GPU node.
Taint Effects
```shell
# NoSchedule — new Pods rejected, existing Pods stay
kubectl taint nodes node-1 maintenance=true:NoSchedule

# PreferNoSchedule — soft preference to avoid this node
kubectl taint nodes node-1 preference=spot:PreferNoSchedule

# NoExecute — new Pods rejected AND existing Pods evicted
kubectl taint nodes node-1 maintenance=true:NoExecute
```
NoExecute is powerful for node maintenance. When applied, all Pods without a matching toleration are evicted immediately. Pods with a toleration can optionally set tolerationSeconds to delay eviction:
```yaml
tolerations:
- key: "maintenance"
  operator: "Equal"
  value: "true"
  effect: "NoExecute"
  tolerationSeconds: 300  # Stay for 5 minutes, then evict
```
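Once maintenance is done, the taint can be removed by repeating the taint command with a trailing `-`:

```shell
# Remove the taint (note the trailing dash)
kubectl taint nodes node-1 maintenance=true:NoExecute-
```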
Node Affinity
Node affinity is declared in the Pod spec and uses node labels for matching:
```shell
# Label a node
kubectl label nodes node-1 disktype=ssd zone=us-east-1a
```
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: database
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-east-1a
  containers:
  - name: postgres
    image: postgres:16
```
This Pod:
- Must be scheduled on a node with `disktype=ssd` (hard requirement)
- Prefers nodes in `zone=us-east-1a` with weight 80 (soft preference)
If no SSD nodes are available, the Pod stays in Pending. If SSD nodes exist but none are in us-east-1a, the Pod is still scheduled on an SSD node in another zone.
nodeSelector — The Simple Alternative
For simple cases, nodeSelector provides a straightforward hard requirement:
```yaml
spec:
  nodeSelector:
    disktype: ssd
```
This is equivalent to a requiredDuringSchedulingIgnoredDuringExecution node affinity but with simpler syntax. Node affinity adds operators (In, NotIn, Exists, DoesNotExist, Gt, Lt) and soft preferences that nodeSelector cannot express.
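As an illustration of what those extra operators buy you, here is a sketch of a rule that nodeSelector cannot express — it assumes hypothetical node labels `node-class` and `cpu-count`:

```yaml
# Sketch only: node-class and cpu-count are hypothetical labels
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-class
          operator: NotIn      # exclude certain node classes
          values:
          - burstable
        - key: cpu-count
          operator: Gt         # label value compared as an integer
          values:
          - "8"
```

Multiple expressions inside one `matchExpressions` list are ANDed; multiple entries under `nodeSelectorTerms` are ORed.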
Using Taints and Affinity Together
The best practice for dedicated nodes uses both mechanisms:
```shell
# Step 1: Taint GPU nodes to repel non-GPU workloads
kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule
kubectl taint nodes gpu-node-2 dedicated=gpu:NoSchedule

# Step 2: Label GPU nodes for affinity targeting
kubectl label nodes gpu-node-1 hardware=gpu
kubectl label nodes gpu-node-2 hardware=gpu
```
Step 3: give the GPU workload both a toleration and node affinity:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-training
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-training
  template:
    metadata:
      labels:
        app: ml-training
    spec:
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "gpu"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: hardware
                operator: In
                values:
                - gpu
      containers:
      - name: trainer
        image: ml-trainer:1.0.0
        resources:
          limits:
            nvidia.com/gpu: 1
```
The taint prevents non-GPU workloads from using GPU nodes (push). The affinity ensures GPU workloads are placed on GPU nodes (pull). Together, they create a dedicated GPU node pool.
Pod Anti-Affinity
While node affinity places Pods on specific nodes, pod anti-affinity keeps Pods away from each other:
```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web
        topologyKey: kubernetes.io/hostname
```
This ensures no two Pods with app=web are placed on the same node — useful for high availability.
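A softer variant (sketched here) uses `preferredDuringSchedulingIgnoredDuringExecution` with a weight: the scheduler tries to spread the replicas, but will still co-locate them if no other node is available rather than leave them Pending:

```yaml
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:          # preferred terms wrap the selector in podAffinityTerm
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
```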
System Taints
Kubernetes automatically applies taints to nodes in certain conditions:
- `node.kubernetes.io/not-ready` — node is not ready
- `node.kubernetes.io/unreachable` — node is unreachable from the node controller
- `node.kubernetes.io/memory-pressure` — node has memory pressure
- `node.kubernetes.io/disk-pressure` — node has disk pressure
- `node.kubernetes.io/unschedulable` — node is cordoned
DaemonSets automatically add tolerations for these system taints, which is why system Pods (kube-proxy, CNI) continue running on problematic nodes.
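The tolerations the DaemonSet controller injects look roughly like this (abbreviated sketch; the real set also covers memory pressure and, for host-network Pods, network unavailability):

```yaml
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists      # tolerate regardless of value
  effect: NoExecute
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
- key: node.kubernetes.io/disk-pressure
  operator: Exists
  effect: NoSchedule
- key: node.kubernetes.io/unschedulable
  operator: Exists
  effect: NoSchedule
```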
Topology Spread Constraints
For more advanced scheduling, topology spread constraints combine aspects of both affinity and anti-affinity:
```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
```
This ensures Pods are evenly distributed across availability zones, with at most 1 Pod difference between zones. This is more flexible than pod anti-affinity for multi-zone high availability.
Decision Guide
- "Keep non-X workloads off these nodes" → Use taints
- "Put this workload on nodes with label X" → Use node affinity
- "Dedicate nodes exclusively to X workloads" → Use both taints and affinity
- "Spread Pods across zones/nodes" → Use pod anti-affinity or topology spread constraints
- "Prefer this node type but don't require it" → Use `preferredDuringScheduling` affinity or a `PreferNoSchedule` taint
Use Taints and Tolerations when...
- You want to dedicate nodes to specific workloads (GPU, high-memory)
- You need to keep regular workloads off control-plane nodes
- You want to evict Pods from a node being drained for maintenance
- You need to isolate tenants on dedicated node pools
- You want to gradually drain Pods from a node (NoExecute with tolerationSeconds)
Use Node Affinity when...
- You want Pods to land on nodes with specific hardware or labels
- You need to spread workloads across availability zones
- You want to prefer certain nodes without hard requirements
- You need topology-aware scheduling for data locality
- You want to co-locate workloads on the same node type
Model Interview Answer
“Taints and node affinity both influence Pod scheduling but from opposite directions. Taints are set on nodes and repel Pods — a tainted node rejects any Pod that does not have a matching toleration. This is used to reserve nodes, like keeping GPU nodes for ML workloads only. Node affinity is set on Pods and attracts them to nodes — a Pod with node affinity is scheduled on nodes whose labels match its selector. A taint says 'stay away unless you tolerate me,' while affinity says 'I want to run on a node like this.' They are often used together: taint GPU nodes to prevent general workloads, and add node affinity to ML Pods to target those GPU nodes.”