Kubernetes Taint Not Tolerated
Causes and Fixes
A 'taint not tolerated' scheduling failure occurs when all nodes in the cluster have taints that the pod does not tolerate. Taints are applied to nodes to repel pods that lack corresponding tolerations. Without a matching toleration, the scheduler excludes the node, and if all nodes are tainted, the pod remains in Pending state.
Symptoms
- Pod stuck in Pending state
- FailedScheduling event shows 'node(s) had taint ... that the pod didn't tolerate'
- Pod cannot schedule even though nodes have available resources
- Newly added nodes have unexpected taints preventing scheduling
- Pods evicted after a node gains a condition-based taint
Common Causes
- Control plane nodes tainted with node-role.kubernetes.io/control-plane:NoSchedule
- Automatic condition-based taints (not-ready, unreachable, memory-pressure, disk-pressure)
- Dedicated node pools tainted for specific workloads (e.g., dedicated=gpu:NoSchedule)
- Cordoned nodes carrying node.kubernetes.io/unschedulable:NoSchedule
- Pod templates missing the tolerations required by the target node pool
Step-by-Step Troubleshooting
Taint-based scheduling failures are straightforward once you understand which taints exist and which tolerations the pod needs. This guide walks through identifying the taint mismatch and resolving it.
1. Identify the Taint From the Error Message
The FailedScheduling event tells you exactly which taint is blocking.
kubectl describe pod <pod-name>
The event message will include something like:
0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate.
or
0/5 nodes are available: 2 node(s) had taint {dedicated: gpu}, that the pod didn't tolerate, 3 Insufficient cpu.
Note the taint key, value, and how many nodes it affects.
2. List All Node Taints
Get a comprehensive view of all taints across the cluster.
# List all taints on all nodes
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# More readable format
kubectl describe nodes | grep -E "^Name:|^Taints:"
# Check a specific node
kubectl describe node <node-name> | grep -A5 "Taints:"
3. Check the Pod's Current Tolerations
See what tolerations the pod already has.
kubectl get pod <pod-name> -o jsonpath='{.spec.tolerations}' | jq .
# If the pod is from a Deployment, check the template
kubectl get deployment <deployment-name> -o jsonpath='{.spec.template.spec.tolerations}' | jq .
Kubernetes automatically adds certain tolerations to every pod:
- node.kubernetes.io/not-ready:NoExecute (tolerationSeconds: 300)
- node.kubernetes.io/unreachable:NoExecute (tolerationSeconds: 300)
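In a pod spec, these automatically added tolerations look roughly like the following (a sketch showing the default 300-second grace period before eviction):

```yaml
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300
```

This is why a pod survives for about five minutes on a node that goes NotReady before being evicted.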
4. Add the Required Toleration
Once you know the taint key, value, and effect, add the matching toleration to the pod spec.
# For a Deployment
kubectl patch deployment <deployment-name> -p '{
"spec": {
"template": {
"spec": {
"tolerations": [
{
"key": "<taint-key>",
"operator": "Equal",
"value": "<taint-value>",
"effect": "NoSchedule"
}
]
}
}
}
}'
Toleration operators:
- Equal: The key and value must match exactly. Requires both the key and value fields.
- Exists: Only the key must match. The value field is omitted.
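For example, a toleration matching the dedicated: gpu taint seen in step 1 would use the Equal operator like this:

```yaml
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"
```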
# Tolerate any value for a key
kubectl patch deployment <deployment-name> -p '{
"spec": {
"template": {
"spec": {
"tolerations": [
{
"key": "<taint-key>",
"operator": "Exists",
"effect": "NoSchedule"
}
]
}
}
}
}'
# Tolerate all taints (use with extreme caution)
kubectl patch deployment <deployment-name> -p '{
"spec": {
"template": {
"spec": {
"tolerations": [
{
"operator": "Exists"
}
]
}
}
}
}'
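Patching works for quick fixes, but for a durable change the toleration belongs in the Deployment manifest itself. A minimal sketch (the name, labels, and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app              # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      tolerations:
      - key: "dedicated"    # replace with the actual taint key
        operator: "Equal"
        value: "gpu"        # replace with the actual taint value
        effect: "NoSchedule"
      containers:
      - name: app
        image: my-app:latest   # placeholder image
```

Apply with kubectl apply -f so the toleration survives future rollouts instead of living only in a one-off patch.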
5. Remove the Taint (If Inappropriate)
If the taint should not be on the node, remove it.
# Remove a specific taint (note the trailing dash)
kubectl taint node <node-name> <key>=<value>:<effect>-
# Examples:
kubectl taint node worker-1 dedicated=gpu:NoSchedule-
kubectl taint node control-plane-1 node-role.kubernetes.io/control-plane:NoSchedule-
# Verify the taint was removed
kubectl describe node <node-name> | grep Taints
6. Handle Automatic Condition Taints
The control plane's node lifecycle controller automatically taints nodes based on conditions reported by the kubelet.
# Common automatic taints:
# node.kubernetes.io/not-ready:NoExecute
# node.kubernetes.io/unreachable:NoExecute
# node.kubernetes.io/memory-pressure:NoSchedule
# node.kubernetes.io/disk-pressure:NoSchedule
# node.kubernetes.io/pid-pressure:NoSchedule
# node.kubernetes.io/network-unavailable:NoSchedule
# node.kubernetes.io/unschedulable:NoSchedule (from kubectl cordon)
# Check node conditions to understand why auto-taints were applied
kubectl describe node <node-name> | grep -A20 "Conditions:"
These taints are managed automatically by the control plane and are removed when the underlying condition clears. Do not add tolerations for them unless you specifically want pods to run on unhealthy nodes.
7. Handle Control Plane Taints
By default, control plane nodes are tainted to prevent workload scheduling.
# Check control plane taints
kubectl describe node <control-plane-node> | grep Taints
# Typically: node-role.kubernetes.io/control-plane:NoSchedule
For single-node clusters or development environments where you want to schedule on the control plane:
# Remove the control plane taint
kubectl taint node <control-plane-node> node-role.kubernetes.io/control-plane:NoSchedule-
For production, add the toleration only to specific system pods that need to run on control plane nodes, not to general workloads.
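For reference, system pods that must run on control plane nodes typically carry a toleration like this (a sketch; Exists matches the taint regardless of value):

```yaml
tolerations:
- key: "node-role.kubernetes.io/control-plane"
  operator: "Exists"
  effect: "NoSchedule"
```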
8. Use Taints with Node Affinity Together
For proper node dedication, use taints (to repel) together with node affinity (to attract).
spec:
tolerations:
- key: "dedicated"
operator: "Equal"
value: "gpu"
effect: "NoSchedule"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-type
operator: In
values:
- gpu
The taint ensures non-GPU workloads do not run on GPU nodes, and the affinity ensures GPU workloads do run on GPU nodes.
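The node side of this setup needs both the taint and a matching label; assuming a node named gpu-node-1 (a placeholder), the commands would look like:

```
# Taint the node so pods without the toleration are repelled
kubectl taint node gpu-node-1 dedicated=gpu:NoSchedule

# Label the node so the nodeAffinity rule above can match it
kubectl label node gpu-node-1 node-type=gpu

# Verify both are in place
kubectl describe node gpu-node-1 | grep -E "Taints:|node-type"
```

Without the label, GPU pods could still land on non-GPU nodes; without the taint, non-GPU pods could still land on GPU nodes.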
9. Handle NoExecute Taints
NoExecute taints evict running pods in addition to preventing scheduling.
# Check for NoExecute taints
kubectl get nodes -o json | jq '.items[] | {name: .metadata.name, taints: [(.spec.taints // [])[] | select(.effect=="NoExecute")]}'
If pods are being evicted due to NoExecute taints, add tolerations with optional tolerationSeconds to control how long the pod can remain before eviction.
tolerations:
- key: "node.kubernetes.io/unreachable"
operator: "Exists"
effect: "NoExecute"
tolerationSeconds: 60
10. Verify Scheduling Works
After adding tolerations or removing taints, confirm the pod schedules.
# Watch the pod status
kubectl get pod <pod-name> -w
# Verify which node it was scheduled to
kubectl get pod <pod-name> -o wide
# Confirm no more FailedScheduling events
kubectl describe pod <pod-name> | tail -10
The pod should transition from Pending to Running. If it still fails to schedule, check the updated FailedScheduling message — there may be additional constraints (resources, affinity, nodeSelector) preventing scheduling beyond the taint issue.
How to Explain This in an Interview
I would explain the taint and toleration mechanism: taints are applied to nodes as key=value:effect triples, and tolerations are specified in the pod spec. The three effects are NoSchedule (hard constraint — pods without toleration are not scheduled), PreferNoSchedule (soft constraint — scheduler tries to avoid but may schedule), and NoExecute (pods without toleration are evicted from the node). I'd discuss how the kubelet auto-taints nodes with conditions, how to use taints for node dedication and workload isolation, and the operator patterns for Equal and Exists matching. I'd emphasize that taints work as a repelling mechanism while node affinity is an attracting mechanism, and they should be used together for proper node dedication.
Prevention
- Document all node taints and their purposes
- Include tolerations in deployment templates for targeted node pools
- Use admission webhooks to automatically add tolerations based on namespace or labels
- Monitor node taint changes with alerts
- Test scheduling in staging with the same taint configuration as production