Kubernetes Pod Pending

Causes and Fixes

A pod in the Pending state has been accepted by the cluster but is not yet running. This usually means the scheduler cannot find a suitable node due to insufficient resources, unsatisfied constraints, or missing dependencies like PersistentVolumeClaims.

Symptoms

  • Pod status shows Pending in kubectl get pods output
  • Pod has been in Pending state for an extended period
  • kubectl describe pod shows scheduling-related events or no events at all
  • Events may show 'Insufficient cpu', 'Insufficient memory', or 'no nodes available'
  • FailedScheduling events in the pod's event list
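A quick way to surface every stuck pod at once is a field selector on the pod phase (a small sketch; drop --all-namespaces to scope to one namespace):

```shell
# List all Pending pods across all namespaces
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Include node assignment and age for triage
kubectl get pods --all-namespaces --field-selector=status.phase=Pending -o wide
```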

Common Causes

1. Insufficient cluster resources: No node has enough allocatable CPU or memory to satisfy the pod's resource requests. Scale up the cluster or reduce requests.
2. Node selector or affinity mismatch: The pod's nodeSelector, nodeAffinity, or podAffinity rules cannot be satisfied by any available node. Check labels on nodes.
3. Taints and tolerations: All nodes have taints that the pod does not tolerate. Add tolerations to the pod or remove taints from nodes.
4. Unbound PersistentVolumeClaim: The pod references a PVC that has not been bound to a PersistentVolume. Check PVC status and the storage class provisioner.
5. ResourceQuota exceeded: The namespace has a ResourceQuota that has been exhausted. Check quota usage with kubectl describe resourcequota.
6. Scheduler not running: The kube-scheduler component is down or misconfigured. Check scheduler pods in the kube-system namespace.

Step-by-Step Troubleshooting

1. Check Pod Events

The scheduler writes events explaining why a pod cannot be scheduled.

kubectl describe pod <pod-name> -n <namespace>

Look at the Events section for messages like:

0/5 nodes are available: 2 Insufficient cpu, 3 Insufficient memory.
0/5 nodes are available: 5 node(s) didn't match Pod's node affinity/selector.
0/5 nodes are available: 5 node(s) had taint {dedicated: special}, that the pod didn't tolerate.
persistentvolumeclaim "my-pvc" not found

If there are no events at all, the scheduler may not be running.
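If the describe output is noisy, the same scheduling events can be pulled directly from the events API (a sketch; substitute your pod name and namespace):

```shell
# Show events for one pod, oldest first
kubectl get events -n <namespace> \
  --field-selector involvedObject.name=<pod-name> \
  --sort-by=.lastTimestamp
```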

2. Check Cluster Resource Availability

See how much capacity is available across nodes.

# Check resource usage per node
kubectl top nodes

# Check allocatable vs allocated resources
kubectl describe nodes | grep -A5 "Allocated resources"

# Detailed view for a specific node
kubectl describe node <node-name>

Compare the pod's resource requests against available allocatable resources:

# Check the pending pod's resource requests
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].resources.requests}'
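A compact way to line up allocatable capacity across all nodes is a custom-columns view (a sketch; the column paths assume standard node status fields):

```shell
# Allocatable CPU and memory per node, side by side
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
```

Note that allocatable is total capacity minus system reservations; subtract the requests already placed on each node (from kubectl describe node) to see true headroom.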

3. Check Node Selectors and Affinity Rules

If the pod has scheduling constraints, verify nodes match them.

# Check pod's node selector
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeSelector}'

# Check pod's node affinity
kubectl get pod <pod-name> -o jsonpath='{.spec.affinity}'

# List nodes with their labels
kubectl get nodes --show-labels

# Check if any nodes match a specific label
kubectl get nodes -l <label-key>=<label-value>

If no nodes have the required labels, add them:

kubectl label node <node-name> <label-key>=<label-value>

4. Check Taints and Tolerations

Nodes may have taints that prevent scheduling.

# List all node taints
kubectl get nodes -o json | jq -r '.items[] | "\(.metadata.name): \(.spec.taints // [] | map(.key + "=" + (.value // "") + ":" + .effect) | join(", "))"'

# Check the pod's tolerations
kubectl get pod <pod-name> -o jsonpath='{.spec.tolerations}' | jq .

If the pod needs to run on a tainted node, add a toleration:

spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "special"
      effect: "NoSchedule"

5. Check PersistentVolumeClaims

If the pod mounts a PVC, verify it is bound.

# Check PVC status
kubectl get pvc -n <namespace>

# Describe the PVC for details
kubectl describe pvc <pvc-name> -n <namespace>

Common PVC issues:

  • Pending PVC: The storage class provisioner may be broken or no matching PV exists
  • Storage class does not exist: Check kubectl get storageclass
  • Zone mismatch: The PV is in a different availability zone than the node

# Check storage class provisioner
kubectl get storageclass
kubectl describe storageclass <class-name>

# Check if the provisioner pods are running
kubectl get pods -n kube-system | grep -i provisioner
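Provisioning failures usually surface as events on the PVC itself; one way to pull them all at once (a sketch, assuming your namespace):

```shell
# Events emitted for PVCs in the namespace (provisioning errors show here)
kubectl get events -n <namespace> \
  --field-selector involvedObject.kind=PersistentVolumeClaim \
  --sort-by=.lastTimestamp
```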

6. Check ResourceQuotas

Namespace quotas may prevent the pod from being created.

# Check quota in the namespace
kubectl describe resourcequota -n <namespace>

# Check LimitRanges
kubectl describe limitrange -n <namespace>

If the quota is exhausted:

Name:       compute-quota
Resource    Used    Hard
--------    ----    ----
cpu         4       4
memory      8Gi     8Gi

Options: increase the quota, reduce requests on existing pods, or deploy to a different namespace.
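Raising the quota means editing the hard limits in the ResourceQuota spec. As an illustrative sketch (the name compute-quota matches the example above; the new values are arbitrary):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    cpu: "8"      # raised from 4
    memory: 16Gi  # raised from 8Gi
```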

7. Verify the Scheduler is Running

If the pod has no scheduling events at all, check the scheduler.

# Check scheduler pods
kubectl get pods -n kube-system -l component=kube-scheduler

# Check scheduler logs
kubectl logs -n kube-system -l component=kube-scheduler --tail=50
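On clusters using lease-based leader election, another liveness signal is the scheduler's Lease object; a stale renewTime suggests no scheduler instance is active (a sketch, assuming the default lease name):

```shell
# The active kube-scheduler renews this lease while it holds leadership
kubectl get lease kube-scheduler -n kube-system -o yaml | grep renewTime
```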

8. Check for Pod Topology Spread Constraints

If the pod uses topology spread constraints, ensure they can be satisfied.

kubectl get pod <pod-name> -o jsonpath='{.spec.topologySpreadConstraints}' | jq .

If whenUnsatisfiable: DoNotSchedule is set and the constraint cannot be met, the pod stays Pending. Consider ScheduleAnyway if a temporary imbalance during scaling events is acceptable.
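As an illustrative sketch, a constraint that prefers even spread across zones but never blocks scheduling might look like this (the app: my-app selector is hypothetical):

```yaml
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway   # skew is tolerated rather than blocking
      labelSelector:
        matchLabels:
          app: my-app
```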

9. Apply the Fix

Based on your findings:

# Scale up the cluster (cloud-specific)
# Add more nodes to satisfy resource requests

# Remove a taint
kubectl taint node <node-name> <key>:<effect>-

# Fix node selector
kubectl patch deployment <deploy-name> --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/nodeSelector", "value": {"zone": "us-east-1a"}}]'

# Increase resource quota
kubectl edit resourcequota <quota-name> -n <namespace>

10. Verify Resolution

# Watch the pod start
kubectl get pod <pod-name> -w

# Verify it is running on the expected node
kubectl get pod <pod-name> -o wide

The pod should transition from Pending to ContainerCreating to Running.
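In scripts or CI, kubectl wait can replace manual watching by blocking until the pod is healthy (a sketch; adjust the timeout to your rollout expectations):

```shell
# Exit 0 once the pod reports Ready, or fail after 2 minutes
kubectl wait --for=condition=Ready pod/<pod-name> -n <namespace> --timeout=120s
```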

How to Explain This in an Interview

I would explain that Pending is a normal transitional state, but if a pod stays Pending, it means the scheduler cannot place it. I would walk through the diagnostic steps: first check events with kubectl describe pod, then look at resource availability with kubectl describe nodes, and examine any scheduling constraints such as nodeSelector, affinity rules, taints, and PVCs. I would also mention that the scheduler makes placement decisions based on resource requests, not limits.

Prevention

  • Use Cluster Autoscaler to add nodes when resources are exhausted
  • Set realistic resource requests based on actual usage
  • Monitor namespace ResourceQuota usage and alert before exhaustion
  • Test scheduling constraints in staging before production
  • Use Pod Disruption Budgets to maintain capacity during node maintenance

Related Errors