Kubernetes ProgressDeadlineExceeded
Causes and Fixes
ProgressDeadlineExceeded occurs when a Deployment fails to make progress within the specified progressDeadlineSeconds (default 600 seconds). The Deployment controller marks the rollout as failed, and the Deployment condition Progressing is set to False with this reason.
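In the Deployment's status, the failed condition looks like the following sketch (the ReplicaSet name is illustrative; the field names type, status, reason, and message are the standard Kubernetes condition fields):

```yaml
status:
  conditions:
  - type: Progressing
    status: "False"
    reason: ProgressDeadlineExceeded
    message: ReplicaSet "myapp-7c6f8d9b4" has timed out progressing.
```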
Symptoms
- Deployment status shows ProgressDeadlineExceeded in kubectl describe deployment
- kubectl rollout status reports 'error: deployment exceeded its progress deadline'
- New ReplicaSet has fewer ready replicas than desired
- Old ReplicaSet still has pods running because the new one cannot fully roll out
- Deployment condition Progressing is False
Common Causes
- New pods failing readiness probes, so they never become Ready
- Image pull failures (wrong tag, missing registry credentials)
- The application crashing on startup (CrashLoopBackOff)
- Insufficient cluster resources or an exhausted resource quota, leaving pods Pending
- A progressDeadlineSeconds value too short for the application's startup time
Step-by-Step Troubleshooting
When a Kubernetes Deployment reports ProgressDeadlineExceeded, the rollout has stalled. The Deployment controller could not bring the new ReplicaSet's pods to a ready state within the allowed time. This guide walks through finding the root cause and recovering from the failed rollout.
1. Confirm the Deployment Status
Start by checking the Deployment's current state and conditions.
kubectl get deployment <deployment-name> -o wide
kubectl describe deployment <deployment-name>
In the output of kubectl describe, look at the Conditions section. You will see a condition like:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing False ProgressDeadlineExceeded
Also check the rollout status directly.
kubectl rollout status deployment/<deployment-name>
This will report whether the rollout is complete, in progress, or has failed with the deadline exceeded error.
2. Inspect the ReplicaSets
The Deployment manages ReplicaSets. Identify which ReplicaSet is new (the one trying to roll out) and which is old.
kubectl get replicaset -l app=<app-label> -o wide
# See the revision history
kubectl rollout history deployment/<deployment-name>
The new ReplicaSet will have fewer ready replicas than desired. Note its name for further investigation.
kubectl describe replicaset <new-replicaset-name>
Check the Events section of the ReplicaSet for errors like FailedCreate, which may indicate quota issues or other creation failures.
3. Check the Pods in the New ReplicaSet
The most common cause is that new pods are not becoming ready. List the pods belonging to the new ReplicaSet.
kubectl get pods -l app=<app-label> --sort-by=.metadata.creationTimestamp
# Or directly using the ReplicaSet's selector
kubectl get pods -l pod-template-hash=<hash-from-replicaset>
Check the status of these pods. Common stuck states include:
- Pending: Resource issues or scheduling failures
- ContainerCreating: Image pull or volume mount issues
- CrashLoopBackOff: Application is crashing
- Running but not Ready: Readiness probe is failing
Describe a problematic pod for detailed events.
kubectl describe pod <pod-name>
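For a crashing container, the pod's status (visible with kubectl get pod <pod-name> -o yaml) typically contains a waiting state like this sketch (container name and restart count are illustrative):

```yaml
status:
  containerStatuses:
  - name: app            # illustrative container name
    ready: false
    restartCount: 5
    state:
      waiting:
        reason: CrashLoopBackOff
```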
4. Check Pod Logs
If pods are starting but crashing or not passing readiness probes, the logs will reveal why.
# Current logs
kubectl logs <pod-name>
# Previous container logs if the pod has restarted
kubectl logs <pod-name> --previous
# Follow logs in real time
kubectl logs <pod-name> -f
Look for application startup errors, connection failures to dependencies, configuration problems, or health check endpoint failures.
5. Check Resource Availability
If pods are stuck in Pending, the issue may be insufficient cluster resources.
# Check node resource allocation
kubectl describe nodes | grep -A5 "Allocated resources"
# Check if there are any resource quotas in the namespace
kubectl get resourcequota -n <namespace>
kubectl describe resourcequota -n <namespace>
# Check the scheduler events for the pending pods
kubectl get events --sort-by=.lastTimestamp --field-selector reason=FailedScheduling
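If a ResourceQuota exists in the namespace, an exhausted quota blocks pod creation and surfaces as FailedCreate events on the new ReplicaSet. A quota looks roughly like this (name and limits are hypothetical examples):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota       # hypothetical name
spec:
  hard:
    pods: "20"
    requests.cpu: "8"
    requests.memory: 16Gi
```

During a rolling update the old and new ReplicaSets briefly run pods side by side, so a quota sized exactly to steady state can stall the rollout.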
6. Roll Back the Deployment
If you need to restore service immediately while investigating, roll back to the previous working version.
# Roll back to the previous revision
kubectl rollout undo deployment/<deployment-name>
# Roll back to a specific revision
kubectl rollout history deployment/<deployment-name>
kubectl rollout undo deployment/<deployment-name> --to-revision=<revision-number>
# Verify the rollback succeeded
kubectl rollout status deployment/<deployment-name>
Rolling back does not fix the underlying issue — it restores the old configuration while you investigate what is wrong with the new one.
7. Adjust progressDeadlineSeconds If Needed
If your application legitimately takes a long time to start (for example, a Java application with a large heap that needs to warm up), you may need to increase the deadline.
# Check current setting
kubectl get deployment <deployment-name> -o jsonpath='{.spec.progressDeadlineSeconds}'
# Update the deadline (example: 15 minutes)
kubectl patch deployment <deployment-name> -p '{"spec":{"progressDeadlineSeconds":900}}'
The default is 600 seconds (10 minutes). Set this to a value that gives your application enough time to start and pass readiness probes, with a reasonable buffer.
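If you manage the Deployment declaratively, set the field in the manifest instead of patching. A minimal sketch (the name, labels, and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                      # placeholder name
spec:
  progressDeadlineSeconds: 900     # 15 minutes; the default is 600
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:1.2.3         # placeholder image
```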
8. Check Readiness Probes
If pods are Running but not Ready, the readiness probe configuration may be the issue.
# Check the probe configuration
kubectl get deployment <deployment-name> -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}' | jq .
# Test the probe endpoint manually from inside the pod
# (requires curl or wget to be present in the container image)
kubectl exec <pod-name> -- curl -s http://localhost:8080/healthz
kubectl exec <pod-name> -- wget -qO- http://localhost:8080/ready
Common readiness probe issues:
- Wrong port or path
- Probe starts checking before the application is initialized (use initialDelaySeconds or a startup probe)
- Timeout too short for the endpoint's response time
- The application's health endpoint depends on an unavailable external service
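A probe configuration that handles slow startup might look like the following sketch (ports, paths, and thresholds are illustrative and should be sized to your application):

```yaml
containers:
- name: app                        # illustrative container name
  image: myapp:1.2.3               # placeholder image
  # The startup probe allows up to 60 x 5s = 300s for initialization;
  # readiness checks only begin once it succeeds.
  startupProbe:
    httpGet:
      path: /healthz
      port: 8080
    failureThreshold: 60
    periodSeconds: 5
  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    periodSeconds: 10
    timeoutSeconds: 3
    failureThreshold: 3
```

Using a startup probe is generally preferable to a large initialDelaySeconds, because fast starts are not penalized with a fixed wait.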
9. Fix the Issue and Retry the Rollout
Once you have identified and fixed the root cause (updated the image, fixed configuration, freed resources, or adjusted probes), apply the updated Deployment.
# Apply the corrected deployment
kubectl apply -f deployment.yaml
# Or update a specific field
kubectl set image deployment/<deployment-name> <container-name>=<new-image>:<tag>
# Watch the rollout progress
kubectl rollout status deployment/<deployment-name> --watch
10. Verify the Deployment Is Healthy
After a successful rollout, confirm everything is running correctly.
# All replicas should be ready
kubectl get deployment <deployment-name>
# Check that the Progressing condition is True with reason NewReplicaSetAvailable
kubectl get deployment <deployment-name> -o jsonpath='{.status.conditions[?(@.type=="Progressing")]}'
# Verify pods are running and ready
kubectl get pods -l app=<app-label>
# Check there are no lingering old ReplicaSets with active pods
kubectl get replicaset -l app=<app-label>
A healthy Deployment shows the desired number of ready replicas and a Progressing condition of True with reason NewReplicaSetAvailable; only the current ReplicaSet has active pods, while old ReplicaSets are scaled to zero.
How to Explain This in an Interview
I would explain that ProgressDeadlineExceeded is a Deployment-level condition that acts as a safety net to detect stalled rollouts. The Deployment controller tracks whether new pods are becoming available and if no progress is made within progressDeadlineSeconds, the condition is set. I'd clarify that this does not automatically roll back the Deployment — it only marks the rollout as failed. I would walk through how to diagnose the underlying issue by examining the new ReplicaSet's pods, and discuss strategies like adjusting the deadline, using rollback commands, and implementing proper readiness probes.
Prevention
- Set progressDeadlineSeconds appropriately for your application's startup time
- Configure readiness probes that accurately reflect application health
- Ensure sufficient cluster capacity before deploying updates
- Use resource requests to guarantee scheduling
- Test deployments in staging environments first
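The resource-requests point above can be sketched as a container-level fragment (values are illustrative; size them to your workload's actual usage):

```yaml
containers:
- name: app                # illustrative container name
  image: myapp:1.2.3       # placeholder image
  resources:
    requests:
      cpu: 250m            # guarantees the scheduler reserves this much
      memory: 256Mi
    limits:
      memory: 512Mi
```

Setting requests lets the scheduler place pods only on nodes with capacity, turning a would-be runtime failure into an explicit Pending state you can detect before the deadline expires.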