How Does Graceful Shutdown Work in Kubernetes?
Graceful shutdown in Kubernetes is the process of terminating a Pod without dropping in-flight requests. It involves PreStop hooks, SIGTERM signal handling, terminationGracePeriodSeconds, and coordinating with Service endpoint removal.
Detailed Answer
Graceful shutdown is the process of terminating a Pod in a way that allows it to finish processing in-flight work without dropping requests. Getting this right is essential for zero-downtime deployments, autoscaling events, and node maintenance.
The Termination Sequence
When Kubernetes decides to terminate a Pod (rolling update, scale-down, kubectl delete, node drain), the following steps occur:
1. Pod marked Terminating: the API server sets the Pod's metadata.deletionTimestamp, and kubectl shows the Pod as Terminating
2. Endpoint removal begins: the Endpoints controller removes the Pod from Service endpoints, and kube-proxy starts updating iptables/IPVS rules on all nodes
3. PreStop hook fires: runs inside the container (if defined)
4. SIGTERM sent: after the PreStop hook completes, kubelet sends SIGTERM to PID 1
5. Grace period countdown: the terminationGracePeriodSeconds timer started at step 1 keeps running, so time spent in the PreStop hook counts against it
6. SIGKILL sent: if the container is still running when the grace period expires
Steps 2 and 3 happen in parallel, which creates a critical race condition.
The Race Condition Problem
Timeline:
0s  Pod marked Terminating
    ├── kube-proxy starts removing endpoints (takes 1-10+ seconds)
    └── PreStop hook starts (if defined)
        ├── SIGTERM sent
        ├── App starts draining
        └── App exits
Problem: If the app exits before all kube-proxy instances update,
some nodes still route traffic to the dead Pod → 502 errors
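To make the race concrete, the timing can be sketched as simple arithmetic. This is a simplified model, not a Kubernetes API: the function name `misrouted_window` and its parameters are illustrative assumptions, and it assumes the slowest node finishes its endpoint update a fixed number of seconds after the Pod is marked Terminating.

```python
def misrouted_window(propagation_s: float, prestop_sleep_s: float, drain_s: float) -> float:
    """Seconds during which some node may still route to a Pod that has already exited.

    All times are measured from the moment the Pod is marked Terminating.
    """
    # The app receives SIGTERM after the PreStop hook, then drains and exits.
    pod_exit_at = prestop_sleep_s + drain_s
    # If endpoint removal finishes after the Pod is gone, the gap is the risky window.
    return max(0.0, float(propagation_s) - pod_exit_at)


# Hypothetical numbers: the slowest node takes 8s to update, the app exits 2s after SIGTERM.
without_sleep = misrouted_window(propagation_s=8, prestop_sleep_s=0, drain_s=2)
with_sleep = misrouted_window(propagation_s=8, prestop_sleep_s=10, drain_s=2)
```

With these assumed numbers, the window is 6 seconds without a PreStop sleep and zero with a 10-second sleep.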
The Solution: PreStop Sleep
A PreStop sleep gives kube-proxy enough time to propagate endpoint removal across all nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: myapi:2.0
          ports:
            - containerPort: 8080
          lifecycle:
            preStop:
              sleep:
                seconds: 10
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
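The 10-second sleep gives kube-proxy time to finish its updates before SIGTERM reaches the application. It is a mitigation rather than a guarantee: propagation can take longer in large or heavily loaded clusters, so tune the value for your environment. Because the PreStop hook runs inside the grace period, the total grace period (60s) must be large enough to cover the sleep plus the application's drain time. The sleep action is a relatively recent addition to the lifecycle hook API; on clusters that do not support it, the usual alternative is an exec hook that runs a sleep command available in the image.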
Application-Side SIGTERM Handling
Your application must handle SIGTERM properly. Here is the general pattern:
# Python example (server and db stand in for your own server and database objects)
import signal
import sys

def handle_sigterm(signum, frame):
    print("SIGTERM received, starting graceful shutdown")
    # 1. Stop accepting new connections
    server.stop_accepting()
    # 2. Wait for in-flight requests to complete
    server.drain(timeout=30)
    # 3. Close database connections
    db.close()
    # 4. Exit cleanly
    sys.exit(0)

# Register the handler so SIGTERM triggers the shutdown sequence
signal.signal(signal.SIGTERM, handle_sigterm)
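For a version of this pattern that runs as written, here is a minimal sketch using only the Python standard library. The names `Handler` and `serve_until_sigterm` are illustrative, and the sketch assumes a POSIX system and that the function is called from the main thread, since Python only allows signal handlers to be registered there. Instead of doing the work inside the handler, it sets a flag and lets the main thread perform the shutdown.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def serve_until_sigterm(host="127.0.0.1", port=8080):
    server = ThreadingHTTPServer((host, port), Handler)
    # Non-daemon request threads are tracked, so server_close() waits for them.
    server.daemon_threads = False

    stop = threading.Event()
    # The handler only sets a flag; the shutdown work happens in the main thread.
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())

    worker = threading.Thread(target=server.serve_forever)
    worker.start()

    stop.wait()            # block until SIGTERM arrives
    server.shutdown()      # 1. stop accepting new connections
    worker.join()
    server.server_close()  # 2. close the socket and wait for in-flight requests


# To run it: call serve_until_sigterm() and send SIGTERM with `kill <pid>`.
```

This sketch does not enforce its own drain timeout; it relies on the grace period and the eventual SIGKILL as the backstop. A production server would typically add a bounded wait and close database connections before returning.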
Common Frameworks and SIGTERM
| Framework | Default Behavior | Notes |
|-----------|-----------------|-------|
| Go net/http | Process exits on SIGTERM without draining | Catch the signal and call http.Server.Shutdown() |
| Node.js | Process exits immediately | Register process.on('SIGTERM', ...) |
| Spring Boot | Graceful shutdown available | Set server.shutdown=graceful |
| Nginx | SIGTERM is a fast shutdown; SIGQUIT drains | Use nginx -s quit (SIGQUIT) for a graceful stop |
Calculating terminationGracePeriodSeconds
terminationGracePeriodSeconds = preStop sleep
                              + max application drain time
                              + safety buffer
Example: 10s sleep + 30s drain + 5s buffer = 45s
Always set this value explicitly rather than relying on the 30-second default.
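The same calculation can be written as a small check, for example in a CI script that validates manifests. This is a sketch under assumptions: the function names `required_grace_period` and `grace_headroom` are hypothetical, and the 5-second default buffer simply mirrors the example above.

```python
def required_grace_period(prestop_sleep_s: int, max_drain_s: int, buffer_s: int = 5) -> int:
    """Minimum terminationGracePeriodSeconds: sleep + drain + safety buffer."""
    return prestop_sleep_s + max_drain_s + buffer_s


def grace_headroom(configured_s: int, prestop_sleep_s: int, max_drain_s: int, buffer_s: int = 5) -> int:
    """Seconds to spare; a negative value means SIGKILL may cut the drain short."""
    return configured_s - required_grace_period(prestop_sleep_s, max_drain_s, buffer_s)


# Values from this article: 10s sleep, 30s drain, 5s buffer, 60s configured.
minimum = required_grace_period(10, 30, 5)
headroom = grace_headroom(60, 10, 30, 5)
```

For the values used in this article, the minimum is 45 seconds, so the 60-second grace period in the Deployment leaves 15 seconds of headroom.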
PodDisruptionBudgets and Graceful Shutdown
For voluntary disruptions (node drain, cluster upgrades), PodDisruptionBudgets (PDBs) control how many Pods can be down simultaneously:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
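This ensures at least 2 replicas remain available during voluntary disruptions. With 3 replicas, only one Pod can be evicted at a time, so each Pod can shut down gracefully and be replaced before the next one is evicted. PDBs apply to evictions (for example, kubectl drain); they do not limit direct Pod deletion or a Deployment's rolling update.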
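The budget arithmetic for minAvailable can be sketched as follows. The function name `allowed_disruptions` is illustrative; it models only the integer form of minAvailable and ignores percentages and maxUnavailable.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of Pods that may be evicted right now without violating the budget."""
    return max(0, healthy_pods - min_available)


# 3 healthy replicas with minAvailable: 2 leave room for one eviction.
during_normal_operation = allowed_disruptions(healthy_pods=3, min_available=2)
# While one Pod is still being replaced, further evictions are refused.
during_replacement = allowed_disruptions(healthy_pods=2, min_available=2)
```

In this model, a drain evicts one Pod, then has to wait until its replacement is healthy before the next eviction is allowed.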
Debugging Shutdown Issues
# Watch termination in real time
kubectl delete pod api-xyz --grace-period=60 &
kubectl get pod api-xyz -w
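# Check if the app is handling SIGTERM (run in a second terminal while the Pod
# is Terminating; logs are no longer available once the Pod is deleted)
kubectl logs -f api-xyz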
# Test locally with Docker
docker stop --time 30 <container-id>
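The SIGTERM-then-SIGKILL behavior can also be reproduced without a cluster. The sketch below mimics it for a local process on a POSIX system; the helper names `terminate_with_grace` and `describe_exit` are illustrative and not part of any Kubernetes or Docker tooling.

```python
import signal
import subprocess
import time


def terminate_with_grace(proc: subprocess.Popen, grace_s: float):
    """Send SIGTERM, wait up to grace_s seconds, then SIGKILL.

    Returns (exit code, seconds until the process was gone).
    """
    start = time.monotonic()
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL: the process did not exit within the grace period
        proc.wait()
    return proc.returncode, time.monotonic() - start


def describe_exit(returncode: int) -> str:
    """Translate a Popen return code into a readable outcome."""
    if returncode < 0:
        # On POSIX, a negative code means the process was ended by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"
```

A process that handles SIGTERM and exits on its own reports a normal exit code, while one that ignores the signal is reported as killed by SIGKILL once the grace period runs out. Start your application with `subprocess.Popen([...])`, pass the handle to `terminate_with_grace`, and compare the elapsed time with your expected drain time.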
Checklist for Production-Ready Graceful Shutdown
- Application handles SIGTERM and drains in-flight requests
- PreStop hook with 5-15 second sleep to handle endpoint propagation race
- terminationGracePeriodSeconds set to cover PreStop + drain + buffer
- PodDisruptionBudget configured to prevent simultaneous termination
- Health checks (readiness probe) return failure during drain to stop new traffic
- Connection pools and database handles are closed before exit
Why Interviewers Ask This
Interviewers ask this to gauge your understanding of zero-downtime operations. Production incidents often stem from applications that do not handle termination signals correctly.
Key Takeaways
- Kubernetes sends SIGTERM first, then SIGKILL after the grace period — your app must handle SIGTERM.
- A PreStop sleep of 5-15 seconds mitigates the race between endpoint removal and container termination.
- Set terminationGracePeriodSeconds based on your application's drain time, not just the default 30 seconds.