How Do You Achieve Zero-Downtime Deployments in Kubernetes?
Zero-downtime deployments require a combination of rolling updates with maxUnavailable: 0, readiness probes, graceful shutdown handling, PodDisruptionBudgets, and preStop hooks. No single setting achieves it -- you need all layers working together.
Detailed Answer
Achieving true zero-downtime deployments in Kubernetes is more nuanced than setting strategy: RollingUpdate. It requires careful configuration at multiple layers: the Deployment strategy, health probes, graceful shutdown, network propagation, and disruption budgets.
The Five Pillars of Zero Downtime
- Rolling update with maxUnavailable: 0
- Readiness probes
- Graceful shutdown with preStop hooks
- PodDisruptionBudgets
- Application-level connection draining
Pillar 1 -- Rolling Update Strategy
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
With maxUnavailable: 0, Kubernetes never terminates an old Pod until a new one is fully Ready. This guarantees the total number of available Pods never drops below the desired count.
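A toy simulation (not the real controller loop; the step model is illustrative) makes the invariant concrete:

```python
def rolling_update(replicas, max_surge):
    """Toy model of maxUnavailable: 0 -- an old Pod is terminated only
    after its replacement has passed the readiness probe."""
    old, new_ready = replicas, 0
    history = [old + new_ready]          # Ready pods at each step
    while old > 0:
        started = min(max_surge, old)    # surge up to maxSurge new pods
        new_ready += started             # new pods become Ready first...
        old -= started                   # ...only then are old pods removed
        history.append(old + new_ready)
    return history

print(rolling_update(3, 1))  # [3, 3, 3, 3] -- never below the desired count
```

With maxUnavailable: 1 instead, the first step would terminate an old Pod before its replacement is Ready, and the count would dip to 2.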
Pillar 2 -- Readiness Probes
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
Without a readiness probe, Kubernetes considers a Pod ready the moment its containers start. Traffic will hit an application that is still initializing, causing errors.
Pillar 3 -- Graceful Shutdown
This is where most teams miss a critical detail. When Kubernetes terminates a Pod, two things happen in parallel:
- The Pod is removed from Service endpoints.
- The Pod receives SIGTERM.
The problem is that kube-proxy and ingress controllers may take a few seconds to update their routing rules. During this window, traffic can still be sent to the terminating Pod.
The solution is a preStop hook that introduces a short delay:
spec:
  terminationGracePeriodSeconds: 60
  containers:
    - name: web-app
      image: web-app:2.0
      ports:
        - containerPort: 8080
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 10"]
The timeline becomes:
t=0s     Pod marked for deletion; endpoint removal and the preStop hook start (parallel)
t=0-10s  preStop sleep -- app still serving, kube-proxy updates propagate
t=10s    preStop finishes, app receives SIGTERM, begins graceful shutdown
t=10-60s App drains connections and exits
t=60s    SIGKILL if still running
The 10-second sleep ensures routing rules are updated before the application starts shutting down.
Pillar 4 -- PodDisruptionBudgets
Rolling updates are managed by the Deployment controller, but other operations can disrupt Pods too: node drains, cluster autoscaler scale-down, and manual evictions. (A PDB is enforced through the Eviction API, so a plain kubectl delete pod bypasses it.)
A PodDisruptionBudget (PDB) prevents too many Pods from being disrupted simultaneously:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: web-app
With minAvailable: 2 and 3 replicas, at most 1 Pod can be voluntarily disrupted at a time. This protects against:
- kubectl drain during node maintenance
- Cluster autoscaler removing nodes
- Spot/preemptible instance termination
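The arithmetic is simple enough to sanity-check in one line (a back-of-envelope model, not the Eviction API itself): the number of Pods that may be voluntarily disrupted at once is the current healthy count minus minAvailable.

```python
def allowed_disruptions(healthy_pods, min_available):
    """Voluntary disruptions a PDB with minAvailable permits right now."""
    return max(0, healthy_pods - min_available)

print(allowed_disruptions(3, 2))  # 3 replicas, minAvailable: 2 -> 1
print(allowed_disruptions(2, 2))  # already at the floor -> 0, drain blocks
```

Note the second case: if one Pod is already unhealthy, a node drain will wait until the budget is satisfied again.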
Pillar 5 -- Application-Level Connection Draining
Your application must handle SIGTERM gracefully:
# Python example with graceful shutdown
import signal
import threading
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

server = ThreadingHTTPServer(('0.0.0.0', 8080), MyHandler)

def graceful_shutdown(signum, frame):
    print("Received SIGTERM, draining connections...")
    # shutdown() blocks until serve_forever() stops, so it must run on a
    # separate thread -- calling it directly from the signal handler on the
    # serving thread would deadlock.
    threading.Thread(target=server.shutdown).start()

signal.signal(signal.SIGTERM, graceful_shutdown)
server.serve_forever()  # returns once shutdown() completes
server.server_close()   # exit cleanly so Kubernetes never sends SIGKILL
The key behaviors:
- Stop accepting new connections after receiving SIGTERM.
- Finish processing in-flight requests within the grace period.
- Exit cleanly so Kubernetes does not need to send SIGKILL.
The Complete Zero-Downtime Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  revisionHistoryLimit: 5
  progressDeadlineSeconds: 300
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: web-app
          image: web-app:2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
            failureThreshold: 5
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web-app
The Endpoint Propagation Race Condition
This is the most commonly overlooked issue. Here is the detailed sequence when a Pod is terminated:
1. API server marks Pod for deletion
2. kubelet receives watch event → sends SIGTERM + runs preStop hook
3. endpoints controller receives watch event → removes Pod from Endpoints
4. kube-proxy receives Endpoints update → updates iptables/ipvs rules
5. Ingress controller receives Endpoints update → updates upstream list
Steps 2-5 happen concurrently and asynchronously. Without the preStop sleep, the application might shut down before kube-proxy finishes updating its rules. This causes connection refused errors or 502s for a brief window.
The sleep in the preStop hook gives all components time to propagate the endpoint removal.
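A toy timeline model (illustration only; the 3-second routing-convergence figure is an assumption, real propagation time varies per cluster) shows why the delay matters:

```python
def error_window(prestop_sleep, route_update_done=3.0, app_shutdown=1.0):
    """Seconds during which traffic is still routed to a Pod that has
    already stopped serving. Times are relative to Pod deletion."""
    app_stops = prestop_sleep + app_shutdown  # SIGTERM comes after preStop
    # Errors occur only if the app is gone before routing converges.
    return max(0.0, route_update_done - app_stops)

print(error_window(0))   # no preStop hook: a window of failed requests
print(error_window(10))  # 10s sleep: app outlives propagation, window is 0
```

The delay only needs to exceed the slowest propagation path (kube-proxy or the ingress controller), which is why a conservative 10 seconds is a common choice.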
Testing Zero Downtime
Use a load testing tool during deployment to verify:
# Terminal 1: Generate continuous traffic
while true; do
  curl -s -o /dev/null -w "%{http_code}\n" http://web-app.default.svc.cluster.local/
done | sort | uniq -c
# Terminal 2: Trigger a deployment update
kubectl set image deployment/web-app web-app=web-app:2.1
If you see any non-200 responses during the rollout, one of the five pillars is misconfigured.
Additional Considerations
minReadySeconds: Adds a delay after a Pod is Ready before it counts as Available. Useful as an additional safety buffer:
spec:
  minReadySeconds: 10
Topology spread constraints: Ensure Pods are spread across nodes and zones so a single node failure does not take down the service:
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web-app
Summary
True zero-downtime deployments require all five pillars working together: a rolling update with maxUnavailable: 0, readiness probes to gate traffic, preStop hooks to handle the endpoint propagation race condition, PodDisruptionBudgets to protect against voluntary disruptions, and application-level graceful shutdown. Missing any one of these layers can cause brief outages during otherwise well-configured deployments.
Why Interviewers Ask This
Zero-downtime deployment is a production requirement for most organizations. This question tests whether you understand the full stack of configurations needed, not just the basics.
Key Takeaways
- Zero downtime requires multiple overlapping configurations, not just a rolling update.
- The preStop hook solves the race condition between endpoint removal and iptables updates.
- PodDisruptionBudgets protect against voluntary disruptions like node drains.