What Are the Pod Restart Policies in Kubernetes?
Kubernetes offers three restart policies -- Always, OnFailure, and Never -- that control whether the kubelet restarts containers when they exit. The policy applies to all containers in the Pod and determines behavior after both normal exits and failures.
Detailed Answer
The restartPolicy field in a Pod spec tells the kubelet what to do when a container exits. It is set at the Pod level and applies to all containers in the Pod, including init containers (for which Always is treated as OnFailure).
The Three Restart Policies
Always (Default)
The kubelet restarts the container regardless of its exit code. This is the default policy and is required for Pods managed by Deployments, StatefulSets, and DaemonSets.
apiVersion: v1
kind: Pod
metadata:
  name: web-server
spec:
  restartPolicy: Always
  containers:
  - name: nginx
    image: nginx:1.27
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "250m"
        memory: "256Mi"
Use Always for long-running services that should never stop. If the container exits with code 0 (success), it is restarted. If it exits with a non-zero code (failure), it is also restarted.
OnFailure
The kubelet restarts the container only if it exits with a non-zero exit code. If the container exits successfully (code 0), it is not restarted.
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: processor
        image: myapp/processor:1.0
        command: ["./process", "--input=/data/input.csv"]
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
Use OnFailure for batch Jobs where you want failed containers to retry in the same Pod. The advantage over Never is that the Pod stays on the same node, preserving local data in emptyDir volumes.
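As a sketch of that advantage (names and image are hypothetical), a Job like the following keeps intermediate files in an emptyDir volume across in-place retries, because the Pod -- and therefore its volumes -- survives each container restart:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: retry-with-scratch        # hypothetical name
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker
        image: myapp/worker:1.0   # hypothetical image
        command: ["./resume-or-start", "--scratch=/scratch"]
        volumeMounts:
        - name: scratch
          mountPath: /scratch
      volumes:
      - name: scratch
        emptyDir: {}              # survives container restarts; lost only if the Pod itself is replaced
```

With restartPolicy: Never, each retry would be a fresh Pod and the emptyDir contents would be gone.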
Never
The kubelet never restarts the container, regardless of exit code. Once all containers terminate, the Pod transitions to Succeeded (all exited with 0) or Failed (any exited with non-zero).
apiVersion: batch/v1
kind: Job
metadata:
  name: one-shot-task
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: myapp/task-runner:1.0
        command: ["./run-task"]
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
With Never, the Job controller creates a new Pod for each retry attempt. This is useful when you want a clean environment for each retry or when debugging failed Pods (they remain in Failed state for log inspection).
Restart Policy and Controller Compatibility
| Controller | Allowed Restart Policies | Typical Choice |
|------------|--------------------------|----------------|
| Deployment | Always only | Always |
| StatefulSet | Always only | Always |
| DaemonSet | Always only | Always |
| Job | OnFailure or Never | OnFailure |
| CronJob | OnFailure or Never | OnFailure |
| Standalone Pod | Any | Depends on use case |
CrashLoopBackOff Explained
When a container with restartPolicy: Always or OnFailure keeps crashing, the kubelet applies exponential backoff:
- First restart: immediate
- Second restart: after a 10-second delay
- Third restart: after 20 seconds
- Fourth restart: after 40 seconds
- The delay keeps doubling up to a 5-minute maximum, and resets after a container runs for 10 minutes without crashing
During this backoff period, the container's state shows as CrashLoopBackOff. This is not a Pod phase -- it is a container waiting state within a Running Pod.
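The schedule above can be sketched as a delay that doubles from 10 seconds and is capped at 300 seconds. This is a simplified model for illustration only; the real kubelet also resets its backoff timer after a sustained period of stable running:

```python
def crashloop_delays(restarts, base=10, cap=300):
    """Simplified model of the kubelet's CrashLoopBackOff schedule.

    Returns the delay in seconds before each of `restarts` restart
    attempts: the first restart is immediate, then 10s, 20s, 40s, ...
    capped at `cap` (300s = 5 minutes).
    """
    delays = [0]  # first restart happens immediately
    delay = base
    for _ in range(restarts - 1):
        delays.append(min(delay, cap))
        delay *= 2  # exponential backoff: double the delay each time
    return delays

print(crashloop_delays(7))  # [0, 10, 20, 40, 80, 160, 300]
```

This makes it easy to see why a crashing container's RESTARTS count climbs quickly at first and then only about once every five minutes.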
# See the CrashLoopBackOff state
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# my-pod 0/1 CrashLoopBackOff 5 (2m ago) 10m
# Check container state details
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].state}'
# View logs from the most recent crash
kubectl logs my-pod --previous
How Restart Policy Interacts with Probes
- Liveness probe failure: The kubelet kills the container and restarts it according to the restart policy. With Never, the container stays dead.
- Startup probe failure: Same behavior as liveness probe failure.
- Readiness probe failure: No restart -- the Pod is just removed from Service endpoints.
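For example, a minimal Pod (name and image are hypothetical) where liveness probe failures trigger restarts governed by the restart policy:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo          # hypothetical name
spec:
  restartPolicy: Always     # liveness failures cause restarts; with Never the container would stay dead
  containers:
  - name: app
    image: myapp/app:1.0    # hypothetical image
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 3   # after 3 consecutive failures the kubelet kills the container
```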
Init Container Restart Behavior
Init containers follow special rules:
- With Always or OnFailure: A failed init container is retried until it succeeds. The Pod stays in the Init state.
- With Never: A failed init container causes the entire Pod to fail immediately.
- Init containers that succeed are never restarted, regardless of the restart policy.
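As a sketch of those rules (names, image, and the dependency check are hypothetical), a Pod whose init container must succeed before the main container starts; with restartPolicy: Never a single init failure fails the whole Pod, while Always or OnFailure would retry it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo             # hypothetical name
spec:
  restartPolicy: Never        # a failed init container fails the Pod immediately
  initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command: ["sh", "-c", "nc -z db 5432"]  # hypothetical dependency check
  containers:
  - name: app
    image: myapp/app:1.0      # hypothetical image
```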
Practical Debugging Flow
# Step 1: Check Pod status and restart count
kubectl get pod my-pod
# Step 2: Check why the container exited
kubectl describe pod my-pod
# Look for "Last State: Terminated" with exit code and reason
# Step 3: Check logs from the crashed container
kubectl logs my-pod --previous
# Step 4: For OOMKilled, check resource limits
kubectl get pod my-pod -o jsonpath='{.spec.containers[0].resources}'
Best Practices
- Use Always for services -- Deployments require it, and it ensures self-healing.
- Use OnFailure for Jobs when you want in-place retries and need to preserve emptyDir data.
- Use Never for debugging Jobs so you can inspect failed Pods and their logs.
- Monitor restart counts -- high restart counts indicate application bugs or misconfigured resources.
- Set appropriate Job backoffLimit alongside restart policies to control total retry attempts.
- Investigate CrashLoopBackOff immediately -- it usually indicates a fundamental issue (missing config, wrong image, insufficient resources) that will not self-resolve.
Why Interviewers Ask This
Interviewers ask this to assess your understanding of workload lifecycle management. Knowing which restart policy to use for different workload types (long-running services vs. batch jobs) is fundamental operational knowledge.
Key Takeaways
- Always (default) -- restarts containers regardless of exit code. Used by Deployments, StatefulSets, DaemonSets.
- OnFailure -- restarts only on non-zero exit codes. Used by Jobs for in-place retries.
- Never -- no restarts. The Pod transitions to Failed or Succeeded based on exit codes.