What Are the Pod Restart Policies in Kubernetes?

Q: What Are the Pod Restart Policies in Kubernetes?

Kubernetes offers three restart policies -- Always, OnFailure, and Never -- that control whether the kubelet restarts containers when they exit. The policy applies to all containers in the Pod and determines behavior after both normal exits and failures.

Detailed Answer

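The restartPolicy field in a Pod spec tells the kubelet what to do when a container exits. It is set once at the Pod level rather than per container, and it applies to every app container and init container in the Pod. One exception: on clusters that support native sidecar containers, an init container can set its own restartPolicy: Always.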

The Three Restart Policies

Always (Default)

The kubelet restarts the container regardless of its exit code. This is the default policy and is required for Pods managed by Deployments, StatefulSets, and DaemonSets.

apiVersion: v1
kind: Pod
metadata:
  name: web-server
spec:
  restartPolicy: Always
  containers:
    - name: nginx
      image: nginx:1.27
      ports:
        - containerPort: 80
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "250m"
          memory: "256Mi"

Use Always for long-running services that should never stop. The container is restarted whether it exits with code 0 (success) or with a non-zero code (failure).

OnFailure

The kubelet restarts the container only if it exits with a non-zero exit code. If the container exits successfully (code 0), it is not restarted.

apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: processor
          image: myapp/processor:1.0
          command: ["./process", "--input=/data/input.csv"]
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"

Use OnFailure for batch Jobs where you want failed containers to retry in the same Pod. The advantage over Never is that the retry runs in the same Pod on the same node, so data in emptyDir volumes survives between attempts.

Never

The kubelet never restarts the container, regardless of exit code. Once all containers terminate, the Pod transitions to Succeeded (all exited with 0) or Failed (any exited with non-zero).

apiVersion: batch/v1
kind: Job
metadata:
  name: one-shot-task
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: task
          image: myapp/task-runner:1.0
          command: ["./run-task"]
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"

With Never, the Job controller creates a new Pod for each retry attempt. This is useful when you want a clean environment for each retry or when debugging failed Pods (they remain in Failed state for log inspection).
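
Taken together, the three policies reduce to a small decision on the container's exit code. The sketch below restates that decision as a shell function. The name should_restart is a hypothetical helper for illustration only; it is not part of kubectl or the kubelet.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the decision described for the three policies.
# Prints "restart" or "stay-terminated" for a given policy and exit code.
should_restart() {
  policy="$1"
  exit_code="$2"
  case "$policy" in
    Always)
      echo "restart" ;;
    OnFailure)
      if [ "$exit_code" -ne 0 ]; then
        echo "restart"
      else
        echo "stay-terminated"
      fi ;;
    Never)
      echo "stay-terminated" ;;
    *)
      echo "unknown policy: $policy" >&2
      return 1 ;;
  esac
}

should_restart Always 0        # restart
should_restart OnFailure 0     # stay-terminated
should_restart OnFailure 137   # restart
should_restart Never 1         # stay-terminated
```

Note that this only models the kubelet's per-container decision. With Never on a Job, the retry still happens, but as a new Pod created by the Job controller.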

Restart Policy and Controller Compatibility

| Controller | Allowed Restart Policies | Typical Choice |
|------------|--------------------------|----------------|
| Deployment | Always only | Always |
| StatefulSet | Always only | Always |
| DaemonSet | Always only | Always |
| Job | OnFailure or Never | OnFailure |
| CronJob | OnFailure or Never | OnFailure |
| Standalone Pod | Any | Depends on use case |
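
The compatibility rules can be expressed as a lookup, which may be handy as a pre-commit check on manifests. This is only a sketch: check_policy is a hypothetical function that encodes the table, while the real validation is done by the API server when the object is created.

```shell
#!/bin/sh
# Hypothetical helper: encodes the compatibility table.
# Prints "allowed" or "rejected" for a workload kind and restart policy.
check_policy() {
  kind="$1"
  policy="$2"
  case "$kind" in
    Deployment|StatefulSet|DaemonSet)
      allowed="Always" ;;
    Job|CronJob)
      allowed="OnFailure Never" ;;
    Pod)
      allowed="Always OnFailure Never" ;;
    *)
      echo "unknown kind: $kind" >&2
      return 1 ;;
  esac
  for candidate in $allowed; do
    if [ "$candidate" = "$policy" ]; then
      echo "allowed"
      return 0
    fi
  done
  echo "rejected"
}

check_policy Deployment Always   # allowed
check_policy Job Always          # rejected
check_policy CronJob Never       # allowed
```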

CrashLoopBackOff Explained

When a container with restartPolicy: Always or OnFailure keeps crashing, the kubelet applies exponential backoff:

First restart: immediate
Second restart: 10 seconds
Third restart: 20 seconds
Fourth restart: 40 seconds
Later restarts: the delay keeps doubling, capped at 5 minutes

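During this backoff period, kubectl reports the status CrashLoopBackOff. This is not a Pod phase -- it is the reason attached to the container's Waiting state, and the Pod's phase usually remains Running. The kubelet resets the backoff timer once a container has run for 10 minutes without problems.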
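
To make the schedule concrete, the following sketch computes the delay before each restart. backoff_delay is a hypothetical helper that models the schedule as listed above; the kubelet's actual timing can differ slightly, and the sketch does not model the timer being reset.

```shell
#!/bin/sh
# Hypothetical helper: delay in seconds before restart number N (1-based),
# following the schedule in the text: 0, 10, 20, 40, ... capped at 300.
backoff_delay() {
  n="$1"
  if [ "$n" -le 1 ]; then
    echo 0
    return 0
  fi
  delay=10
  i=2
  # Double once for every restart after the second, stopping at the cap.
  while [ "$i" -lt "$n" ] && [ "$delay" -lt 300 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt 300 ]; then
    delay=300
  fi
  echo "$delay"
}

for n in 1 2 3 4 5 6 7 8; do
  echo "restart $n: $(backoff_delay "$n")s"
done
# Delays printed: 0, 10, 20, 40, 80, 160, 300, 300
```

Under this model a container that keeps crashing reaches the 5-minute cap by its seventh restart, which is why a Pod in CrashLoopBackOff can appear idle for minutes at a time.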

# See the CrashLoopBackOff state
kubectl get pods
# NAME         READY   STATUS             RESTARTS      AGE
# my-pod       0/1     CrashLoopBackOff   5 (2m ago)    10m

# Check container state details
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].state}'

# View logs from the most recent crash
kubectl logs my-pod --previous

How Restart Policy Interacts with Probes

Liveness probe failure: The kubelet kills the container and then applies the restart policy. With Always or OnFailure the container is restarted; with Never it stays terminated.
Startup probe failure: Same behavior as liveness probe failure.
Readiness probe failure: No restart under any policy -- the Pod is only removed from Service endpoints until the probe passes again.
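
One way to observe the liveness case is a Pod whose probe is designed to start failing. The sketch below writes such a manifest to a file. The Pod name liveness-demo, the busybox image tag, and the probe timings are assumptions chosen for illustration; the kubectl commands are left as comments because they need a cluster.

```shell
#!/bin/sh
# Write a demo manifest: the container deletes /tmp/healthy after 30 seconds,
# so the exec liveness probe starts failing from that point on.
cat > liveness-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo
spec:
  restartPolicy: Always
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600"]
      livenessProbe:
        exec:
          command: ["cat", "/tmp/healthy"]
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3
EOF

# With a cluster available, apply it and watch the RESTARTS column:
#   kubectl apply -f liveness-demo.yaml
#   kubectl get pod liveness-demo --watch
```

Changing restartPolicy to Never in the same manifest should leave the container terminated after the first probe-triggered kill instead of restarting it.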

Init Container Restart Behavior

Init containers follow special rules:

With Always or OnFailure: A failed init container is retried until it succeeds. The Pod stays in the Pending phase, and kubectl shows an Init: status such as Init:Error or Init:CrashLoopBackOff.
With Never: A failed init container causes the entire Pod to fail immediately.
Init containers that succeed are not rerun by the restart policy. They run again only if the whole Pod is restarted.
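
The rules above can be tried with a Pod whose init container is guaranteed to fail. The sketch below writes such a manifest. The names init-demo and check-config, the busybox image tag, and the /config/app.conf path are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Write a demo manifest: the init container fails because /config/app.conf
# does not exist in the image, so the app container never starts.
cat > init-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  restartPolicy: Never
  initContainers:
    - name: check-config
      image: busybox:1.36
      command: ["sh", "-c", "test -f /config/app.conf"]
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo started; sleep 3600"]
EOF

# With a cluster available:
#   kubectl apply -f init-demo.yaml
#   kubectl get pod init-demo                 # expect the Pod to end up Failed
#   kubectl logs init-demo -c check-config    # logs of the init container
```

Switching restartPolicy to OnFailure in this manifest should instead keep the Pod Pending while the init container is retried with backoff.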

Practical Debugging Flow

# Step 1: Check Pod status and restart count
kubectl get pod my-pod

# Step 2: Check why the container exited
kubectl describe pod my-pod
# Look for "Last State: Terminated" with exit code and reason

# Step 3: Check logs from the crashed container
kubectl logs my-pod --previous

# Step 4: For OOMKilled, check resource limits
kubectl get pod my-pod -o jsonpath='{.spec.containers[0].resources}'
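
When reading the exit code from Step 2, it helps to know that codes above 128 conventionally mean the process was killed by a signal (code minus 128). The sketch below applies that convention. explain_exit_code is a hypothetical helper, and its labels are a rough guide rather than a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: rough interpretation of a container exit code,
# using the shell convention that 128 + N means "killed by signal N".
explain_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "success"
  elif [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "application error"
  fi
}

explain_exit_code 0     # success
explain_exit_code 1     # application error
explain_exit_code 137   # killed by signal 9 (SIGKILL)
explain_exit_code 143   # killed by signal 15 (SIGTERM)
```

Exit code 137 is what an out-of-memory kill produces, but SIGKILL has other causes too, so confirm with the Reason field (OOMKilled) in the kubectl describe output before changing memory limits.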

Best Practices

Use Always for services -- Deployments require it, and it ensures self-healing.
Use OnFailure for Jobs when you want in-place retries and need to preserve emptyDir data.
Use Never for debugging Jobs so you can inspect failed Pods and their logs.
Monitor restart counts -- high restart counts indicate application bugs or misconfigured resources.
Set an appropriate Job backoffLimit (the default is 6) alongside the restart policy to control the total number of retries.
Investigate CrashLoopBackOff immediately -- it usually indicates a fundamental issue (missing config, wrong image, insufficient resources) that will not self-resolve.