Kubernetes CrashLoopBackOff
Causes and Fixes
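CrashLoopBackOff means a container in the Pod keeps exiting and the kubelet keeps restarting it, waiting longer between attempts (10s, 20s, 40s, and so on, capped at five minutes). It is one of the most common Pod failure states. The container usually exits with a non-zero code, but a container that exits with code 0 under restartPolicy: Always is restarted in the same way.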
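The backoff schedule can be sketched in a few lines of shell. This assumes the kubelet's default behavior (initial delay of 10s, doubling after each restart, capped at 300s); the function name backoff_delays is made up for illustration.

```shell
#!/bin/sh
# Print the first N restart delays: start at 10s, double each time, cap at 300s.
# Assumption: default kubelet backoff settings.
backoff_delays() {
  count=${1:-8}
  delay=10
  i=0
  out=""
  while [ "$i" -lt "$count" ]; do
    out="${out:+$out }${delay}s"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
  echo "$out"
}

backoff_delays 7
# prints: 10s 20s 40s 80s 160s 300s 300s
```

This is why a Pod that has been crashing for a while can appear idle for minutes between attempts.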
Symptoms
- Pod status shows CrashLoopBackOff in kubectl get pods output
- Container restarts count keeps incrementing
- Pod events show 'Back-off restarting failed container'
- Container logs show application errors before the crash
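To spot affected Pods quickly, a small filter over the kubectl get pods output may help. This is a sketch that assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE); with -A or -o wide the column positions shift. The function name crashlooping_pods is hypothetical.

```shell
#!/bin/sh
# Reads `kubectl get pods` output on stdin and prints every Pod whose
# STATUS column (3rd field) is CrashLoopBackOff, with its restart count.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 " restarts=" $4 }'
}

# Usage (needs access to a cluster):
#   kubectl get pods | crashlooping_pods
```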
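Common Causes
- Application error at startup (unhandled exception, bad arguments, missing file)
- Missing or misnamed ConfigMap, Secret, or environment variable
- Memory limit too low, so the container is OOM killed
- A dependency (database, API) is not reachable when the container starts
- Liveness or startup probe failing before a slow application finishes starting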
Step-by-Step Troubleshooting
1. Check Pod Status and Events
kubectl describe pod <pod-name>
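Look at the Events section at the bottom of the output; it shows why the container is being restarted. The Last State block under Containers shows the exit code and reason of the previous run.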
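When the describe output is long, the events for one Pod can also be listed on their own. A minimal sketch, assuming the Pod is in the current namespace; the wrapper name pod_events is hypothetical.

```shell
#!/bin/sh
# List only the events that refer to the given Pod, oldest first.
pod_events() {
  kubectl get events \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}

# Usage (needs access to a cluster):
#   pod_events <pod-name>
```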
2. Check Container Logs
# Current container logs (may be empty if crash is immediate)
kubectl logs <pod-name>
# Previous container's logs (shows what happened before the crash)
kubectl logs <pod-name> --previous
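The two log commands can be combined so that the previous run is tried first and the current run is used as a fallback. This is a sketch rather than a standard recipe; crash_logs is a hypothetical name, and --all-containers and --tail=50 are choices made here for illustration.

```shell
#!/bin/sh
# Show the last 50 log lines of the previous (crashed) run for every
# container in the Pod; if there is no previous run, show the current one.
crash_logs() {
  pod=$1
  kubectl logs "$pod" --previous --all-containers --tail=50 2>/dev/null \
    || kubectl logs "$pod" --all-containers --tail=50
}

# Usage (needs access to a cluster):
#   crash_logs <pod-name>
```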
3. Check Exit Code
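# [0] selects the first container; adjust the index for multi-container Pods
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
- Exit code 1: Application error (general failure; check the logs)
- Exit code 137: Killed by SIGKILL (128 + 9), most often by the OOM killer. Confirm that lastState.terminated.reason is OOMKilled before increasing memory limits
- Exit code 139: Segmentation fault (SIGSEGV, 128 + 11)
- Exit code 143: Terminated by SIGTERM (128 + 15), usually a requested shutdown rather than a crash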
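The codes above 128 follow the usual shell convention of 128 plus the signal number, which can be sketched as a small helper. The function name explain_exit_code is made up for illustration, and the wording of the messages is an assumption.

```shell
#!/bin/sh
# Translate a container exit code into a short explanation.
# Codes above 128 are interpreted as 128 + signal number.
explain_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo "exit $code: killed by signal $((code - 128))"
  elif [ "$code" -eq 0 ]; then
    echo "exit 0: process finished without error"
  else
    echo "exit $code: application-defined error"
  fi
}

explain_exit_code 137   # prints: exit 137: killed by signal 9
explain_exit_code 143   # prints: exit 143: killed by signal 15
```

Signal 9 is SIGKILL and signal 15 is SIGTERM, matching the list above.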
4. Verify Configuration
# Check if ConfigMaps and Secrets exist
kubectl get configmap <name>
kubectl get secret <name>
# Check environment variables
kubectl set env pod/<pod-name> --list
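When a Pod references several ConfigMaps or Secrets, the existence checks can be looped. A minimal sketch, assuming the objects live in the current namespace; check_config_refs and the names in the usage line are hypothetical.

```shell
#!/bin/sh
# Report which of the named objects of one kind exist.
# $1 is the kind (configmap or secret); the rest are object names.
check_config_refs() {
  kind=$1
  shift
  for name in "$@"; do
    if kubectl get "$kind" "$name" >/dev/null 2>&1; then
      echo "ok: $kind/$name"
    else
      echo "MISSING: $kind/$name"
    fi
  done
}

# Usage (needs access to a cluster; names are placeholders):
#   check_config_refs configmap app-config
#   check_config_refs secret db-credentials
```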
5. Debug with Ephemeral Container
kubectl debug <pod-name> -it --image=busybox -- sh
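If the container crashes too quickly to attach to, one alternative is to debug a copy of the Pod with the container's command replaced by a shell. The sketch below assumes the image contains sh; debug_copy is a hypothetical wrapper and the -debug suffix is an arbitrary choice.

```shell
#!/bin/sh
# Create a copy of the Pod named <pod>-debug in which the given container
# runs an interactive shell instead of its normal command.
debug_copy() {
  pod=$1
  container=$2
  kubectl debug "$pod" -it \
    --copy-to="${pod}-debug" \
    --container="$container" \
    -- sh
}

# Usage (needs access to a cluster):
#   debug_copy <pod-name> <container-name>
# Delete the copy afterwards:
#   kubectl delete pod <pod-name>-debug
```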
How to Explain This in an Interview
I would explain that CrashLoopBackOff is not the root failure but the status the kubelet reports while it restarts a crashing container with exponential backoff, so the kubelet is doing its job. The real task is finding out why the container exits: I would check the previous run's logs with kubectl logs --previous, inspect events and the last exit code with kubectl describe pod, and verify the ConfigMaps, Secrets, and environment variables the container depends on. I'd walk through these steps in order and mention that the cause is often misconfiguration rather than a code bug.
Prevention
- Always set resource requests and limits appropriately
- Use startup probes for slow-starting applications
- Validate manifests before deployment with kubectl apply --dry-run=server -f <file>
- Implement graceful startup with proper dependency health checks
- Use init containers to wait for dependencies
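Several of the points above can be seen together in one manifest. The sketch below writes an illustrative Pod spec with resource requests and limits, a startup probe, and an init container that waits for a dependency. Everything specific in it is an assumption: the image example/web:1.0, the service name db, the /healthz path, port 8080, and the resource values. The helper name write_sketch_manifest is hypothetical.

```shell
#!/bin/sh
# Write an illustrative Pod manifest to the path given as $1.
write_sketch_manifest() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web-sketch
spec:
  initContainers:
    # Block startup until the (hypothetical) db service resolves in DNS.
    - name: wait-for-db
      image: busybox:1.36
      command: ['sh', '-c', 'until nslookup db; do echo waiting; sleep 2; done']
  containers:
    - name: web
      image: example/web:1.0
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          memory: 256Mi
      # Allow up to 30 x 5s = 150s for startup before the container is killed.
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5
        failureThreshold: 30
EOF
}

# Usage (the dry-run step needs access to a cluster):
#   write_sketch_manifest pod-sketch.yaml
#   kubectl apply --dry-run=server -f pod-sketch.yaml
```

The dry-run step validates the manifest against the API server without creating the Pod.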