How Do Liveness Probes Work in Kubernetes?
A liveness probe tells the kubelet whether a container is still running correctly. If the probe fails consecutively beyond the failure threshold, the kubelet kills the container and restarts it according to the Pod's restart policy.
Detailed Answer
A liveness probe is a diagnostic check that the kubelet performs periodically on a container to determine whether it is still healthy. If the probe fails a configured number of consecutive times, the kubelet kills the container and restarts it.
Why Liveness Probes Exist
Some applications enter a broken state where the process is still running but cannot serve requests -- for example, a deadlocked thread pool or a corrupted in-memory cache. Without a liveness probe, Kubernetes has no way to detect this condition because the container's process ID is still active.
Probe Mechanisms
Kubernetes supports three probe types:
HTTP GET Probe
The kubelet sends an HTTP GET request to a specified path and port. Any response code between 200 and 399 is considered healthy.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: app
    image: myapp/server:2.1
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10
      timeoutSeconds: 3
      failureThreshold: 3
      successThreshold: 1
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
```
TCP Socket Probe
The kubelet attempts to open a TCP connection to the specified port. If the connection succeeds, the container is healthy.
```yaml
livenessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 30
  periodSeconds: 10
```
This is useful for databases or services that do not expose HTTP endpoints.
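Conceptually, the kubelet's `tcpSocket` check boils down to a plain TCP connect attempt. A minimal Python sketch of the same idea (host, port, and timeout values are illustrative, not kubelet internals):

```python
import socket

def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Mimic a tcpSocket-style probe: healthy if a TCP connection opens."""
    try:
        # create_connection resolves the address and performs the connect
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable -> unhealthy
        return False
```

Note that this only proves something is listening on the port; it says nothing about whether the process behind it can actually serve requests.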
Exec Probe
The kubelet runs a command inside the container. If the command returns exit code 0, the container is healthy.
```yaml
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
```
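The `cat /tmp/healthy` pattern only works if the application maintains that file itself. One common variant is a heartbeat file: the app's main loop refreshes it periodically, so a deadlock lets it go stale. A hedged sketch of that pattern (file path matches the probe above; the staleness threshold is an illustrative choice, not a Kubernetes setting):

```python
import os
import time

HEALTH_FILE = "/tmp/healthy"  # same path the exec probe reads
MAX_STALENESS = 15            # seconds; illustrative threshold

def write_heartbeat(path: str = HEALTH_FILE) -> None:
    """Called periodically by the app's main loop; a deadlock stops the calls."""
    with open(path, "w") as f:
        f.write(str(time.time()))

def heartbeat_is_fresh(path: str = HEALTH_FILE,
                       max_age: float = MAX_STALENESS) -> bool:
    """What a stricter exec probe script could check instead of plain cat."""
    try:
        return (time.time() - os.path.getmtime(path)) < max_age
    except OSError:
        return False  # file missing -> unhealthy
```

With plain `cat`, a crashed heartbeat loop goes undetected as long as the file exists; checking the modification time closes that gap.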
Configuration Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| initialDelaySeconds | 0 | Seconds to wait after container start before the first probe |
| periodSeconds | 10 | How often (in seconds) to perform the probe |
| timeoutSeconds | 1 | Seconds before the probe times out |
| failureThreshold | 3 | Consecutive failures before the container is killed |
| successThreshold | 1 | Consecutive successes to mark the container as healthy (must be 1 for liveness) |
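These parameters combine into a worst-case detection window: the kubelet needs roughly `failureThreshold` probe periods, plus the final probe's timeout, before it kills the container. A back-of-the-envelope sketch (the formula is an approximation that ignores kubelet scheduling jitter):

```python
def worst_case_detection_seconds(period: int,
                                 failure_threshold: int,
                                 timeout: int) -> int:
    """Rough upper bound from first failed probe to container kill:
    failure_threshold full probe periods plus the last probe's timeout."""
    return failure_threshold * period + timeout

# Values from the HTTP GET example above:
print(worst_case_detection_seconds(period=10, failure_threshold=3, timeout=3))  # 33
```

So with the example configuration, a deadlocked container can survive for about half a minute before being restarted; tighten `periodSeconds` or `failureThreshold` if that is too slow for your SLO.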
What Happens When a Liveness Probe Fails
- The kubelet marks the probe as failed.
- After `failureThreshold` consecutive failures, the kubelet kills the container.
- The container is restarted according to the Pod's `restartPolicy` (usually `Always` for Deployment-managed Pods).
- The restart count increments, visible in the RESTARTS column of `kubectl get pods`.
- If the container keeps failing, Kubernetes applies exponential backoff (CrashLoopBackOff), delaying restarts by up to 5 minutes.
```shell
# Check liveness probe status and restart count
kubectl describe pod web-app
kubectl get pod web-app -o jsonpath='{.status.containerStatuses[0].restartCount}'
```
Common Mistakes
Checking External Dependencies
Never check a database or downstream service in your liveness probe. If the database goes down, all your Pods will be restarted simultaneously, causing a cascading outage and making recovery harder.
```yaml
# BAD -- checks external dependency
livenessProbe:
  httpGet:
    path: /healthz?check=database
    port: 8080
```

```yaml
# GOOD -- checks only local process health
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
```
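What the "good" `/healthz` handler looks like in application code: it reports only in-process state and deliberately skips external calls. A minimal sketch (state keys and the handler name are illustrative assumptions, not a standard API):

```python
# Illustrative in-process state, e.g. updated by a watchdog thread.
app_state = {
    "worker_pool_alive": True,
    "event_loop_stuck": False,
}

def healthz(state: dict) -> tuple[int, str]:
    """Return (HTTP status, body) based purely on local state.

    Deliberately no database ping and no downstream call: if the DB is
    down, restarting this Pod would not help, so liveness must not fail.
    """
    if state["worker_pool_alive"] and not state["event_loop_stuck"]:
        return 200, "ok"
    return 503, "unhealthy"
```

Dependency checks belong in a readiness probe, which removes the Pod from Service endpoints instead of restarting it.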
Missing initialDelaySeconds Without a Startup Probe
If your application takes 60 seconds to start and the liveness probe begins checking at second 0, the container will be killed before it is ready. Either set initialDelaySeconds appropriately or use a startup probe (the preferred approach).
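A startup probe holds off the liveness (and readiness) probes until it succeeds once, which handles slow starts without padding `initialDelaySeconds`. A sketch reusing the same endpoint as the liveness example above (the thresholds are illustrative):

```yaml
# Sketch: allow up to 5 minutes (30 x 10s) for startup.
# Liveness checks begin only after the startup probe succeeds.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
```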
Probe Timeout Too Short
Setting timeoutSeconds: 1 on an endpoint that occasionally takes 2 seconds under load causes unnecessary restarts. Test your health endpoint's latency under realistic conditions.
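A simple way to size the timeout is to work backward from measured latency. A sketch of the 2-3x-p99 rule of thumb (the safety factor is a convention, not a Kubernetes default):

```python
import math

def recommended_timeout_seconds(p99_latency: float,
                                safety_factor: float = 3.0) -> int:
    """Size timeoutSeconds from the endpoint's measured p99 latency.

    timeoutSeconds is an integer, so round up; never go below the
    Kubernetes default of 1 second.
    """
    return max(1, math.ceil(p99_latency * safety_factor))

# Endpoint that occasionally takes 2 seconds under load:
print(recommended_timeout_seconds(p99_latency=2.0))  # 6
```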
Liveness Probes and gRPC
Starting with Kubernetes 1.27 (stable), you can use native gRPC health probes:
```yaml
livenessProbe:
  grpc:
    port: 50051
    service: "myapp.health.v1.Health"
  initialDelaySeconds: 10
  periodSeconds: 10
```
This requires your application to implement the gRPC Health Checking Protocol.
Best Practices
- Keep the health endpoint lightweight -- it should return quickly and not trigger expensive operations.
- Use startup probes for slow-starting apps instead of large
initialDelaySecondsvalues. - Only check local state in liveness probes -- thread pool health, memory corruption flags, internal deadlock detection.
- Set reasonable timeouts -- at least 2-3x your endpoint's p99 latency.
- Monitor restart counts -- frequent restarts indicate a misconfigured probe or an underlying application bug.
Why Interviewers Ask This
Interviewers ask this to evaluate your understanding of self-healing in Kubernetes. Misconfigured liveness probes are a common source of production incidents, so knowing the nuances matters.
Key Takeaways
- Liveness probes detect when a container is deadlocked or broken and trigger automatic restarts.
- Always set initialDelaySeconds or use a startup probe to avoid killing slow-starting containers.
- Never check external dependencies in a liveness probe -- only check the container's own health.