What Are Startup Probes and When Should You Use Them?
A startup probe tells the kubelet whether a container's application has finished starting. While the startup probe is active, liveness and readiness probes are disabled. This prevents slow-starting containers from being killed by aggressive liveness probes before they are initialized.
Detailed Answer
A startup probe is a Kubernetes probe mechanism that protects slow-starting containers from being terminated by liveness probes before they are fully initialized. It first appeared as an alpha feature in Kubernetes 1.16 and became stable in Kubernetes 1.20, solving a longstanding problem with the interaction between application initialization and health checking.
The Problem Startup Probes Solve
Consider a Java application that takes 90 seconds to start up. If you configure a liveness probe with initialDelaySeconds: 10, periodSeconds: 10, and failureThreshold: 3, the first probe runs about 10 seconds after the container starts and the third consecutive failure arrives roughly 30 to 40 seconds in, so the container is killed long before it finishes starting.
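As a rough sketch, the liveness settings from this scenario would look like the fragment below. The /healthz path and port 8080 are illustrative assumptions; the scenario itself only fixes the three timing fields.

```yaml
# Fragment of a container spec (not a complete manifest).
# Path and port are assumed for illustration.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10   # wait about 10s before the first probe
  periodSeconds: 10         # then probe every 10s
  failureThreshold: 3       # restart after 3 consecutive failures
# An application that needs 90s to start fails every one of these early
# probes, so it is restarted well within its first minute, every time.
```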
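Before startup probes existed, the workaround was to set a very large initialDelaySeconds on the liveness probe. But this created a blind spot: if the container deadlocked during that delay window (for example, shortly after an unusually fast start), Kubernetes would not detect it until the delay had expired. The delay also had to be sized for the worst-case startup time.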
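For comparison, a minimal sketch of that older workaround might look like this. The 120-second delay is an assumed value chosen to cover the 90-second startup with some margin, and the path and port are again illustrative.

```yaml
# Fragment of a container spec showing the pre-startup-probe workaround.
# The 120s value, path, and port are assumptions for illustration.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 120  # no liveness checks at all during this window
  periodSeconds: 10
  failureThreshold: 3
```

The trade-off is visible in the single initialDelaySeconds field: one number has to cover startup time and also determines how long a hung container can go unnoticed right after it starts.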
Startup probes solve this cleanly by separating the "is it done starting?" check from the "is it still alive?" check.
How Startup Probes Work
- When a container starts, the kubelet begins running the startup probe.
- Liveness and readiness probes are disabled while the startup probe is running.
- If the startup probe succeeds, it is permanently disabled and liveness/readiness probes activate.
- If the startup probe fails failureThreshold times in a row, the container is killed and restarted according to the Pod's restartPolicy.
Configuration Example
apiVersion: v1
kind: Pod
metadata:
  name: java-app
spec:
  containers:
    - name: app
      image: myapp/java-server:3.0
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
        # Total startup budget: 30 * 10 = 300 seconds (5 minutes)
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "1"
          memory: "1Gi"
In this configuration:
- The startup probe gives the application up to 300 seconds (30 failures * 10 seconds) to start.
- Once /healthz returns a success code, the startup probe stops and the liveness probe takes over with a much tighter 30-second detection window (3 failures * 10 seconds).
- The readiness probe begins simultaneously, controlling when the Pod receives traffic.
Calculating the Startup Budget
The maximum time a container has to start is:
failureThreshold * periodSeconds
Set this to the worst-case startup time for your application, plus a safety margin. If the startup probe also sets initialDelaySeconds, that delay is added on top of the budget. Common values:
| Application Type | Typical Startup | Suggested Budget |
|-----------------|----------------|-----------------|
| Go / Rust microservice | 1-5s | 30s |
| Node.js / Python service | 5-15s | 60s |
| Java / Spring Boot | 30-120s | 300s |
| ML model loading | 60-600s | 900s |
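As a worked example of the formula, here is one way the 900-second budget suggested for ML model loading could be expressed. The split between failureThreshold and periodSeconds is a choice, not a requirement, and the path and port are assumed for illustration.

```yaml
# Fragment of a container spec; endpoint details are assumptions.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 90    # 90 * 10s = 900s total startup budget
# An equivalent budget with a shorter period:
#   periodSeconds: 5, failureThreshold: 180   -> 180 * 5s = 900s
```

Both variants allow the same maximum startup time. The shorter period simply checks more often, so a container that becomes ready early is noticed sooner, at the cost of more probe requests during startup.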
All Three Probes Working Together
Here is the timeline of how probes interact:
Container starts
|
v
[Startup Probe runs] -- liveness and readiness are DISABLED
|
| (startup probe succeeds)
v
[Liveness Probe activates] -- detects deadlocks, restarts container
[Readiness Probe activates] -- gates Service traffic
|
v
(Pod serves traffic until termination)
Debugging Startup Probe Failures
When a container is stuck in a restart loop due to startup probe failures:
# Check events for probe failure messages
kubectl describe pod java-app
# Check container logs to see what's happening during startup
kubectl logs java-app -c app --previous
# Look for the specific reason
kubectl get pod java-app -o jsonpath='{.status.containerStatuses[0].state}'
Common causes of startup probe failures:
- Startup takes longer than the budget: Increase failureThreshold or periodSeconds.
- Wrong port or path: Verify the health endpoint is correct.
- Missing dependencies: The app may be waiting for a database or config that is not available.
- Insufficient resources: The container may be OOM-killed or CPU-throttled during startup.
Startup Probes vs. Init Containers
These are complementary, not competing, features:
| Feature | Purpose | Timing |
|---------|---------|--------|
| Init containers | Run prerequisite tasks (migrations, config fetch) | Before app container starts |
| Startup probes | Wait for the app container's process to initialize | After app container starts |
A typical flow might be: init container fetches config from Vault, then the app container starts, then the startup probe waits for the JVM to warm up.
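That flow could be sketched roughly as the manifest below. The init container's image, its fetch-config command, and the shared /config volume are hypothetical placeholders rather than a real Vault integration; only the app image and the startup probe values are taken from the earlier example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: java-app
spec:
  initContainers:
    # Hypothetical init step: image and command are placeholders.
    - name: fetch-config
      image: example/config-fetcher:1.0
      command: ["sh", "-c", "fetch-config > /config/app.properties"]
      volumeMounts:
        - name: config
          mountPath: /config
  containers:
    # Starts only after every init container has exited successfully.
    - name: app
      image: myapp/java-server:3.0
      volumeMounts:
        - name: config
          mountPath: /config
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
  volumes:
    # Scratch volume shared between the init container and the app.
    - name: config
      emptyDir: {}
```

The startup budget only begins counting once the app container itself starts, so time spent in the init container does not eat into the 300 seconds.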
Best Practices
- Always use startup probes for slow-starting applications instead of large initialDelaySeconds values on liveness probes.
- Set a generous failureThreshold -- it is better to wait a little longer than to kill a container that is still starting.
- Use the same endpoint for startup and liveness probes (/healthz) since they answer the same question: "is the process functional?"
- Use a different endpoint for readiness (/ready) since readiness often has additional dependency checks.
- Monitor startup duration using metrics or events to detect regressions in application initialization time.
Why Interviewers Ask This
Interviewers ask this to evaluate whether you understand Pod startup dynamics and can configure probes correctly for applications with long initialization times.
Key Takeaways
- Startup probes disable liveness and readiness probes until the application finishes initializing.
- They solve the problem of slow-starting applications being killed by liveness probes.
- Once the startup probe passes, it never runs again for that container.