How Do Health Checks Affect Kubernetes Deployments?

intermediate | deployments, devops, sre, CKA, CKAD
TL;DR

Liveness probes restart unhealthy containers, readiness probes control whether Pods receive traffic, and startup probes protect slow-starting containers. In the context of Deployments, readiness probes are critical because they gate rolling updates and prevent broken versions from receiving traffic.

Detailed Answer

Health checks (probes) are one of the most important configurations in a Kubernetes Deployment. They determine when Pods receive traffic, when containers are restarted, and how rolling updates progress. Misconfigured probes are one of the top causes of deployment-related outages.

The Three Probe Types

| Probe | Purpose | Failure Action |
|---|---|---|
| Liveness | Is the container running correctly? | Restart the container |
| Readiness | Can the container serve traffic? | Remove Pod from Service endpoints |
| Startup | Has the container finished initializing? | Keep checking; block liveness/readiness |

Complete Deployment with All Three Probes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: web-app:2.0
          ports:
            - containerPort: 8080
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 10
            # App has up to 300 seconds to start
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 10
            failureThreshold: 3
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi

How Probes Interact with Rolling Updates

During a rolling update with maxUnavailable: 0, the sequence is:

  1. A new Pod is created.
  2. The startup probe runs until it succeeds (or the container is killed after exhausting failureThreshold).
  3. Once the startup probe passes, the readiness probe starts.
  4. When the readiness probe succeeds, the Pod is added to the Service endpoints.
  5. Only after the new Pod is Ready does Kubernetes terminate an old Pod.
  6. The process repeats for the next Pod.

Without a readiness probe, step 4 happens immediately when the container starts -- potentially sending traffic to an application that has not finished initialization.
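The gating behavior above can be sketched as a toy simulation (illustrative only -- the function, Pod names, and the readiness callback are ours, not Kubernetes APIs):

```python
# Toy model of a RollingUpdate with maxSurge: 1, maxUnavailable: 0 --
# a new Pod must pass its readiness probe before an old Pod is removed.
def rolling_update(old_pods, is_ready):
    """is_ready(pod) -> True once that Pod's readiness probe passes."""
    endpoints = list(old_pods)
    for old in old_pods:
        new = old.replace("old", "new")  # surge Pod for this replica
        while not is_ready(new):
            pass  # the kubelet keeps probing until the Pod reports Ready
        endpoints.append(new)   # step 4: added to Service endpoints
        endpoints.remove(old)   # step 5: only now is the old Pod terminated
    return endpoints

result = rolling_update(["old-1", "old-2", "old-3"], lambda pod: True)
```

At every point in the loop, the endpoints list holds at least three serving Pods, which is exactly what maxUnavailable: 0 guarantees.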

Why Readiness Probes Are Critical for Deployments

Consider a Deployment without readiness probes:

# DANGEROUS: No readiness probe
spec:
  template:
    spec:
      containers:
        - name: web-app
          image: web-app:2.0
          ports:
            - containerPort: 8080
          # No probes defined!

During a rolling update:

  1. A new Pod starts. Kubernetes immediately marks it as Ready.
  2. The Service sends traffic to the new Pod.
  3. The application inside the container is still loading (connecting to databases, warming caches).
  4. Users get 502/503 errors.
  5. Meanwhile, an old (healthy) Pod is terminated because the Deployment thinks the new Pod is serving traffic.

This creates a cascading failure during what should be a safe rollout.
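The fix on the application side is to expose separate liveness and readiness endpoints. A minimal sketch using only the Python standard library (the endpoint paths match the manifests above; the warm_up helper and its delay are hypothetical stand-ins for real initialization):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

app_ready = threading.Event()  # flipped on once initialization finishes

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)  # liveness: the process is up
        elif self.path == "/ready":
            # readiness: 503 until caches are warm / DB connections open
            self.send_response(200 if app_ready.is_set() else 503)
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, *args):
        pass  # suppress per-request logging

def warm_up():
    # Stand-in for cache warming, DB connections, etc.; in a real app
    # this would run in a background thread at startup.
    app_ready.set()

server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # ephemeral port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()
```

With this split, the readiness probe holds traffic back until warm_up completes, while the liveness probe keeps answering 200 the whole time -- so the container is never restarted just for being slow to warm up.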

Probe Types

HTTP Probe

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
    httpHeaders:
      - name: X-Health-Check
        value: "true"
  initialDelaySeconds: 5
  periodSeconds: 5

The probe succeeds on any HTTP status from 200 up to but not including 400; any other status, or a timeout, counts as a failure.
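That success rule is simple enough to state in code (a hypothetical helper mirroring the kubelet's check, not its actual implementation):

```python
def http_probe_success(status_code: int) -> bool:
    # The kubelet treats any status >= 200 and < 400 as probe success.
    return 200 <= status_code < 400
```

Note that redirects (3xx) count as healthy, which can mask a misconfigured endpoint.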

TCP Probe

readinessProbe:
  tcpSocket:
    port: 5432
  initialDelaySeconds: 5
  periodSeconds: 10

Checks whether a TCP connection can be established; no application data is exchanged. Useful for services that do not speak HTTP, such as databases.
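The check itself is just a connection attempt. A Python sketch with the same semantics (the function name and timeout default are ours):

```python
import socket

def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    # Mirrors a tcpSocket probe: healthy if a TCP connection opens.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```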

Command Probe

livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - pg_isready -U postgres
  initialDelaySeconds: 30
  periodSeconds: 10

Runs a command inside the container. Exit code 0 means healthy.
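The exec probe's semantics can be reproduced with a subprocess call (a rough equivalent for illustration, not the kubelet's actual code path):

```python
import subprocess

def exec_probe(command: list[str]) -> bool:
    # Mirrors an exec probe: exit status 0 means healthy.
    result = subprocess.run(command, capture_output=True)
    return result.returncode == 0
```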

gRPC Probe

readinessProbe:
  grpc:
    port: 50051
    service: ""          # empty string checks the server's overall health
  initialDelaySeconds: 5
  periodSeconds: 10

Uses the gRPC Health Checking Protocol, which the server must implement. The feature was alpha in Kubernetes 1.23 and became stable (GA) in 1.27.

Startup Probes for Slow-Starting Apps

Some applications take minutes to start (Java applications, ML model loading, cache warming). Without a startup probe, the liveness probe might kill the container before it finishes initializing:

# Problem: the app takes 2 minutes to start, but this liveness probe gives
# up after about 60 seconds (30s initial delay + 3 failures x 10s period)
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30    # Not enough!
  periodSeconds: 10
  failureThreshold: 3

The fix is a startup probe with a generous timeout:

startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30       # 30 * 10 = 300 seconds to start
  periodSeconds: 10

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10           # No initialDelaySeconds needed
  failureThreshold: 3

Neither the liveness probe nor the readiness probe runs until the startup probe has succeeded.
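The arithmetic behind these windows is worth making explicit (a simplified model -- real timing also depends on timeoutSeconds and probe scheduling jitter):

```python
def max_startup_window(failure_threshold: int, period_seconds: int,
                       initial_delay: int = 0) -> int:
    # Roughly how long a container has before the probe gives up on it.
    return initial_delay + failure_threshold * period_seconds

# The startupProbe above: 30 failures x 10s period = 300 seconds to start.
startup_window = max_startup_window(30, 10)

# The "not enough" liveness config: the container is killed after
# roughly 60 seconds (30s delay + 3 x 10s), not at the 30-second mark.
broken_liveness_window = max_startup_window(3, 10, initial_delay=30)
```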

Tuning Probe Parameters

| Parameter | Description | Default |
|---|---|---|
| initialDelaySeconds | Seconds to wait before running the first probe | 0 |
| periodSeconds | How often to run the probe | 10 |
| timeoutSeconds | Seconds before a probe attempt times out | 1 |
| successThreshold | Consecutive successes to be considered healthy | 1 |
| failureThreshold | Consecutive failures to be considered unhealthy | 3 |

Note that successThreshold must be 1 for liveness and startup probes.
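These parameters trade detection speed against false positives. A simplified upper bound on how long an unhealthy Pod keeps receiving traffic before the readiness probe pulls it from the endpoints (the helper is ours, and ignores scheduling jitter):

```python
def worst_case_removal_seconds(period_seconds: int, failure_threshold: int,
                               timeout_seconds: int = 1) -> int:
    # Upper bound: the last failing attempt may itself take up to
    # timeout_seconds before the Pod is removed from Service endpoints.
    return failure_threshold * period_seconds + timeout_seconds

# The readiness config in this article: 3 failures x 5s + 1s timeout.
removal_bound = worst_case_removal_seconds(5, 3)
```

Halving periodSeconds roughly halves this window, at the cost of more probe traffic and more sensitivity to transient blips.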

Common Mistakes

  1. Using the same endpoint for liveness and readiness. A readiness endpoint might return unhealthy during high load (to shed traffic), but you do not want the liveness probe to restart the container for that.

  2. Setting initialDelaySeconds too low. The container starts receiving health checks before it is ready. Use a startup probe instead.

  3. Setting timeoutSeconds too low. If your health endpoint queries a database, it might time out under load, causing false failures.

  4. Not using readiness probes at all. This is the single biggest cause of deployment-related outages.

Best Practices

# Recommended probe configuration for web applications
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready       # Separate endpoint from liveness
    port: 8080
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 5    # More tolerant than readiness

Summary

Health checks are the mechanism Kubernetes uses to determine Pod health during rolling updates. Readiness probes are the most important -- they gate traffic to new Pods and prevent old Pods from being terminated prematurely. Startup probes handle slow-starting applications. Liveness probes recover from deadlocks and hangs. Configuring all three probes correctly is essential for safe, zero-downtime deployments.

Why Interviewers Ask This

Health checks are the bridge between application health and Kubernetes orchestration. Interviewers want to see that you can configure probes correctly to prevent outages during deployments.

Common Follow-Up Questions

What happens during a rolling update if readiness probes are not configured?
Kubernetes considers Pods ready as soon as their containers start. This means traffic is sent to Pods before the application is actually ready, causing errors. Old Pods are terminated prematurely.
What is the difference between liveness and readiness probes?
A liveness probe failure restarts the container. A readiness probe failure removes the Pod from Service endpoints. Liveness checks if the app is alive; readiness checks if it can serve traffic.
When should you use a startup probe?
For applications with long initialization times (loading ML models, warming caches). The startup probe disables liveness and readiness checks until the app is initialized.

Key Takeaways

  • Readiness probes are essential for safe rolling updates.
  • Without readiness probes, Kubernetes sends traffic to unready Pods.
  • Startup probes protect slow-starting applications from premature liveness failures.

Related Questions