What Load Balancing Algorithms Does Kubernetes Use?

Q: What Load Balancing Algorithms Does Kubernetes Use?

Kubernetes Services pick a backend at random for each new connection (iptables mode, the default) or by round-robin (the default scheduler in IPVS mode). For more sophisticated load balancing, such as least connections, weighted, or consistent hashing, you need IPVS mode, a service mesh, or an Ingress controller.

Detailed Answer

Kubernetes load balancing works at multiple layers, and the algorithm depends on which component is doing the balancing. Understanding these layers helps you choose the right approach for your workload.

Layer 4: kube-proxy Load Balancing

kube-proxy implements Service load balancing using either iptables or IPVS.

iptables Mode (Default)

In iptables mode, kube-proxy creates probability-based iptables rules:

# Simplified iptables chain for a Service with 3 endpoints
-A KUBE-SVC-XXX -m statistic --mode random --probability 0.333 -j KUBE-SEP-AAA
-A KUBE-SVC-XXX -m statistic --mode random --probability 0.500 -j KUBE-SEP-BBB
-A KUBE-SVC-XXX -j KUBE-SEP-CCC

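The first rule matches with probability 1/3. The second is only evaluated for the remaining 2/3 of connections and matches half of them (1/3 of the total). The third catches everything else (the final 1/3). This gives approximately equal distribution but is not round-robin: the choice is random per connection, so short-term counts can be uneven.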
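
The arithmetic behind the rule chain can be checked with a short Python sketch. This is not kube-proxy code: the function names `chain_probabilities` and `effective_share` are made up for illustration, and the sketch assumes each rule is evaluated only when every earlier rule missed.

```python
from fractions import Fraction


def chain_probabilities(n):
    """Per-rule match probabilities for a chain with n endpoints.

    Rule i (0-based) is reached only if all earlier rules missed, so it
    must match with probability 1 / (n - i). The last rule always matches.
    """
    return [Fraction(1, n - i) for i in range(n)]


def effective_share(rule_probs):
    """Overall share of new connections that each endpoint receives."""
    shares = []
    remaining = Fraction(1)  # probability that no earlier rule matched
    for p in rule_probs:
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


rules = chain_probabilities(3)
shares = effective_share(rules)
print([str(p) for p in rules])   # ['1/3', '1/2', '1']
print([str(s) for s in shares])  # ['1/3', '1/3', '1/3']
```

The same construction works for any endpoint count, which is why the per-rule probabilities grow toward the end of the chain even though every endpoint ends up with an equal share.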

Limitations:

No awareness of endpoint load; health is reflected only through readiness (Pods that are not Ready are dropped from the rules)
Cannot do least-connections or weighted balancing
Performance degrades with thousands of Services (linear iptables chain walking)

IPVS Mode

IPVS uses hash tables for O(1) lookup and supports multiple scheduling algorithms:

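# Enable IPVS mode by editing the kube-proxy ConfigMap.
# Nodes need the IPVS kernel modules loaded, and the kube-proxy Pods
# must be restarted before the new mode takes effect.
kubectl edit configmap kube-proxy -n kube-system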

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "lc"  # least connections

| Algorithm | Scheduler value | Behavior |
|-----------|-----------------|----------|
| Round Robin | rr | Rotate through endpoints sequentially |
| Least Connections | lc | Send to endpoint with fewest active connections |
| Destination Hashing | dh | Hash destination IP for consistent routing |
| Source Hashing | sh | Hash source IP for session persistence |
| Shortest Expected Delay | sed | Consider both connections and weight |
| Never Queue | nq | Send to idle server, then use SED |

Choosing Least Connections

Least connections is ideal for services with variable request processing times:

# kube-proxy ConfigMap
mode: "ipvs"
ipvs:
  scheduler: "lc"

A Pod processing a slow database query keeps its connection count high, so new requests go to less-busy Pods.
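
The effect can be illustrated with a small, deterministic simulation. This is a toy model rather than how IPVS is implemented: it assumes one new request per tick, a fixed duration per request, and hypothetical Pod names, and the helpers `pick_least_connections` and `simulate` exist only for this sketch.

```python
import itertools


def pick_least_connections(active):
    """Return the endpoint with the fewest active connections.

    Ties go to the endpoint that comes first in dict order.
    """
    return min(active, key=active.get)


def simulate(scheduler, durations, pods=("pod-a", "pod-b", "pod-c")):
    """Assign one request per tick; return peak open connections per Pod.

    durations[i] is the number of ticks request i stays open.
    scheduler is "lc" (least connections) or "rr" (round-robin).
    """
    open_conns = {p: [] for p in pods}  # remaining ticks per open connection
    peak = {p: 0 for p in pods}
    rr = itertools.cycle(pods)
    for duration in durations:
        # Age existing connections by one tick and drop the finished ones.
        for p in pods:
            open_conns[p] = [t - 1 for t in open_conns[p] if t > 1]
        if scheduler == "lc":
            active = {p: len(open_conns[p]) for p in pods}
            target = pick_least_connections(active)
        else:
            target = next(rr)
        open_conns[target].append(duration)
        peak[target] = max(peak[target], len(open_conns[target]))
    return peak


durations = [9, 1, 1] * 4  # every third request is slow
rr_peak = simulate("rr", durations)
lc_peak = simulate("lc", durations)
print("round-robin peak:", rr_peak)
print("least-conn peak: ", lc_peak)
```

With this traffic pattern, round-robin keeps handing the slow requests to the same Pod, while least connections steers new work away from whichever Pod is still busy, so its worst-case pile-up is smaller.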

Session Affinity

Both iptables and IPVS support session affinity at the Service level:

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  ports:
    - port: 80
      targetPort: 8080

All connections from the same client IP go to the same Pod until the timeout expires. Affinity takes precedence over the default algorithm, which then only chooses the Pod for a client's first connection.
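
Conceptually, ClientIP affinity is a lookup table consulted before the normal selection step. The sketch below is a simplified model, not kube-proxy's implementation: the class name `ClientIPAffinity` is hypothetical, the fallback is plain random choice, and it assumes the timer is refreshed on every new connection.

```python
import random


class ClientIPAffinity:
    """Toy model of sessionAffinity: ClientIP with a timeout."""

    def __init__(self, endpoints, timeout_seconds=3600, rng=None):
        self.endpoints = list(endpoints)
        self.timeout = timeout_seconds
        self.rng = rng or random.Random()
        self.table = {}  # client IP -> (endpoint, time last seen)

    def route(self, client_ip, now):
        """Return the endpoint for a new connection arriving at time `now`."""
        entry = self.table.get(client_ip)
        if entry is not None and now - entry[1] <= self.timeout:
            endpoint = entry[0]  # sticky: reuse the earlier choice
        else:
            # No entry, or it expired: fall back to the default selection.
            endpoint = self.rng.choice(self.endpoints)
        self.table[client_ip] = (endpoint, now)  # refresh the timer
        return endpoint


lb = ClientIPAffinity(["pod-a", "pod-b", "pod-c"], timeout_seconds=3600)
first = lb.route("203.0.113.7", now=0)
repeat = lb.route("203.0.113.7", now=1800)
print(first == repeat)  # True: second connection arrives within the timeout
```

One consequence worth noting: clients that share a source IP (for example, behind the same NAT gateway) all land on the same Pod, which can skew the distribution.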

Layer 7: Ingress Controller Load Balancing

Ingress controllers provide HTTP-aware load balancing with more algorithms:

Nginx Ingress Controller

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  annotations:
    nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
    # Or: nginx.ingress.kubernetes.io/load-balance: "ewma"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80

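NGINX itself supports round-robin (default), least connections (least_conn), IP hash (ip_hash), and consistent hashing. The community ingress-nginx controller shown above exposes a narrower set: round_robin (default) and ewma (exponentially weighted moving average) through the load-balance setting, plus consistent hashing through upstream-hash-by. Other NGINX-based controllers may expose least_conn and ip_hash as well, so check the documentation for the controller you run.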
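
To show the idea behind EWMA balancing, here is a deliberately simplified Python sketch. It is not the controller's implementation: the class `EwmaBalancer`, the fixed smoothing factor `alpha`, the endpoint names, and the latencies are all assumptions, and it always picks the lowest score instead of sampling candidates.

```python
class EwmaBalancer:
    """Pick the endpoint with the lowest smoothed latency."""

    def __init__(self, endpoints, alpha=0.3):
        self.alpha = alpha  # weight given to the newest observation
        self.score = {e: 0.0 for e in endpoints}

    def pick(self):
        # Lowest EWMA wins; ties go to the first endpoint in dict order.
        return min(self.score, key=self.score.get)

    def observe(self, endpoint, latency_ms):
        # Blend the new sample into the running average.
        old = self.score[endpoint]
        self.score[endpoint] = self.alpha * latency_ms + (1 - self.alpha) * old


# Hypothetical backends: web-1 answers in 10 ms, web-2 in 200 ms.
latency = {"web-1": 10, "web-2": 200}
lb = EwmaBalancer(latency)
counts = {e: 0 for e in latency}
for _ in range(20):
    target = lb.pick()
    counts[target] += 1
    lb.observe(target, latency[target])
print(counts)
```

After each endpoint has been tried once, the slow backend's score stays high and traffic shifts to the fast one. In this sketch the slow backend is never retried, because nothing lowers its score; a production balancer would typically also decay scores over time so a recovered backend gets traffic again.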

Layer 7: Service Mesh Load Balancing

Service meshes like Istio provide the most sophisticated options:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-lb
spec:
  host: api-server
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST
      # Options: ROUND_ROBIN, LEAST_REQUEST, RANDOM, PASSTHROUGH
      # Or consistent hashing:
      # consistentHash:
      #   httpHeaderName: "x-user-id"

Istio's Envoy proxies support:

Round-robin
Least requests (similar to least connections but counts pending requests)
Random
Consistent hash (by header, cookie, source IP, or query parameter)
Ring hash for cache-friendly distribution
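
The cache-friendly property of ring hashing can be demonstrated with a minimal hash ring. This is a generic sketch, not Envoy's implementation: the class `HashRing`, the choice of SHA-256, the 100 virtual nodes per endpoint, and the `user-N` keys are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, endpoints, vnodes=100):
        self.vnodes = vnodes  # virtual nodes per endpoint smooth the spread
        self.ring = []        # sorted list of (position, endpoint)
        for endpoint in endpoints:
            self.add(endpoint)

    def add(self, endpoint):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{endpoint}#{i}"), endpoint))

    def remove(self, endpoint):
        self.ring = [(pos, e) for pos, e in self.ring if e != endpoint]

    def lookup(self, key):
        """Return the endpoint owning the first position at or after the key."""
        idx = bisect.bisect_right(self.ring, (_hash(key), ""))
        return self.ring[idx % len(self.ring)][1]  # wrap around the ring


ring = HashRing(["pod-a", "pod-b", "pod-c"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("pod-c")
after = {k: ring.lookup(k) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved after removing pod-c")
```

Only keys that were mapped to the removed Pod change owner; every other key keeps its Pod, so per-Pod caches stay warm when the endpoint set changes.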

Comparison Table

| Layer | Component | Algorithms | Use Case |
|-------|-----------|------------|----------|
| L4 | iptables | Random | Simple, low-overhead |
| L4 | IPVS | RR, LC, SH, DH, SED, NQ | Performance-sensitive L4 |
| L7 | Nginx Ingress | RR, hash, EWMA (LC and IP hash depend on the controller) | HTTP traffic |
| L7 | Istio/Envoy | RR, least request, random, hash | Microservices with observability |

Choosing the Right Algorithm

| Workload Pattern | Recommended Algorithm |
|------------------|-----------------------|
| Uniform request latency | Round-robin |
| Variable processing time | Least connections |
| Stateful sessions | Source hash or session affinity |
| Cache optimization | Consistent hash by URL |
| Mixed traffic | EWMA (adapts to actual latency) |

Monitoring Load Distribution

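# Check connection counts per endpoint (IPVS); run this on a cluster node
ipvsadm -Ln --stats

# Compare per-Pod CPU and memory as a rough proxy for load
# (requires metrics-server; this does not show request rates)
kubectl top pods -l app=web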

# In Istio — check Envoy stats
kubectl exec <pod> -c istio-proxy -- pilot-agent request GET stats | grep upstream_rq
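
Once per-Pod connection or request counts have been collected from any of these commands, a single number makes skew easier to spot. The helper below is a hypothetical convenience, not part of any Kubernetes tool, and the counts are typed in by hand for illustration.

```python
from statistics import mean


def imbalance_ratio(counts):
    """Return max / mean of per-Pod counts; 1.0 means a perfectly even spread."""
    values = list(counts.values())
    avg = mean(values)
    if avg == 0:
        return 1.0  # no traffic at all: treat as balanced
    return max(values) / avg


# Example counts copied by hand from monitoring output (made-up numbers).
print(imbalance_ratio({"web-1": 120, "web-2": 115, "web-3": 125}))
print(imbalance_ratio({"web-1": 300, "web-2": 30, "web-3": 30}))
```

A ratio that stays well above 1.0 over time suggests the current algorithm does not suit the workload, for example long-lived connections behind random or round-robin selection.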