How Does Session Affinity Work in Kubernetes Services?

intermediate | services, devops, sre, CKA, CKAD
TL;DR

Session affinity (also called sticky sessions) ensures that all requests from the same client IP are routed to the same backend Pod. Kubernetes supports ClientIP-based session affinity at the Service level, configurable via spec.sessionAffinity and spec.sessionAffinityConfig.

What Is Session Affinity?

Session affinity (commonly called sticky sessions) is a mechanism that ensures all requests from the same client are consistently routed to the same backend Pod. By default, Kubernetes Services distribute traffic randomly across all healthy Pods with no regard for client identity. Session affinity changes this behavior.

Kubernetes natively supports one form of session affinity: ClientIP. All requests originating from the same source IP address are pinned to the same backend Pod for a configurable duration.
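Conceptually, this is a lookup table keyed by client IP with an idle timeout. A toy Python sketch of that behavior (illustrative only; names like `ClientIPAffinity` are made up and this is not how kube-proxy is implemented):

```python
import random
import time

class ClientIPAffinity:
    """Toy model of ClientIP session affinity with an idle timeout."""

    def __init__(self, backends, timeout_seconds=1800):
        self.backends = backends
        self.timeout = timeout_seconds
        self.table = {}  # client_ip -> (backend, last_seen)

    def route(self, client_ip, now=None):
        now = time.time() if now is None else now
        entry = self.table.get(client_ip)
        if entry and now - entry[1] < self.timeout:
            backend = entry[0]                       # still pinned: reuse it
        else:
            backend = random.choice(self.backends)   # (re)pin at random
        self.table[client_ip] = (backend, now)       # refresh last-seen time
        return backend

svc = ClientIPAffinity(["pod-a", "pod-b", "pod-c"])
first = svc.route("203.0.113.7", now=0)
# Within the timeout window, the same client IP keeps hitting the same Pod
assert all(svc.route("203.0.113.7", now=t) == first for t in range(1, 100))
```

Note that the timestamp refreshes on every request, which matches the real semantics: the timeout counts idle time since the last request, not time since the first one.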

Enabling Session Affinity

apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 1800    # 30 minutes
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 8080

The two key fields:

| Field | Default | Description |
|---|---|---|
| sessionAffinity | None | Set to ClientIP to enable sticky sessions |
| sessionAffinityConfig.clientIP.timeoutSeconds | 10800 (3 hours) | How long the affinity persists after the last request |

How It Works Under the Hood

iptables Mode

In iptables mode, kube-proxy implements stickiness with the iptables `recent` match module, which records and checks source IPs per endpoint:

# Conceptual iptables rules (simplified)
-A KUBE-SVC-XXX -m recent --name KUBE-SEP-AAA --rcheck --seconds 1800 --reap -j KUBE-SEP-AAA
-A KUBE-SVC-XXX -m statistic --mode random --probability 0.333 -j KUBE-SEP-AAA
-A KUBE-SVC-XXX -m statistic --mode random --probability 0.500 -j KUBE-SEP-BBB
-A KUBE-SVC-XXX -j KUBE-SEP-CCC

# When a match is made, mark the source IP
-A KUBE-SEP-AAA -m recent --name KUBE-SEP-AAA --set -j DNAT --to-destination 10.244.1.5:8080

The first request goes through the random probability chain. Subsequent requests from the same IP hit the --rcheck rule and skip directly to the previously chosen Pod.
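The probability values in the chain (0.333, then 0.500, then an unconditional fallthrough) are chosen so each endpoint is equally likely on a first visit. A quick Python simulation of that chain logic (a sanity check, not real iptables):

```python
import random

def first_visit_endpoint():
    """Mimic the iptables chain: 1/3 chance of AAA, then 1/2 of the
    remainder for BBB, and CCC as the final fallthrough rule."""
    if random.random() < 1 / 3:
        return "KUBE-SEP-AAA"
    if random.random() < 1 / 2:
        return "KUBE-SEP-BBB"
    return "KUBE-SEP-CCC"

random.seed(0)
counts = {"KUBE-SEP-AAA": 0, "KUBE-SEP-BBB": 0, "KUBE-SEP-CCC": 0}
for _ in range(30000):
    counts[first_visit_endpoint()] += 1

# Each endpoint receives roughly a third of first-time traffic
assert all(abs(c / 30000 - 1 / 3) < 0.02 for c in counts.values())
```

This is why the rules must be evaluated in order: each probability is conditional on the earlier rules not matching.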

IPVS Mode

In IPVS mode, kube-proxy configures the persistence flag on the virtual server:

ipvsadm -Ln
TCP  10.96.50.100:80 rr persistent 1800
  -> 10.244.1.5:8080       Masq    1      0      0
  -> 10.244.2.8:8080       Masq    1      0      0
  -> 10.244.3.12:8080      Masq    1      0      0

The persistent 1800 flag tells IPVS to keep sending the same client IP to the same backend for 1800 seconds.

Verifying Session Affinity

# Check the service configuration
kubectl get svc web-app -o yaml | grep -A 5 sessionAffinity

# Test from within the cluster -- all requests should hit the same Pod
# (assumes the app exposes an endpoint, here /hostname, that echoes its Pod name)
kubectl run test-affinity --rm -it --image=busybox -- sh -c '
  for i in $(seq 1 10); do
    wget -qO- http://web-app/hostname
    echo
  done
'

With session affinity enabled, all 10 requests should return the same Pod hostname. Without it, responses would be distributed across Pods.

When to Use Session Affinity

Good Use Cases

  • Legacy applications that store session data in memory
  • WebSocket connections that must maintain server affinity
  • Applications performing multi-step transactions across multiple HTTP requests

# Application with in-memory session state
apiVersion: v1
kind: Service
metadata:
  name: legacy-web-app
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  selector:
    app: legacy-web-app
  ports:
    - port: 80
      targetPort: 8080

When to Avoid It

  • Stateless microservices -- If your application stores session data in Redis or a database, affinity is unnecessary and limits load distribution.
  • Behind a NAT -- Many users behind a corporate NAT share the same source IP. Session affinity would send all of them to one Pod, creating a hotspot.
  • When using an Ingress controller -- Ingress controllers offer more sophisticated affinity (cookie-based), making Service-level affinity redundant.

Session Affinity with Different Service Types

Session affinity works with all Service types that use kube-proxy:

# NodePort with session affinity
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: NodePort
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 1800
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080

However, with externalTrafficPolicy: Cluster on NodePort or LoadBalancer Services, the client's source IP is SNATed to the node's IP. This means all traffic through the same node appears to come from one IP, breaking meaningful session affinity. Use externalTrafficPolicy: Local to preserve the original client IP.
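The SNAT hotspot is easy to see in a toy model: once every external client is rewritten to the node's IP, affinity keyed on source IP collapses onto a single Pod (illustrative Python with made-up addresses, not kube-proxy code):

```python
import random

random.seed(1)
backends = ["pod-a", "pod-b", "pod-c"]
affinity = {}  # observed source IP -> pinned backend

def route(observed_ip):
    """Pin each observed source IP to a randomly chosen backend."""
    if observed_ip not in affinity:
        affinity[observed_ip] = random.choice(backends)
    return affinity[observed_ip]

clients = [f"198.51.100.{i}" for i in range(50)]

# externalTrafficPolicy: Cluster -- traffic is SNATed to the node IP,
# so all 50 clients look like 10.0.0.5 and land on one Pod
snat_targets = {route("10.0.0.5") for _ in clients}
assert len(snat_targets) == 1

# externalTrafficPolicy: Local -- original client IPs are preserved,
# so distinct clients still spread across multiple Pods
affinity.clear()
local_targets = {route(ip) for ip in clients}
assert len(local_targets) > 1
```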

Cookie-Based Affinity via Ingress

For HTTP workloads, cookie-based affinity is more reliable because it works even when client IPs change (mobile networks, NAT):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent"
    nginx.ingress.kubernetes.io/session-cookie-name: "SERVERID"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app
                port:
                  number: 80

This sets a SERVERID cookie that ties the browser to a specific backend Pod, regardless of client IP changes.
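Cookie affinity follows the same pattern as IP affinity, just keyed on a cookie the proxy issues. A minimal Python sketch (a toy model of the idea, not the NGINX Ingress implementation):

```python
import random

random.seed(2)
backends = ["pod-a", "pod-b", "pod-c"]

def handle(request_cookies):
    """Route by the SERVERID cookie; issue one on first contact."""
    backend = request_cookies.get("SERVERID")
    if backend not in backends:                  # no (valid) cookie yet
        backend = random.choice(backends)
    response_cookies = {"SERVERID": backend}     # re-issue the cookie
    return backend, response_cookies

# First request carries no cookie, so a backend is chosen and cookied
first_backend, cookies = handle({})

# Later requests present the cookie and stick -- even if the client IP changes
for _ in range(10):
    backend, cookies = handle(cookies)
    assert backend == first_backend
```

Because the key travels with the client rather than the connection, this survives NAT, mobile network handoffs, and anything else that changes the source IP.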

What Happens When the Pinned Pod Disappears?

If the Pod that a client is pinned to is terminated (scaling down, rolling update, crash):

  1. The Pod is removed from the EndpointSlice.
  2. kube-proxy removes the affinity entry.
  3. The client's next request goes through normal load balancing and is pinned to a new Pod.
  4. Any in-memory session state on the old Pod is lost.

This is why session affinity is not a substitute for proper session management. Applications should externalize session state whenever possible.
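The failover sequence above can be sketched in a few lines (a toy model; kube-proxy tracks this state in iptables/IPVS, not Python):

```python
import random

random.seed(3)
backends = ["pod-a", "pod-b", "pod-c"]
affinity = {}  # client IP -> pinned backend

def route(client_ip):
    backend = affinity.get(client_ip)
    if backend not in backends:            # pinned Pod gone (or never pinned)
        backend = random.choice(backends)  # fall back to normal balancing
        affinity[client_ip] = backend      # re-pin to the new Pod
    return backend

pinned = route("203.0.113.7")
backends.remove(pinned)                    # the pinned Pod is terminated

new_backend = route("203.0.113.7")
assert new_backend != pinned               # client is re-pinned to a live Pod
assert route("203.0.113.7") == new_backend # and sticks to it from now on
```

Nothing in this handoff carries session state across: whatever the old Pod held in memory is simply gone, which is the point the paragraph above makes.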

Summary

Kubernetes supports ClientIP-based session affinity at the Service level, ensuring requests from the same source IP go to the same Pod. It is implemented by kube-proxy in both iptables and IPVS modes. While useful for legacy applications with in-memory state, it has limitations with NAT environments and does not survive Pod restarts. For HTTP workloads, cookie-based affinity via an Ingress controller is more reliable and flexible.

Why Interviewers Ask This

Interviewers ask this to determine whether candidates understand how stateful interactions work at the networking level and when client-pinning is appropriate versus when applications should be designed for statelessness.

Common Follow-Up Questions

What is the default session affinity timeout and can it be changed?
The default timeout is 10800 seconds (3 hours). It can be changed using spec.sessionAffinityConfig.clientIP.timeoutSeconds, with a maximum of 86400 seconds (24 hours).
Does session affinity work with headless Services?
No. Headless Services return Pod IPs via DNS, and kube-proxy does not proxy traffic for them. Session affinity is a kube-proxy feature that only applies to Services with a ClusterIP.
Is cookie-based session affinity supported natively?
No. Kubernetes only supports ClientIP-based affinity natively. Cookie-based affinity requires an Ingress controller like NGINX Ingress, which can set and track session cookies at the HTTP layer.

Key Takeaways

  • Kubernetes supports only ClientIP-based session affinity at the Service level.
  • Session affinity is implemented by kube-proxy using iptables or IPVS rules.
  • For cookie-based or more advanced session affinity, use an Ingress controller.