What Is externalTrafficPolicy in Kubernetes and How Does It Affect Routing?

advanced | services, devops, sre, CKA, CKAD
TL;DR

externalTrafficPolicy controls how a NodePort or LoadBalancer Service routes external traffic. The Cluster policy (default) distributes traffic to Pods on any node but loses the client source IP. The Local policy only sends traffic to Pods on the receiving node, preserving the client IP but potentially causing uneven load distribution.

What Is externalTrafficPolicy?

externalTrafficPolicy is a field on NodePort and LoadBalancer Services that controls how external traffic is routed once it arrives at a cluster node. It has two possible values:

  • Cluster (default) -- Traffic can be forwarded to Pods on any node in the cluster.
  • Local -- Traffic is only forwarded to Pods running on the node that received the traffic.

This setting has significant implications for client IP preservation, load distribution, and network latency.

Cluster Policy (Default)

apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster    # default
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 8080

Traffic Flow

Client (203.0.113.10)
      │
      ▼
┌─────────────┐
│  Cloud LB   │
└──────┬──────┘
       │  Sends to any healthy node
       ▼
┌─────────────┐     SNAT: src becomes Node A IP
│  Node A     │ ──────────────────────────────────> │  Node B     │
│  (no local  │     Traffic forwarded to Pod on     │  (has Pod)  │
│   Pod)      │     a different node                │             │
└─────────────┘                                     └─────────────┘
                                                          │
                                                          ▼
                                                    ┌───────────┐
                                                    │ Pod       │
                                                    │ sees src: │
                                                    │ Node A IP │
                                                    └───────────┘

Key behavior:

  1. The load balancer sends traffic to any node (all nodes pass health checks).
  2. If the receiving node has no local Pod, kube-proxy forwards to a Pod on another node.
  3. To ensure the return path works, kube-proxy performs SNAT -- the source IP is rewritten to the forwarding node's IP.
  4. The Pod sees the node's IP as the client, not the original client IP.
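The forwarding and SNAT steps above can be sketched in a few lines of Python (a simplified model for illustration only; real kube-proxy programs iptables/IPVS rules, and the node and Pod names here are made up):

```python
import random

# Pods available per node for the Service (illustrative names).
endpoints = {"node-a": [], "node-b": ["pod-1"]}

def route_cluster(receiving_node: str, client_ip: str) -> dict:
    # Cluster policy: any Pod in the cluster is a valid backend.
    all_pods = [(n, p) for n, pods in endpoints.items() for p in pods]
    node, pod = random.choice(all_pods)
    if node != receiving_node:
        # Cross-node hop: SNAT so the reply returns via the same node.
        return {"pod": pod, "src_ip_seen": f"ip-of-{receiving_node}"}
    return {"pod": pod, "src_ip_seen": client_ip}

# Traffic arriving at Node A (no local Pod) is forwarded to the Pod
# on Node B, which then sees Node A's IP instead of the client's.
print(route_cluster("node-a", "203.0.113.10"))
```

Only when the receiving node happens to host a matching Pod does the Pod see the real client IP; with Cluster policy you cannot rely on that.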

Pros and Cons

| Pros | Cons |
|---|---|
| Even traffic distribution across all Pods | Client source IP is lost (SNAT) |
| All nodes are eligible to receive traffic | Extra network hop for cross-node routing |
| Resilient -- any node can handle any request | Slightly higher latency |

Local Policy

apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 8080

Traffic Flow

Client (203.0.113.10)
      │
      ▼
┌─────────────┐
│  Cloud LB   │
└──────┬──────┘
       │  Health checks filter to nodes with Pods
       │
       │  Node A: no local Pod → healthcheck returns 503 → SKIPPED
       │  Node B: has local Pod → healthcheck returns 200 → SELECTED
       │
       ▼
┌─────────────┐
│  Node B     │
│  (has Pod)  │
└──────┬──────┘
       │  No SNAT needed (no cross-node hop)
       ▼
┌─────────────┐
│ Pod         │
│ sees src:   │
│ 203.0.113.10│ ← Original client IP preserved!
└─────────────┘

The Health Check Node Port

When externalTrafficPolicy: Local is set, kube-proxy opens a special HTTP health check port on every node:

kubectl get svc web-app -o yaml | grep healthCheckNodePort
healthCheckNodePort: 31234

The cloud load balancer probes this port:

# Node with matching Pods
curl http://node-b:31234/healthz
# Returns 200 with body: {"localEndpoints": 2, "serviceProxyHealthy": true}

# Node without matching Pods
curl http://node-a:31234/healthz
# Returns 503 with body: {"localEndpoints": 0, "serviceProxyHealthy": true}

The LB only sends traffic to nodes that return 200.
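The decision rule behind that endpoint can be modeled in a short Python sketch (the real endpoint is served by kube-proxy; this only illustrates the 200/503 logic):

```python
import json

def healthz(local_endpoints: int) -> tuple[int, str]:
    # Local policy health check: a node is eligible for LB traffic
    # only if it runs at least one Pod matching the Service selector.
    status = 200 if local_endpoints > 0 else 503
    body = json.dumps({"localEndpoints": local_endpoints,
                       "serviceProxyHealthy": True})
    return status, body

print(healthz(2))  # node with local Pods    -> status 200
print(healthz(0))  # node without local Pods -> status 503
```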

Pros and Cons

| Pros | Cons |
|---|---|
| Client source IP is preserved | Uneven load if Pods are not spread evenly |
| No extra network hop | Nodes without Pods receive zero traffic |
| Lower latency | Requires LB health check support |

The Uneven Distribution Problem

Consider a cluster with 3 nodes and 2 Pods using externalTrafficPolicy: Local:

Node A: 1 Pod    ← receives 50% of traffic → 1 Pod handles 50%
Node B: 1 Pod    ← receives 50% of traffic → 1 Pod handles 50%
Node C: 0 Pods   ← receives 0% of traffic

This is even from the LB's perspective, and each Pod handles an equal share -- with only 2 Pods, the distribution is fine.

Now consider 3 nodes with 4 Pods unevenly distributed:

Node A: 3 Pods   ← receives 50% of traffic → each Pod gets ~17%
Node B: 1 Pod    ← receives 50% of traffic → 1 Pod gets 50%
Node C: 0 Pods   ← receives 0% of traffic

The single Pod on Node B gets 3x the load of Pods on Node A!
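The arithmetic can be checked with a short Python snippet (assuming the LB splits traffic evenly across healthy nodes, which is typical but ultimately LB-specific):

```python
def per_pod_share(pods_per_node: dict[str, int]) -> dict[str, float]:
    # Local policy: only nodes with Pods pass health checks. The LB
    # splits traffic evenly among those nodes; each node's share is
    # then divided among its local Pods.
    healthy = [n for n, count in pods_per_node.items() if count > 0]
    node_share = 1 / len(healthy)
    return {n: node_share / pods_per_node[n] for n in healthy}

shares = per_pod_share({"node-a": 3, "node-b": 1, "node-c": 0})
print(shares)  # node-a Pods get ~17% each; node-b's Pod gets 50%
```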

Mitigation Strategies

  1. Use Pod topology spread constraints to distribute Pods evenly across nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web-app
      containers:
        - name: web
          image: myapp:1.0
  2. Set replicas equal to the number of nodes so each node runs exactly one Pod (using a DaemonSet-like pattern).

  3. Use topology-aware routing (EndpointSlice hints) for zone-level balancing.
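For the one-Pod-per-node pattern, a DaemonSet guarantees exactly one replica on every eligible node, so each healthy node serves the same share of traffic (a minimal sketch; the name and image are illustrative):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: web-app
spec:
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: myapp:1.0
          ports:
            - containerPort: 8080
```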

Practical Comparison

# Same Service, two configurations
---
# Configuration 1: Cluster policy
apiVersion: v1
kind: Service
metadata:
  name: web-app-cluster
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 8080
---
# Configuration 2: Local policy
apiVersion: v1
kind: Service
metadata:
  name: web-app-local
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 8080

Testing source IP visibility from the Pod:

# With Cluster policy
kubectl logs web-app-pod | grep "client_ip"
# client_ip=10.244.0.1  (node IP, not the real client)

# With Local policy
kubectl logs web-app-pod | grep "client_ip"
# client_ip=203.0.113.10  (real client IP)

Decision Guide

| Requirement | Recommended Policy |
|---|---|
| Need client source IP for logging, rate limiting, geo-routing | Local |
| Need even distribution with minimal configuration | Cluster |
| Running a WAF or security tool that needs real client IP | Local |
| Small number of Pods relative to nodes | Cluster |
| Using topology spread constraints | Local |

Summary

externalTrafficPolicy is a critical setting for production Services that receive external traffic. The Cluster policy provides simplicity and even distribution at the cost of losing client IPs. The Local policy preserves client IPs and reduces latency but requires careful Pod placement to avoid uneven load. Choosing the right policy depends on your application's requirements for source IP visibility, load balance fairness, and network topology.

Why Interviewers Ask This

This is a nuanced networking topic that separates operators who have dealt with production traffic issues from those who have not. It touches on source IP preservation, load balancing fairness, and health check behavior -- all critical for real-world services.

Common Follow-Up Questions

Why does the Cluster policy lose the client source IP?
When kube-proxy forwards traffic to a Pod on a different node, it performs SNAT (source NAT) to ensure the return packet comes back through the same node. This rewrites the source IP to the node's IP.
What is the health check node port and how does it work?
When externalTrafficPolicy is Local, kube-proxy exposes a healthCheckNodePort that returns 200 if there are local Pods and 503 if not. The cloud load balancer uses this to avoid sending traffic to nodes without relevant Pods.
Can you use externalTrafficPolicy with ClusterIP Services?
No. externalTrafficPolicy only applies to NodePort and LoadBalancer Services because it governs how externally-originating traffic is handled.

Key Takeaways

  • Cluster policy provides even distribution but loses client IP; Local policy preserves client IP but may distribute unevenly.
  • Local policy uses health check node ports to prevent traffic to nodes without matching Pods.
  • The choice between Cluster and Local depends on whether source IP preservation or even load distribution is more important.