What does kube-proxy do and how does it implement Service networking?

beginner | architecture, devops, sre, cloud architect, CKA
TL;DR

kube-proxy is a network component that runs on every node and implements the Kubernetes Service abstraction by maintaining network rules that route traffic to the correct backend pods. It supports iptables and IPVS modes for packet forwarding and load balancing.

Detailed Answer

kube-proxy is a network component that runs on every node in the cluster, typically deployed as a DaemonSet. Its primary job is to implement the Service abstraction by programming the node's networking stack to correctly route traffic destined for a Service's ClusterIP to one of the healthy backend pods.

How Services Work

When you create a Kubernetes Service, it gets assigned a virtual IP (ClusterIP) from the service CIDR range. This IP does not correspond to any network interface; it exists only as a destination in routing rules. kube-proxy watches the API server for Service and EndpointSlice objects and configures the data plane accordingly.

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP

When a pod sends traffic to web-service:80 (resolved to the ClusterIP, say 10.96.45.12), kube-proxy's rules intercept the packet and redirect it to one of the backend pod IPs on port 8080.
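You can inspect both sides of this mapping directly. Assuming the web-service Service above exists with pods labeled app: web (the IPs your cluster assigns will differ from the examples here):

```shell
# The virtual IP assigned from the service CIDR
kubectl get svc web-service -o jsonpath='{.spec.clusterIP}'

# The backend pod IPs kube-proxy load-balances across, taken from the
# EndpointSlice the endpoint controller maintains for the Service
kubectl get endpointslices -l kubernetes.io/service-name=web-service \
  -o jsonpath='{.items[*].endpoints[*].addresses[*]}'
```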

Proxy Modes

iptables mode (default) -- kube-proxy creates iptables rules for each Service and each backend endpoint. Traffic matching a Service ClusterIP is DNAT'd to a randomly selected backend pod:

# View iptables rules created by kube-proxy for a specific service
iptables -t nat -L KUBE-SERVICES -n | grep web-service

# Detailed view of the chain for the service
iptables -t nat -L KUBE-SVC-XXXXXX -n
# Shows probability-based rules for load balancing across backends:
# -A KUBE-SVC-XXXXXX -m statistic --mode random --probability 0.333 -j KUBE-SEP-AAA
# -A KUBE-SVC-XXXXXX -m statistic --mode random --probability 0.500 -j KUBE-SEP-BBB
# -A KUBE-SVC-XXXXXX -j KUBE-SEP-CCC
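The probabilities in those rules follow from how iptables statistic matching chains: rule i only sees traffic that earlier rules passed over, so kube-proxy sets each rule's probability to 1/(n-i+1), which gives every one of the n backends an equal 1/n share overall. A quick sketch of the arithmetic:

```shell
# Compute the per-rule probabilities kube-proxy would emit for n backends.
# Rule i matches 1/(n-i+1) of the traffic that reaches it; the final rule
# is an unconditional catch-all.
n=3
for i in $(seq 1 $((n - 1))); do
  awk -v n="$n" -v i="$i" \
    'BEGIN { printf "rule %d: --probability %.5f\n", i, 1 / (n - i + 1) }'
done
echo "rule $n: catch-all"
```

For n=3 this prints 0.33333 for the first rule and 0.50000 for the second, matching the chain above.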

IPVS mode -- Uses Linux IPVS (IP Virtual Server) kernel module, which is a transport-layer load balancer built into the kernel. It provides better performance for large clusters:

# Enable IPVS mode in kube-proxy configuration
kubectl edit configmap kube-proxy -n kube-system
# Change mode: "" to mode: "ipvs"

# View IPVS rules
ipvsadm -Ln
# TCP  10.96.45.12:80 rr
#   -> 10.244.1.5:8080    Masq    1      0          0
#   -> 10.244.2.8:8080    Masq    1      0          0
#   -> 10.244.3.3:8080    Masq    1      0          0

IPVS supports multiple load-balancing algorithms:

  • rr -- Round Robin (default)
  • lc -- Least Connections
  • dh -- Destination Hashing
  • sh -- Source Hashing
  • sed -- Shortest Expected Delay
  • nq -- Never Queue
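To make the schedulers concrete, this is roughly what kube-proxy programs through the kernel's IPVS interface, shown here with the standalone ipvsadm tool (IPs are illustrative, and you would not normally hand-edit rules on a node kube-proxy manages):

```shell
# Create a virtual service on the ClusterIP with least-connections scheduling
ipvsadm -A -t 10.96.45.12:80 -s lc

# Register backend pods as real servers using masquerading (NAT),
# mirroring what kube-proxy does for each EndpointSlice entry
ipvsadm -a -t 10.96.45.12:80 -r 10.244.1.5:8080 -m
ipvsadm -a -t 10.96.45.12:80 -r 10.244.2.8:8080 -m
```

Running this requires root and the ip_vs kernel module loaded.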

kube-proxy Configuration

kube-proxy is configured via a ConfigMap in the kube-system namespace:

# View the full kube-proxy configuration
kubectl get configmap kube-proxy -n kube-system -o yaml

# Check kube-proxy mode currently in use
kubectl logs -n kube-system -l k8s-app=kube-proxy | grep "Using"

A minimal KubeProxyConfiguration looks like this:

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"
  syncPeriod: "30s"
iptables:
  syncPeriod: "30s"
  minSyncPeriod: "1s"
clusterCIDR: "10.244.0.0/16"
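Edits to the ConfigMap only take effect once kube-proxy restarts and re-syncs its rules. One common way to roll the change out (exact log wording varies by Kubernetes version):

```shell
# Restart the kube-proxy DaemonSet so its pods pick up the new ConfigMap
kubectl rollout restart daemonset kube-proxy -n kube-system
kubectl rollout status daemonset kube-proxy -n kube-system

# Confirm the active proxy mode from the fresh logs
kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=50 | grep -i "proxy"
```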

NodePort and LoadBalancer Services

kube-proxy also handles NodePort and LoadBalancer Services:

# For a NodePort Service, kube-proxy creates rules to accept traffic
# on the allocated NodePort on every node
kubectl expose deployment web --type=NodePort --port=80 --target-port=8080

# View the allocated NodePort
kubectl get svc web -o jsonpath='{.spec.ports[0].nodePort}'
# e.g., 31234

# Traffic to ANY_NODE_IP:31234 is routed to backend pods
# regardless of which node the pods are actually running on
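You can verify the "any node" behavior from a machine with network access to the nodes. A sketch, assuming the NodePort Service named web created above:

```shell
# Pick any node's InternalIP
NODE_IP=$(kubectl get nodes \
  -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
NODE_PORT=$(kubectl get svc web -o jsonpath='{.spec.ports[0].nodePort}')

# Works against every node, even ones not running a web pod
curl "http://${NODE_IP}:${NODE_PORT}"
```

Note that by default (externalTrafficPolicy: Cluster) the receiving node may forward the traffic to a pod on another node.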

Troubleshooting kube-proxy

# Check kube-proxy pods are running
kubectl get pods -n kube-system -l k8s-app=kube-proxy

# View kube-proxy logs
kubectl logs -n kube-system daemonset/kube-proxy

# Verify endpoints exist for a Service
kubectl get endpointslices -l kubernetes.io/service-name=web-service

# Test Service DNS resolution from within the cluster
kubectl run debug --image=busybox --rm -it --restart=Never -- \
  nslookup web-service.default.svc.cluster.local

# Test connectivity to a Service
kubectl run debug --image=busybox --rm -it --restart=Never -- \
  wget -qO- http://web-service.default.svc.cluster.local
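If the Service checks above fail, it helps to hit a pod IP directly: if the pod responds but the ClusterIP does not, the problem lies in kube-proxy's rules (or the EndpointSlice), not the application. A sketch, assuming pods labeled app=web:

```shell
# Grab one backend pod's IP
POD_IP=$(kubectl get pod -l app=web -o jsonpath='{.items[0].status.podIP}')

# Hit the pod directly on its targetPort, bypassing the Service VIP
kubectl run debug --image=busybox --rm -it --restart=Never -- \
  wget -qO- "http://${POD_IP}:8080"
```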

eBPF-Based Alternatives

Modern CNI plugins like Cilium can replace kube-proxy entirely by implementing Service routing with eBPF programs attached directly to network interfaces. Benefits include lower latency, better observability, and elimination of iptables/IPVS rule sprawl. To run without kube-proxy, bootstrap the cluster with kubeadm init --skip-phases=addon/kube-proxy and let the CNI take over Service handling.
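A rough sketch of a kube-proxy-free bootstrap with kubeadm and the Cilium CLI (the Helm value name has changed across Cilium releases, so treat this as illustrative rather than exact):

```shell
# Bootstrap the control plane without the kube-proxy addon
kubeadm init --skip-phases=addon/kube-proxy

# Install Cilium with its eBPF-based kube-proxy replacement enabled
# (kubeProxyReplacement=true in recent Cilium; older releases used "strict")
cilium install --set kubeProxyReplacement=true

# Inspect the installation state
cilium status
```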

Why Interviewers Ask This

This question probes a candidate's understanding of Kubernetes networking fundamentals. Knowing how Service traffic reaches pods reveals awareness of the networking layer that is critical for debugging connectivity issues, optimizing performance, and understanding security boundaries.

Common Follow-Up Questions

What is the difference between iptables mode and IPVS mode?
iptables mode uses kernel iptables rules with O(n) lookup per connection. IPVS mode uses kernel-level load balancing with O(1) lookup using hash tables, making it better suited for clusters with thousands of Services.
Can you run a cluster without kube-proxy?
Yes. Some CNI plugins like Cilium can replace kube-proxy entirely by implementing Service routing using eBPF, which provides better performance and observability.
How does kube-proxy handle session affinity?
When sessionAffinity is set to ClientIP on a Service, kube-proxy configures rules to route all requests from the same client IP to the same backend pod for a configurable timeout period.
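For example, enabling ClientIP affinity on the Service from earlier:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800  # default; affinity expires after 3h of inactivity
```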

Key Takeaways

  • kube-proxy translates Service virtual IPs into actual pod IPs using iptables or IPVS rules
  • It does not proxy traffic itself in the modern iptables/IPVS modes (unlike the legacy userspace mode); it programs kernel-level rules for the data path
  • IPVS mode is preferred for large clusters due to its O(1) connection routing performance