What Are EndpointSlices and Why Did They Replace Endpoints?

advanced | services, devops, sre, CKA, CKAD
TL;DR

EndpointSlices are a scalable replacement for the legacy Endpoints resource. They split endpoint information into smaller, bounded chunks (default 100 endpoints per slice), reducing the API server and network load when Services back large numbers of Pods.

What Are EndpointSlices?

EndpointSlices are Kubernetes resources that track the IP addresses, ports, and readiness states of Pods backing a Service. They were introduced to replace the legacy Endpoints resource, which had significant scalability problems in large clusters.

An EndpointSlice contains a bounded number of endpoints (default maximum of 100 per slice). A Service with 500 matching Pods would have 5 EndpointSlice objects instead of one massive Endpoints object.

The Problem with Legacy Endpoints

With the legacy Endpoints resource, a single object contained every backend Pod for a Service:

# Legacy Endpoints object for a Service with 3 Pods
apiVersion: v1
kind: Endpoints
metadata:
  name: my-service
subsets:
  - addresses:
      - ip: 10.244.1.5
        targetRef:
          kind: Pod
          name: my-app-abc12
      - ip: 10.244.2.8
        targetRef:
          kind: Pod
          name: my-app-def34
      - ip: 10.244.3.12
        targetRef:
          kind: Pod
          name: my-app-ghi56
    ports:
      - port: 8080
        protocol: TCP

For a Service backing 5,000 Pods, this single object becomes enormous. The critical problem: when any single Pod is added or removed, the entire Endpoints object must be rewritten and transmitted to every node via kube-proxy's watch. In a cluster with 500 nodes, this means broadcasting megabytes of data for a single Pod change.

Legacy Endpoints scaling problem:

1 Pod changes  -->  Entire Endpoints object updated
                -->  Sent to all N nodes via watch
                -->  Each node re-programs all rules

With 5,000 endpoints and 500 nodes:
  Update size: ~500KB per update
  Total bandwidth: ~250MB per single Pod change

How EndpointSlices Solve This

EndpointSlices split the endpoint data into multiple smaller objects:

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-abc12
  labels:
    kubernetes.io/service-name: my-service
  ownerReferences:
    - apiVersion: v1
      kind: Service
      name: my-service
addressType: IPv4
endpoints:
  - addresses:
      - "10.244.1.5"
    conditions:
      ready: true
      serving: true
      terminating: false
    targetRef:
      kind: Pod
      name: my-app-abc12
      namespace: default
    nodeName: worker-1
    zone: us-east-1a
  - addresses:
      - "10.244.2.8"
    conditions:
      ready: true
      serving: true
      terminating: false
    targetRef:
      kind: Pod
      name: my-app-def34
      namespace: default
    nodeName: worker-2
    zone: us-east-1b
ports:
  - name: http
    port: 8080
    protocol: TCP

When one Pod changes, only the slice containing that Pod is updated:

EndpointSlice scaling improvement:

1 Pod changes  -->  Only 1 slice updated (max 100 endpoints)
               -->  Sent to all N nodes via watch
               -->  Each node re-programs only affected rules

With 5,000 endpoints and 500 nodes:
  Slices: 50 (100 endpoints each)
  Update size: ~10KB per update (1 slice)
  Total bandwidth: ~5MB per single Pod change (50x improvement)
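The back-of-envelope numbers above can be reproduced with a short sketch. The ~100 bytes per serialized endpoint is an illustrative assumption, not a measured figure; real object sizes vary with field contents:

```python
def update_cost(total_endpoints, nodes, max_per_slice=None, bytes_per_endpoint=100):
    """Rough bytes broadcast to all watching nodes when one Pod changes.

    bytes_per_endpoint (~100 B) is an assumed average, for illustration only.
    """
    if max_per_slice is None:
        # Legacy Endpoints: the single monolithic object is rewritten
        update_size = total_endpoints * bytes_per_endpoint
    else:
        # EndpointSlice: only the one changed slice is rewritten
        update_size = min(total_endpoints, max_per_slice) * bytes_per_endpoint
    return update_size * nodes

legacy = update_cost(5000, 500)                     # ~250 MB per Pod change
sliced = update_cost(5000, 500, max_per_slice=100)  # ~5 MB per Pod change
print(legacy // sliced)  # 50 — the 50x improvement quoted above
```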

EndpointSlice Structure

Each EndpointSlice contains richer information than the legacy Endpoints:

| Field | Description |
|---|---|
| addressType | IPv4, IPv6, or FQDN |
| endpoints[].addresses | List of IP addresses |
| endpoints[].conditions.ready | Pod has passed readiness checks |
| endpoints[].conditions.serving | Pod is serving (even during termination) |
| endpoints[].conditions.terminating | Pod is in the process of terminating |
| endpoints[].nodeName | Node where the Pod is running |
| endpoints[].zone | Availability zone of the Pod |
| ports | Port definitions shared by all endpoints in the slice |

The serving and terminating conditions enable graceful connection draining that was not possible with legacy Endpoints.
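A simplified sketch of how a consumer might use these conditions: prefer ready endpoints, and if none are ready, fall back to endpoints that are still serving while terminating so in-flight traffic can drain. This mirrors kube-proxy's behavior in spirit only; it is not kube-proxy's actual code:

```python
def select_backends(endpoints):
    """Pick usable backends from EndpointSlice-style endpoint entries.

    Illustrative sketch: ready endpoints win; otherwise fall back to
    serving-while-terminating endpoints for graceful draining.
    """
    ready = [e for e in endpoints if e["conditions"].get("ready")]
    if ready:
        return ready
    return [e for e in endpoints
            if e["conditions"].get("serving") and e["conditions"].get("terminating")]

eps = [
    {"addresses": ["10.244.1.5"],
     "conditions": {"ready": False, "serving": True, "terminating": True}},
    {"addresses": ["10.244.2.8"],
     "conditions": {"ready": True, "serving": True, "terminating": False}},
]
print([e["addresses"][0] for e in select_backends(eps)])  # ['10.244.2.8']
```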

Viewing EndpointSlices

# List all EndpointSlices for a Service
kubectl get endpointslices -l kubernetes.io/service-name=my-service

# Detailed view
kubectl describe endpointslice my-service-abc12

# Full YAML output to see all fields
kubectl get endpointslice my-service-abc12 -o yaml

Example output:

NAME               ADDRESSTYPE   PORTS   ENDPOINTS                     AGE
my-service-abc12   IPv4          8080    10.244.1.5,10.244.2.8,...     5m
my-service-def34   IPv4          8080    10.244.4.20,10.244.5.3,...    5m

Topology-Aware Routing

EndpointSlices include topology information (nodeName, zone) that enables topology-aware routing. When enabled, the EndpointSlice controller writes per-zone hints into the slices, and kube-proxy prefers endpoints hinted for the client Pod's zone, reducing cross-zone traffic and latency:

apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080

When topology-mode: Auto is set:

Client Pod (zone: us-east-1a)
   │
   │  kube-proxy checks EndpointSlice zones
   │  Prefers endpoints in us-east-1a
   │
   ▼
Pod in us-east-1a (preferred)
   instead of
Pod in us-east-1b (cross-zone)

This can significantly reduce cloud data transfer costs and latency.
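The preference logic can be sketched as a simple same-zone filter with a fallback. This is a conceptual illustration only; the real mechanism relies on per-zone hints written by the EndpointSlice controller, which also guards against overloading small zones:

```python
def pick_endpoints(endpoints, client_zone):
    """Prefer endpoints in the client's zone; fall back to all endpoints.

    Conceptual sketch of topology-aware preference, not kube-proxy code.
    """
    local = [e for e in endpoints if e.get("zone") == client_zone]
    return local or endpoints

eps = [
    {"addresses": ["10.244.1.5"], "zone": "us-east-1a"},
    {"addresses": ["10.244.2.8"], "zone": "us-east-1b"},
]
print(pick_endpoints(eps, "us-east-1a"))  # only the us-east-1a endpoint
```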

Dual-Stack Support

EndpointSlices natively support dual-stack networking. A Service can have separate EndpointSlice objects for IPv4 and IPv6:

my-service-ipv4-abc12   addressType: IPv4   10.244.1.5, 10.244.2.8
my-service-ipv6-def34   addressType: IPv6   fd00::1:5, fd00::2:8

This was not possible with the legacy Endpoints resource, which had no concept of address type.
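To get both slice families, the Service itself must request dual-stack. A minimal sketch, assuming the cluster is configured for dual-stack networking:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ipFamilyPolicy: PreferDualStack   # request both families when available
  ipFamilies:
    - IPv4
    - IPv6
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```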

Manual EndpointSlices (Services Without Selectors)

For Services without a selector, you can create EndpointSlices manually:

apiVersion: v1
kind: Service
metadata:
  name: external-service
spec:
  ports:
    - port: 443
      targetPort: 443
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: external-service-manual
  labels:
    kubernetes.io/service-name: external-service
addressType: IPv4
endpoints:
  - addresses:
      - "203.0.113.50"
    conditions:
      ready: true
  - addresses:
      - "203.0.113.51"
    conditions:
      ready: true
ports:
  - port: 443
    protocol: TCP

This is the modern way to point a Kubernetes Service at external IP addresses.

EndpointSlice Limits and Configuration

The maximum number of endpoints per slice is configurable on the EndpointSlice controller:

# Default is 100 endpoints per slice
kube-controller-manager --max-endpoints-per-slice=100

Reducing this number creates more slices but further reduces the blast radius of each update. Increasing it reduces the number of API objects but increases per-update size.
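The trade-off can be quantified with the same illustrative assumption of ~100 bytes per serialized endpoint used earlier:

```python
import math

def slice_profile(total_endpoints, max_per_slice, bytes_per_endpoint=100):
    """Return (number of slices, size of a single-slice update in bytes).

    bytes_per_endpoint (~100 B) is an assumed average, for illustration only.
    """
    slices = math.ceil(total_endpoints / max_per_slice)
    update_bytes = min(total_endpoints, max_per_slice) * bytes_per_endpoint
    return slices, update_bytes

for m in (50, 100, 200):
    print(m, slice_profile(5000, m))
# max 50  -> 100 slices, ~5 KB per update
# max 100 -> 50 slices,  ~10 KB per update
# max 200 -> 25 slices,  ~20 KB per update
```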

Migrating from Endpoints to EndpointSlices

EndpointSlices reached general availability (discovery.k8s.io/v1) in Kubernetes 1.21, and kube-proxy has consumed them by default since v1.19 on Linux (v1.22 on Windows). The legacy Endpoints resource still exists for backward compatibility, but the control plane primarily manages EndpointSlices.

No explicit migration is required. For Services with selectors, both resources are maintained in parallel, but kube-proxy reads EndpointSlices; the legacy Endpoints object is even truncated at 1,000 endpoints.

Summary

EndpointSlices solved a real scalability bottleneck in Kubernetes by breaking monolithic Endpoints objects into bounded, independently updatable chunks. They also added support for dual-stack networking, topology-aware routing, and graceful termination tracking. In modern Kubernetes, EndpointSlices are the primary mechanism through which kube-proxy learns about Service backends, and understanding them is important for operating and debugging large-scale clusters.

Why Interviewers Ask This

Interviewers ask this to evaluate knowledge of Kubernetes internals and scalability. Understanding EndpointSlices shows a candidate is familiar with how large clusters handle service routing efficiently.

Common Follow-Up Questions

What was the scalability problem with the legacy Endpoints resource?
A single Endpoints object contained every Pod IP for a Service. With thousands of Pods, this object became very large. Every update (e.g., one Pod restarting) required transmitting the entire object to every node, consuming significant API server bandwidth and etcd storage.
How does the EndpointSlice controller decide how to distribute endpoints across slices?
The controller fills slices up to the maximum size (default 100) and distributes endpoints with topology hints when enabled. It tries to minimize churn by only updating slices that changed.
Do you need to create EndpointSlices manually?
No. The EndpointSlice controller automatically creates and manages them for any Service with a selector. You only manage them manually for Services without a selector.

Key Takeaways

  • EndpointSlices break endpoint data into bounded chunks, improving scalability for large Services.
  • They reduce API server bandwidth by only transmitting the slice that changed, not all endpoints.
  • EndpointSlices support dual-stack (IPv4 and IPv6) and topology-aware routing natively.