How Does Leader Election Work in Kubernetes?
Leader election in Kubernetes uses Lease objects to ensure only one instance of a controller or application actively performs work at a time. Other instances remain on standby and take over if the leader fails.
Detailed Answer
Leader election is a coordination pattern where multiple replicas of a process agree that exactly one of them — the leader — actively does work. The others remain on standby and take over if the leader fails. Kubernetes uses this pattern internally and provides primitives for your own controllers to use it.
Why Leader Election Matters
Without leader election, running multiple replicas of a controller could result in:
- Duplicate work: Two instances both try to scale a Deployment
- Conflicting actions: One instance tries to create a resource while another deletes it
- Data corruption: Two instances simultaneously update the same object
How the Kubernetes Control Plane Uses It
In an HA cluster with 3 control plane nodes:
- kube-controller-manager: Only one instance runs reconciliation loops; the other two are on standby
- kube-scheduler: Only one instance makes scheduling decisions
You can see the current leader by inspecting the Lease object:
```bash
# Check who is the current controller-manager leader
kubectl get lease kube-controller-manager -n kube-system -o yaml

# Check who is the current scheduler leader
kubectl get lease kube-scheduler -n kube-system -o yaml
```
Lease Object Structure
```yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: my-controller
  namespace: my-system
spec:
  holderIdentity: "controller-pod-abc123"
  leaseDurationSeconds: 15
  acquireTime: "2026-03-19T10:00:00Z"
  renewTime: "2026-03-19T10:05:30Z"
  leaseTransitions: 3
```
Key fields:
- `holderIdentity`: The current leader's identifier
- `leaseDurationSeconds`: How long the lease is valid without renewal
- `renewTime`: When the leader last renewed the lease
- `leaseTransitions`: How many times leadership has changed
Implementing Leader Election in Go
The client-go library provides a built-in leader election package:
```go
package main

import (
	"context"
	"log"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// getKubernetesClient is a placeholder: build a *kubernetes.Clientset,
	// e.g. from in-cluster config.
	clientset := getKubernetesClient()
	hostname, _ := os.Hostname()

	lock := &resourcelock.LeaseLock{
		LeaseMeta: metav1.ObjectMeta{
			Name:      "my-controller",
			Namespace: "my-system",
		},
		Client: clientset.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{
			Identity: hostname,
		},
	}

	leaderelection.RunOrDie(context.TODO(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// Start doing work; ctx is cancelled when leadership is lost
				runController(ctx)
			},
			OnStoppedLeading: func() {
				// Clean up and exit so a standby can take over safely
				os.Exit(0)
			},
			OnNewLeader: func(identity string) {
				log.Printf("New leader: %s", identity)
			},
		},
	})
}
```
Configuration Parameters
| Parameter | Default | Purpose |
|-----------|---------|---------|
| leaseDuration | 15s | How long a lease is valid |
| renewDeadline | 10s | How long the leader tries to renew before giving up |
| retryPeriod | 2s | How often non-leaders check if they can acquire the lease |
The invariant `leaseDuration > renewDeadline > retryPeriod` must hold.
Failover Timing
When the leader crashes:
1. The leader stops renewing the lease
2. The lease expires after `leaseDuration` (15s by default)
3. A standby instance detects the expired lease on its next `retryPeriod` check (up to 2s later)
4. The standby acquires the lease and starts leading

Maximum failover time = `leaseDuration` + `retryPeriod` = 17 seconds (with defaults).
RBAC for Leader Election
The controller's ServiceAccount needs permissions to create and update Lease objects:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leader-election-role
  namespace: my-system
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leader-election-binding
  namespace: my-system
subjects:
  - kind: ServiceAccount
    name: my-controller
    namespace: my-system
roleRef:
  kind: Role
  name: leader-election-role
  apiGroup: rbac.authorization.k8s.io
```
Common Pitfalls
- Clock skew: If node clocks are significantly out of sync, lease expiration can behave unexpectedly. Use NTP.
- Too-short lease duration: In a high-latency environment, the leader may fail to renew in time, causing unnecessary failovers.
- Not exiting on leadership loss: If `OnStoppedLeading` does not exit or stop work, you get a split-brain scenario.
- Using ConfigMaps or Endpoints: These work but generate unnecessary watch events for all clients. Leases are purpose-built and lighter.
Monitoring Leader Election
```bash
# Watch lease transitions
kubectl get lease my-controller -n my-system -w

# Check for frequent transitions (indicates instability)
kubectl get lease my-controller -n my-system \
  -o jsonpath='{.spec.leaseTransitions}'
```
Why Interviewers Ask This
This question tests your understanding of high-availability patterns in distributed systems and how Kubernetes control plane components themselves avoid split-brain scenarios.
Key Takeaways
- Leader election prevents duplicate work by ensuring exactly one active instance in a replica group.
- Lease objects are the recommended resource for leader election — they are lightweight and auditable.
- Tune `leaseDuration`, `renewDeadline`, and `retryPeriod` based on your failover speed requirements.