What Is KEDA and How Does Event-Driven Autoscaling Work?

advanced | autoscaling · devops · sre · backend developer · CKA · CKAD
TL;DR

KEDA (Kubernetes Event-Driven Autoscaling) extends the Horizontal Pod Autoscaler (HPA) to scale workloads based on event sources such as message queues, databases, and custom metrics. Unlike the standard HPA, it can scale to zero Pods when there is no work.

Detailed Answer

KEDA (Kubernetes Event-Driven Autoscaling) is a CNCF graduated project that allows Kubernetes to scale workloads based on the number of events in external systems — message queues, streams, databases, and more.

Why Standard HPA Is Not Enough

The HPA scales based on observed CPU, memory, or custom metrics. For event-driven workloads, this is inadequate:

Standard HPA:
  Queue has 10,000 messages → Pods at 10% CPU → HPA does NOT scale up

KEDA:
  Queue has 10,000 messages → KEDA scales to 50 Pods to drain the queue
  Queue empty → KEDA scales to 0 Pods

Architecture

External Source → KEDA Scaler → Metrics Adapter → HPA → Deployment
(Kafka, SQS,      (polls        (exposes as        (scales
 RabbitMQ)         metrics)      external metric)    replicas)

KEDA creates and manages an HPA behind the scenes, feeding it external metrics.

Installation

helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace keda --create-namespace

ScaledObject: The Core Resource

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor     # Deployment name
  pollingInterval: 15          # Check every 15 seconds
  cooldownPeriod: 60           # Wait before scaling to zero
  minReplicaCount: 0           # Scale to zero when idle
  maxReplicaCount: 100         # Maximum replicas
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        host: amqp://rabbitmq.production:5672
        queueLength: "10"     # Target: 10 messages per Pod

This scales the order-processor Deployment based on the RabbitMQ queue length. With 500 messages and a target of 10 per Pod, KEDA scales to 50 replicas.
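The math KEDA feeds to the HPA mirrors the standard HPA formula: desiredReplicas = ceil(currentMetricValue / targetValuePerPod). A quick sketch of the numbers above (illustrative only, using integer arithmetic for the ceiling):

```shell
# Sketch of the scaling math for the RabbitMQ example above
messages=500        # current queue depth
target_per_pod=10   # queueLength from the trigger
# ceil(messages / target_per_pod) via integer arithmetic
desired=$(( (messages + target_per_pod - 1) / target_per_pod ))
echo "desired replicas: $desired"
```

With 505 messages the same formula gives 51, since any remainder rounds up to a whole Pod.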

Common Scaler Examples

Kafka

triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.production:9092
      consumerGroup: my-group
      topic: events
      lagThreshold: "100"    # Scale when lag > 100 per Pod

AWS SQS

triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456/my-queue
      queueLength: "5"
      awsRegion: us-east-1
    authenticationRef:
      name: aws-credentials

Prometheus

triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      metricName: http_requests_total
      query: |
        sum(rate(http_requests_total{service="api"}[2m]))
      threshold: "100"       # Scale when RPS > 100 per Pod

Cron-Based

triggers:
  - type: cron
    metadata:
      timezone: America/New_York
      start: "0 8 * * 1-5"   # 8 AM weekdays
      end: "0 18 * * 1-5"    # 6 PM weekdays
      desiredReplicas: "10"

PostgreSQL

triggers:
  - type: postgresql
    metadata:
      connectionFromEnv: PG_CONNECTION
      query: "SELECT count(*) FROM jobs WHERE status = 'pending'"
      targetQueryValue: "5"   # 5 pending jobs per Pod
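connectionFromEnv resolves the connection string from the scale target's container environment, so the Deployment needs to expose that variable. A sketch (the Secret name and key here are assumptions):

```yaml
# In the target Deployment's Pod template (names illustrative)
env:
  - name: PG_CONNECTION
    valueFrom:
      secretKeyRef:
        name: pg-credentials
        key: connection-string   # e.g. postgresql://user:pass@host:5432/db
```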

Authentication

KEDA supports multiple authentication methods:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-credentials
  namespace: production
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID
      name: aws-secret
      key: AWS_ACCESS_KEY_ID
    - parameter: awsSecretAccessKey
      name: aws-secret
      key: AWS_SECRET_ACCESS_KEY
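Instead of static keys in a Secret, TriggerAuthentication can also delegate to the platform's identity mechanism, for example IAM roles for service accounts on EKS. A minimal sketch (provider names follow KEDA's pod-identity support; check your KEDA version, as older releases used aws-eks):

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-pod-identity
  namespace: production
spec:
  podIdentity:
    provider: aws    # authenticate via the operator's IAM role instead of access keys
```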

ScaledJob for Batch Processing

For one-shot jobs rather than long-running Deployments:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: email-sender
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: sender
            image: email-sender:1.0
            resources:
              requests:
                cpu: "100m"
                memory: "128Mi"
        restartPolicy: Never
  pollingInterval: 10
  maxReplicaCount: 50
  triggers:
    - type: rabbitmq
      metadata:
        queueName: emails
        host: amqp://rabbitmq:5672
        mode: QueueLength
        value: "1"     # One job per message

Scale-to-Zero Flow

1. Queue is empty for cooldownPeriod (60s)
2. KEDA sets Deployment replicas to 0
3. All Pods terminate
4. New message arrives in queue
5. KEDA detects message on next pollingInterval (15s)
6. KEDA sets Deployment replicas to 1 (activation)
7. HPA takes over for further scaling (1 → N)
8. Pod starts processing messages

Total activation latency: pollingInterval + Pod startup time (typically 15-60 seconds).
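As a back-of-envelope check (the startup figure is an assumption; yours depends on image size, image pull policy, and readiness probes):

```shell
# Worst-case cold-start latency for the ScaledObject above
polling_interval=15   # seconds, from pollingInterval
pod_startup=30        # seconds to pull the image and pass readiness (assumed)
worst_case=$(( polling_interval + pod_startup ))
echo "worst-case activation: ${worst_case}s"
```

If that number is too high for your SLO, keep minReplicaCount at 1 or shrink the image so Pods start faster.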

Multiple Triggers

You can combine triggers. The highest replica count wins:

triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      queueLength: "10"
  - type: cron
    metadata:
      start: "0 8 * * *"
      end: "0 10 * * *"
      desiredReplicas: "5"

During the cron window, at least 5 replicas run. If the queue also demands more, KEDA scales higher.
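The "highest wins" rule can be sketched as a simple max over per-trigger demands (numbers here are illustrative):

```shell
# Each trigger computes its own desired replicas; KEDA takes the maximum
queue_msgs=35
per_pod=10
queue_demand=$(( (queue_msgs + per_pod - 1) / per_pod ))   # rabbitmq trigger
cron_floor=5                                               # desiredReplicas in the cron window
if [ "$queue_demand" -gt "$cron_floor" ]; then
  desired=$queue_demand
else
  desired=$cron_floor
fi
echo "desired replicas: $desired"
```

Here the cron floor of 5 wins over the queue's demand of 4; with 80 queued messages the queue trigger would win instead.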

Monitoring KEDA

# Check ScaledObject status
kubectl get scaledobjects -n production
kubectl describe scaledobject order-processor -n production

# View the HPA KEDA created (named keda-hpa-<scaledobject-name>)
kubectl get hpa -n production

# Check KEDA operator logs
kubectl logs -n keda -l app=keda-operator

# KEDA metrics
# keda_scaler_active
# keda_scaler_metrics_value

Best Practices

  1. Set cooldownPeriod appropriately — too short causes flapping, too long wastes resources
  2. Use minReplicaCount: 1 for latency-sensitive services (avoid cold start)
  3. Set maxReplicaCount to prevent runaway scaling from metric spikes
  4. Monitor activation latency — measure time from event to first Pod ready
  5. Use ScaledJob for one-shot work and ScaledObject for long-running Deployments

Why Interviewers Ask This

The standard HPA relies on CPU/memory metrics, which do not reflect actual demand in event-driven architectures. Demonstrating KEDA shows you can build efficient, cost-effective systems that scale precisely with the work available.

Common Follow-Up Questions

How does KEDA scale to zero?
You define a ScaledObject, and the KEDA operator controls the Deployment's replica count directly at the zero boundary. When the event source reports no pending work for the cooldownPeriod, KEDA sets replicas to 0; when new events arrive, it activates the Deployment (0 → 1) and the HPA it manages handles scaling beyond one.
What is the difference between KEDA and the standard HPA?
HPA scales based on resource metrics or custom metrics already exposed in the Kubernetes metrics API. KEDA adds scalers for 60+ external event sources and manages the zero-to-one-to-many transition.
What event sources does KEDA support?
Kafka, RabbitMQ, AWS SQS, Azure Service Bus, Redis Streams, PostgreSQL, Prometheus, Cron, HTTP, and 60+ more via built-in scalers.

Key Takeaways

  • KEDA extends HPA to scale based on external event sources, not just CPU/memory.
  • It enables scale-to-zero for event-driven workloads, reducing costs when idle.
  • KEDA integrates with 60+ event sources including message queues, databases, and cloud services.
