What Is KEDA and How Does Event-Driven Autoscaling Work?
KEDA (Kubernetes Event-Driven Autoscaling) extends the Horizontal Pod Autoscaler (HPA) to scale workloads based on event sources such as message queues, databases, and custom metrics. Unlike the standard HPA, it can scale to zero Pods when there is no work.
Detailed Answer
KEDA (Kubernetes Event-Driven Autoscaling) is a CNCF graduated project that allows Kubernetes to scale workloads based on the number of events in external systems — message queues, streams, databases, and more.
Why Standard HPA Is Not Enough
The HPA scales based on observed CPU, memory, or custom metrics. For event-driven workloads, this is inadequate:
Standard HPA:
Queue has 10,000 messages → Pods at 10% CPU → HPA does NOT scale up
KEDA:
Queue has 10,000 messages → KEDA scales to 50 Pods to drain the queue
Queue empty → KEDA scales to 0 Pods
Architecture
External Source  →  KEDA Scaler  →  Metrics Adapter        →  HPA        →  Deployment
(Kafka, SQS,        (polls           (exposes as               (scales
 RabbitMQ)           metrics)         external metric)          replicas)
KEDA creates and manages an HPA behind the scenes, feeding it external metrics.
Installation
helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace keda --create-namespace
ScaledObject: The Core Resource
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor
  namespace: production
spec:
  scaleTargetRef:
    name: order-processor    # Deployment name
  pollingInterval: 15        # Check every 15 seconds
  cooldownPeriod: 60         # Wait before scaling to zero
  minReplicaCount: 0         # Scale to zero when idle
  maxReplicaCount: 100       # Maximum replicas
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        host: amqp://rabbitmq.production:5672
        queueLength: "10"    # Target: 10 messages per Pod
This scales the order-processor Deployment based on the RabbitMQ queue length. With 500 messages and a target of 10 per Pod, KEDA scales to 50 replicas.
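The replica math behind that sentence can be sketched as follows. This is a simplified model of KEDA's queue-based scaling decision, not its actual implementation: ceil(queue length / target per Pod), clamped to the configured min/max bounds.

```python
import math

def desired_replicas(queue_length: int, target_per_pod: int,
                     min_replicas: int = 0, max_replicas: int = 100) -> int:
    """Simplified model: ceil(queue_length / target), clamped to bounds."""
    if queue_length == 0:
        return min_replicas  # scale to zero when idle (if minReplicaCount is 0)
    raw = math.ceil(queue_length / target_per_pod)
    return max(min_replicas, min(raw, max_replicas))

print(desired_replicas(500, 10))    # 500 messages, target 10 -> 50 replicas
print(desired_replicas(0, 10))     # empty queue -> 0 (scale to zero)
print(desired_replicas(5000, 10))  # demand of 500 capped at maxReplicaCount -> 100
```

Note how maxReplicaCount caps the result: a spike of 5,000 messages still yields at most 100 replicas.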
Common Scaler Examples
Kafka
triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.production:9092
      consumerGroup: my-group
      topic: events
      lagThreshold: "100"    # Scale when lag > 100 per Pod
AWS SQS
triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456/my-queue
      queueLength: "5"
      awsRegion: us-east-1
    authenticationRef:
      name: aws-credentials
Prometheus
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      metricName: http_requests_total
      query: |
        sum(rate(http_requests_total{service="api"}[2m]))
      threshold: "100"    # Scale when RPS > 100 per Pod
Cron-Based
triggers:
  - type: cron
    metadata:
      timezone: America/New_York
      start: "0 8 * * 1-5"     # 8 AM weekdays
      end: "0 18 * * 1-5"      # 6 PM weekdays
      desiredReplicas: "10"
PostgreSQL
triggers:
  - type: postgresql
    metadata:
      connectionFromEnv: PG_CONNECTION
      query: "SELECT count(*) FROM jobs WHERE status = 'pending'"
      targetQueryValue: "5"    # 5 pending jobs per Pod
Authentication
KEDA supports multiple authentication methods via the TriggerAuthentication resource, which triggers reference through authenticationRef (as in the SQS example above):
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-credentials
  namespace: production
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID
      name: aws-secret
      key: AWS_ACCESS_KEY_ID
    - parameter: awsSecretAccessKey
      name: aws-secret
      key: AWS_SECRET_ACCESS_KEY
ScaledJob for Batch Processing
For one-shot jobs rather than long-running Deployments:
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: email-sender
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: sender
            image: email-sender:1.0
            resources:
              requests:
                cpu: "100m"
                memory: "128Mi"
        restartPolicy: Never
  pollingInterval: 10
  maxReplicaCount: 50
  triggers:
    - type: rabbitmq
      metadata:
        queueName: emails
        host: amqp://rabbitmq:5672
        mode: QueueLength
        value: "1"    # One job per message
Scale-to-Zero Flow
1. Queue is empty for cooldownPeriod (60s)
2. KEDA sets Deployment replicas to 0
3. All Pods terminate
4. New message arrives in queue
5. KEDA detects message on next pollingInterval (15s)
6. KEDA sets Deployment replicas to 1 (activation)
7. HPA takes over for further scaling (1 → N)
8. Pod starts processing messages
Total activation latency: pollingInterval + Pod startup time (typically 15-60 seconds).
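The worst case for that latency is when a message arrives just after a poll, so KEDA only notices it a full pollingInterval later. A small arithmetic sketch (the 30-second startup time is an assumed example, not a KEDA value):

```python
def activation_latency_worst_case(polling_interval_s: float,
                                  pod_startup_s: float) -> float:
    """Worst case: the event lands just after a poll, so detection takes a
    full pollingInterval, then the Pod must pull its image and start."""
    return polling_interval_s + pod_startup_s

# With pollingInterval: 15 from the ScaledObject above and an assumed
# 30-second image pull + container start:
print(activation_latency_worst_case(15, 30))  # 45.0 seconds
```

This is why latency-sensitive services often keep minReplicaCount at 1, trading idle cost for zero activation delay.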
Multiple Triggers
You can combine triggers. The highest replica count wins:
triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      queueLength: "10"
  - type: cron
    metadata:
      start: "0 8 * * *"
      end: "0 10 * * *"
      desiredReplicas: "5"
During the cron window, at least 5 replicas run. If the queue also demands more, KEDA scales higher.
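The "highest wins" rule can be sketched as a max over per-trigger demands (a simplified model of how the HPA resolves multiple external metrics):

```python
import math

def desired_from_triggers(trigger_demands: list[int]) -> int:
    """KEDA computes a replica count per trigger; the HPA takes the
    maximum, so the trigger demanding the most replicas wins."""
    return max(trigger_demands, default=0)

# Cron window demands 5 replicas; a queue of 120 messages at 10 per Pod
# demands ceil(120 / 10) = 12.
queue_demand = math.ceil(120 / 10)
print(desired_from_triggers([queue_demand, 5]))  # 12 -- queue demand wins
print(desired_from_triggers([3, 5]))             # 5  -- cron floor wins
```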
Monitoring KEDA
# Check ScaledObject status
kubectl get scaledobjects -n production
kubectl describe scaledobject order-processor -n production
# View the HPA KEDA created
kubectl get hpa -n production
# Check KEDA operator logs
kubectl logs -n keda -l app=keda-operator
# KEDA metrics
# keda_scaler_active
# keda_scaler_metrics_value
Best Practices
- Set cooldownPeriod appropriately — too short causes flapping, too long wastes resources
- Use minReplicaCount: 1 for latency-sensitive services (avoid cold start)
- Set maxReplicaCount to prevent runaway scaling from metric spikes
- Monitor activation latency — measure time from event to first Pod ready
- Use ScaledJob for one-shot work and ScaledObject for long-running Deployments
Why Interviewers Ask This
Standard HPA relies on CPU/memory metrics, which do not reflect actual workload demand for event-driven architectures. KEDA shows you can build efficient, cost-effective systems that scale precisely with demand.
Key Takeaways
- KEDA extends HPA to scale based on external event sources, not just CPU/memory.
- It enables scale-to-zero for event-driven workloads, reducing costs when idle.
- KEDA integrates with 60+ event sources including message queues, databases, and cloud services.