What Is Progressive Delivery in Kubernetes?

TL;DR

Progressive delivery is an advanced deployment strategy that gradually shifts traffic to a new version while continuously analyzing metrics. Tools like Argo Rollouts and Flagger automate canary analysis, traffic shifting, and automatic rollback.

Detailed Answer

Progressive delivery builds on canary and blue-green deployments by automating the entire release process. Instead of manually watching dashboards, you define success criteria upfront, and the system automatically promotes or rolls back based on real metrics.

How Progressive Delivery Works

1. Deploy new version alongside old version
2. Route a small percentage of traffic (e.g., 5%) to the new version
3. Analyze metrics for a defined period (e.g., 5 minutes)
4. If metrics are healthy, increase traffic (10%, 25%, 50%, 100%)
5. If metrics degrade, automatically roll back to the old version

Argo Rollouts

Argo Rollouts is the most widely adopted progressive delivery tool for Kubernetes. It introduces a Rollout CRD that serves as a drop-in replacement for the Deployment:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api-server
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: api-server:2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
  strategy:
    canary:
      canaryService: api-canary
      stableService: api-stable
      trafficRouting:
        istio:
          virtualServices:
            - name: api-vsvc
              routes:
                - primary
      steps:
        - setWeight: 5
        - pause: { duration: 5m }
        - analysis:
            templates:
              - templateName: success-rate
            args:
              - name: service-name
                value: api-canary
        - setWeight: 25
        - pause: { duration: 5m }
        - analysis:
            templates:
              - templateName: success-rate
            args: # the template declares service-name with no default, so it must be passed on every analysis step
              - name: service-name
                value: api-canary
        - setWeight: 50
        - pause: { duration: 10m }
        - setWeight: 100
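
Neither the Rollout CRD nor the kubectl plugin ships with core Kubernetes. The controller is typically installed from the project's release manifest (URL as documented upstream; verify against the current docs before use):

kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

# The kubectl plugin is a separate binary; confirm it is on your PATH
kubectl argo rollouts version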

Analysis Templates

Analysis templates define the success criteria. Argo Rollouts queries Prometheus (or other providers) and compares results against thresholds:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 60s
      count: 5
      successCondition: result[0] >= 0.99
      failureLimit: 2
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{
              service="{{args.service-name}}",
              status=~"2.."
            }[2m])) /
            sum(rate(http_requests_total{
              service="{{args.service-name}}"
            }[2m]))
    - name: latency-p99
      interval: 60s
      count: 5
      successCondition: result[0] <= 500
      failureLimit: 2
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            histogram_quantile(0.99,
              sum(rate(http_request_duration_ms_bucket{
                service="{{args.service-name}}"
              }[2m])) by (le))

Flagger Alternative

Flagger takes a different approach: rather than replacing the Deployment, it watches an existing one and generates the canary infrastructure (primary workload, Services, routing rules) automatically:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: api-server
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  service:
    port: 8080
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        url: http://flagger-loadtester/
        metadata:
          cmd: "hey -z 1m -q 10 http://api-server-canary:8080/"
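
Once the Canary is applied, Flagger clones the target into a primary workload and wires up Services, following its <name>-primary / <name>-canary naming convention:

# Objects Flagger generates from the Canary spec above
kubectl get deployment api-server-primary
kubectl get service api-server api-server-primary api-server-canary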

Traffic Splitting Backends

Progressive delivery requires a traffic splitting mechanism:

| Backend | Tool Support | Traffic Precision |
|---------|--------------|-------------------|
| Istio VirtualService | Argo Rollouts, Flagger | Percentage-based |
| Linkerd TrafficSplit | Flagger | Percentage-based |
| Nginx Ingress | Argo Rollouts, Flagger | Annotation-based canary |
| AWS ALB Ingress | Argo Rollouts | Weighted target groups |
| Gateway API | Argo Rollouts | HTTPRoute weights |
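
With the Istio backend, Argo Rollouts rewrites route weights on a VirtualService you define yourself. A minimal sketch matching the Rollout example above (the hosts api-stable/api-canary and the route name primary come from that example; the initial weights are illustrative):

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-vsvc
spec:
  hosts:
    - api-stable
  http:
    - name: primary # must match the route listed in the Rollout's trafficRouting
      route:
        - destination:
            host: api-stable
          weight: 100 # the controller lowers this as canary steps progress
        - destination:
            host: api-canary
          weight: 0 # the controller raises this to match each setWeight step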

Monitoring a Progressive Rollout

# Argo Rollouts
kubectl argo rollouts get rollout api-server -w
kubectl argo rollouts status api-server

# Check analysis run results
kubectl get analysisrun -l rollouts-pod-template-hash

# Manual promotion (if paused for approval)
kubectl argo rollouts promote api-server

# Manual abort
kubectl argo rollouts abort api-server
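
Flagger has no dedicated CLI; its progress, analysis events, and failure reasons are all surfaced on the Canary resource:

# Flagger
kubectl get canary api-server -w
kubectl describe canary api-server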

When to Use Progressive Delivery

| Scenario | Recommendation |
|----------|----------------|
| Internal tools with low traffic | Standard rolling update is sufficient |
| Customer-facing APIs | Progressive delivery with metric analysis |
| Stateful services with schema changes | Blue-green with manual verification |
| High-traffic services (>1000 RPS) | Progressive delivery provides statistical significance |

Best Practices

  1. Define clear SLOs before automating canary analysis — you need thresholds to automate against
  2. Start with generous thresholds and tighten over time to avoid false positives
  3. Include load testing in analysis steps to generate enough traffic for statistical significance
  4. Use header-based routing during development to test canary Pods directly (a sketch follows this list)
  5. Set up alerts for automated rollbacks so the team is notified even when the system self-heals
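
For practice #4, recent Argo Rollouts releases (v1.2+) support header-based routing on Istio through managedRoutes and a setHeaderRoute step. A minimal sketch, with X-Canary as an arbitrary illustrative header name (verify field names against your installed version):

  strategy:
    canary:
      canaryService: api-canary
      stableService: api-stable
      trafficRouting:
        managedRoutes:
          - name: canary-header # route the controller may create and remove
        istio:
          virtualServices:
            - name: api-vsvc
              routes:
                - primary
      steps:
        - setCanaryScale:
            replicas: 1 # bring up one canary Pod without shifting weighted traffic
        - setHeaderRoute:
            name: canary-header
            match:
              - headerName: X-Canary # requests sent with "X-Canary: true" reach the canary directly
                headerValue:
                  exact: "true"
        - pause: {} # test by hand, then promote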

Why Interviewers Ask This

This question tests your understanding of modern deployment strategies beyond basic rolling updates. Teams adopt progressive delivery because it limits the blast radius of a bad release to a small slice of traffic and rolls back automatically, so interviewers use it to gauge production maturity.

Common Follow-Up Questions

How does progressive delivery differ from a standard canary deployment?
A standard canary is manual — you deploy, observe, and promote. Progressive delivery automates the entire cycle: deploy, analyze metrics, shift traffic incrementally, and auto-rollback on failure.
What metrics are typically used for canary analysis?
HTTP success rate (5xx ratio), latency percentiles (p99), error rate, and custom business metrics like conversion rate or queue depth.
What tools enable progressive delivery on Kubernetes?
Argo Rollouts and Flagger are the most popular. Both integrate with service meshes (Istio, Linkerd) and ingress controllers (Nginx, Contour) for traffic splitting.

Key Takeaways

  • Progressive delivery automates metric-driven traffic shifting and automatic rollback.
  • Argo Rollouts extends Kubernetes with a Rollout CRD that replaces Deployment for advanced strategies.
  • Integration with a service mesh or ingress controller is required for weighted traffic splitting.
