How Do Resource Quotas Work in Kubernetes?

Intermediate | Namespaces | DevOps, SRE, Platform Engineer | CKA
TL;DR

ResourceQuotas limit the total amount of compute resources (CPU, memory), storage, and object counts that a namespace can consume. When a quota covers a compute resource such as requests.cpu or limits.memory, every new Pod must specify that request or limit, and any new resource that would push usage past the quota is rejected.

Detailed Answer

ResourceQuotas constrain the aggregate resource consumption within a namespace. They are essential for multi-tenant clusters where different teams share infrastructure.

Types of Quotas

ResourceQuotas can limit three categories:

1. Compute Resources

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-frontend
spec:
  hard:
    requests.cpu: "10"           # Total CPU requests across all Pods
    requests.memory: "20Gi"      # Total memory requests
    limits.cpu: "20"             # Total CPU limits
    limits.memory: "40Gi"        # Total memory limits
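
To make the accounting concrete, here is a sketch of a Pod that would count against this quota. The Pod name, container name, image, and resource values are hypothetical and chosen only for illustration; each container's requests and limits are added to the namespace totals that the quota tracks.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web               # hypothetical name
  namespace: team-frontend
spec:
  containers:
    - name: web           # hypothetical container
      image: nginx        # illustrative image
      resources:
        requests:
          cpu: "500m"     # adds 0.5 to used requests.cpu
          memory: "1Gi"   # adds 1Gi to used requests.memory
        limits:
          cpu: "1"        # adds 1 to used limits.cpu
          memory: "2Gi"   # adds 2Gi to used limits.memory
```

Under the compute-quota above, twenty such Pods would consume the entire requests.cpu budget (20 × 500m = 10 CPUs), and a twenty-first would be rejected.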

2. Storage Resources

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-frontend
spec:
  hard:
    requests.storage: "100Gi"                     # Total PVC storage
    persistentvolumeclaims: "10"                   # Max number of PVCs
    fast-ssd.storageclass.storage.k8s.io/requests.storage: "50Gi"  # Per-StorageClass limit

3. Object Counts

apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: team-frontend
spec:
  hard:
    pods: "50"                   # Max Pods in namespace
    services: "20"               # Max Services
    configmaps: "30"             # Max ConfigMaps
    secrets: "30"                # Max Secrets
    services.loadbalancers: "2"  # Max LoadBalancer Services
    services.nodeports: "5"      # Max NodePort Services
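
Besides the built-in names shown above, Kubernetes also accepts a generic count/<resource>.<group> syntax for namespaced resource types. The following is a minimal sketch; the quota name and the numbers are hypothetical.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: workload-count-quota       # hypothetical name
  namespace: team-frontend
spec:
  hard:
    count/deployments.apps: "10"   # Max Deployments (apps API group)
    count/jobs.batch: "20"         # Max Jobs (batch API group)
    count/cronjobs.batch: "5"      # Max CronJobs (batch API group)
```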

Combined Example

A comprehensive quota for a team namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-frontend
spec:
  hard:
    # Compute
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
    # Storage
    requests.storage: "200Gi"
    persistentvolumeclaims: "15"
    # Objects
    pods: "100"
    services: "30"
    configmaps: "50"
    secrets: "50"
    services.loadbalancers: "3"

Viewing Quota Usage

kubectl describe resourcequota team-quota -n team-frontend

# Name:                   team-quota
# Namespace:              team-frontend
# Resource                Used    Hard
# --------                ----    ----
# configmaps              12      50
# limits.cpu              8       40
# limits.memory           16Gi    80Gi
# persistentvolumeclaims  3       15
# pods                    25      100
# requests.cpu            4       20
# requests.memory         8Gi     40Gi
# requests.storage        30Gi    200Gi
# secrets                 8       50
# services                5       30
# services.loadbalancers  1       3

Quota Enforcement

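When a request would exceed a quota, the API server rejects it with a Forbidden error. Creating a Pod directly returns the error to the client. When Pods are created by a controller such as a Deployment, the Deployment itself is accepted and the same message appears in the ReplicaSet's events instead.

kubectl apply -f new-pod.yaml -n team-frontend
# Error from server (Forbidden): pods "web-abc123" is forbidden:
# exceeded quota: team-quota, requested: requests.cpu=2,
# used: requests.cpu=19, limited: requests.cpu=20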

Important: Quotas are enforced at admission time, so they only block requests that create resources or increase usage. If you reduce a quota below current usage, existing resources continue running, but new requests for that resource are rejected until usage drops far enough for them to fit under the new limit.

Quota Scopes

Target quotas to specific Pod types using scopes:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-quota
  namespace: team-frontend
spec:
  hard:
    pods: "10"
  scopes:
    - BestEffort    # Only applies to BestEffort QoS Pods

Available scopes:

| Scope | Targets |
|---|---|
| BestEffort | Pods with no resource requests/limits |
| NotBestEffort | Pods with at least one request/limit |
| Terminating | Pods with an activeDeadlineSeconds |
| NotTerminating | Pods without activeDeadlineSeconds |
| PriorityClass | Pods matching a specific priority class |
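
To match specific priority classes, a quota uses the scopeSelector field instead of the plain scopes list, because the selector has to name the classes. A minimal sketch, assuming a PriorityClass called "high" exists in the cluster (the quota name and class name are hypothetical):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota    # hypothetical name
  namespace: team-frontend
spec:
  hard:
    pods: "10"                 # Max Pods that use the matched priority class
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass
        operator: In
        values: ["high"]       # assumed PriorityClass name
```

Pods whose priorityClassName is not in the listed values are not counted by this quota.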

Pairing with LimitRange

When a compute quota is set, every new Pod must specify the requests and limits that the quota tracks; otherwise the API server rejects it. To avoid requiring every developer to set these manually, create a LimitRange that provides defaults:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-frontend
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"

This ensures Pods without explicit resource specs still get reasonable defaults and pass quota validation.
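
As an illustration of the defaulting, consider a Pod submitted without any resources section. The Pod name and image are hypothetical; the comments describe what the default-limits LimitRange above would be expected to fill in at admission.

```yaml
# Submitted manifest: the container declares no resources.
apiVersion: v1
kind: Pod
metadata:
  name: web               # hypothetical name
  namespace: team-frontend
spec:
  containers:
    - name: web
      image: nginx        # illustrative image
# Expected container resources after admission, taken from default-limits:
#   resources:
#     requests:
#       cpu: 100m         # from defaultRequest
#       memory: 128Mi     # from defaultRequest
#     limits:
#       cpu: 500m         # from default
#       memory: 256Mi     # from default
```

The defaulted values count against the quota exactly as if the developer had written them, so they are worth sizing with the namespace totals in mind.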

Why Interviewers Ask This

Interviewers ask this to assess whether you can prevent resource exhaustion in shared clusters and enforce fair resource allocation across teams.

Common Follow-Up Questions

What happens when a namespace exceeds its quota?
New resource creation is rejected with a 'forbidden: exceeded quota' error. Existing resources continue running — quotas only block new requests.
Do all Pods need resource requests when a quota is set?
Yes, for the resources the quota tracks. If a quota sets requests.cpu or limits.memory, for example, every container in a new Pod must specify that value or the Pod is rejected. Use a LimitRange to provide defaults for containers that omit them.
Can you have multiple ResourceQuotas in one namespace?
Yes. Each ResourceQuota is evaluated independently, and a request must satisfy all of them to be admitted. When two quotas limit the same resource, the most restrictive one is the effective cap.
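
A short sketch of two quotas coexisting in one namespace follows. Both quota names and all values are hypothetical.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: platform-cap      # hypothetical name
  namespace: team-frontend
spec:
  hard:
    requests.cpu: "20"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-cap          # hypothetical name
  namespace: team-frontend
spec:
  hard:
    requests.cpu: "10"    # Stricter value: the effective ceiling for requests.cpu
    pods: "50"            # Only this quota limits the Pod count
```

With these two objects, a Pod that would bring total CPU requests to 12 fits within platform-cap but violates team-cap, so it is rejected.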

Key Takeaways

  • ResourceQuotas cap total resource consumption per namespace, preventing any team from monopolizing the cluster.
  • When a compute quota exists, all Pods must declare the requests and limits it tracks; a LimitRange can supply defaults.
  • Quotas enforce limits at creation time — they do not terminate existing workloads.
