How Do Resource Quotas Work in Kubernetes?

ResourceQuotas cap the total compute resources (CPU, memory), storage, and object counts that a namespace can consume. When a quota tracks a compute resource such as CPU or memory, every new Pod must set the matching request or limit for that resource, and any request that would push usage past the quota is rejected.

Detailed Answer

ResourceQuotas constrain the aggregate resource consumption within a namespace. They are essential for multi-tenant clusters where different teams share infrastructure.

Types of Quotas

ResourceQuotas can limit three categories:

1. Compute Resources

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-frontend
spec:
  hard:
    requests.cpu: "10"           # Total CPU requests across all Pods
    requests.memory: "20Gi"      # Total memory requests
    limits.cpu: "20"             # Total CPU limits
    limits.memory: "40Gi"        # Total memory limits

2. Storage Resources

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-frontend
spec:
  hard:
    requests.storage: "100Gi"                     # Total PVC storage
    persistentvolumeclaims: "10"                   # Max number of PVCs
    fast-ssd.storageclass.storage.k8s.io/requests.storage: "50Gi"  # Per-StorageClass limit

3. Object Counts

apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-quota
  namespace: team-frontend
spec:
  hard:
    pods: "50"                   # Max Pods in namespace
    services: "20"               # Max Services
    configmaps: "30"             # Max ConfigMaps
    secrets: "30"                # Max Secrets
    services.loadbalancers: "2"  # Max LoadBalancer Services
    services.nodeports: "5"      # Max NodePort Services
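
Object counts are not limited to the built-in names above. As a sketch, the generic count/<resource>.<group> syntax can cap other object types, such as Deployments or Jobs. The quota name and the numbers here are illustrative assumptions, not recommendations.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: workload-count-quota       # hypothetical name
  namespace: team-frontend
spec:
  hard:
    count/deployments.apps: "20"   # Max Deployments (apps API group)
    count/jobs.batch: "10"         # Max Jobs (batch API group)
    count/cronjobs.batch: "5"      # Max CronJobs (batch API group)
```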

Combined Example

A comprehensive quota for a team namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-frontend
spec:
  hard:
    # Compute
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
    # Storage
    requests.storage: "200Gi"
    persistentvolumeclaims: "15"
    # Objects
    pods: "100"
    services: "30"
    configmaps: "50"
    secrets: "50"
    services.loadbalancers: "3"

Viewing Quota Usage

kubectl describe resourcequota team-quota -n team-frontend

# Name:                   team-quota
# Namespace:              team-frontend
# Resource                Used    Hard
# --------                ----    ----
# configmaps              12      50
# limits.cpu              8       40
# limits.memory           16Gi    80Gi
# persistentvolumeclaims  3       15
# pods                    25      100
# requests.cpu            4       20
# requests.memory         8Gi     40Gi
# requests.storage        30Gi    200Gi
# secrets                 8       50
# services                5       30
# services.loadbalancers  1       3

Quota Enforcement

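When a request would push usage past a quota, the API server rejects it with a Forbidden error:

kubectl apply -f new-pod.yaml -n team-frontend
# Error from server (Forbidden): pods "web-abc123" is forbidden:
# exceeded quota: team-quota, requested: requests.cpu=2,
# used: requests.cpu=19, limited: requests.cpu=20

For Pods created by a controller such as a Deployment, the Deployment itself is accepted; the same error shows up in the ReplicaSet's events (kubectl describe replicaset) rather than in the kubectl apply output.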

Important: Quotas are enforced only at admission time, when resources are created or updated. If you reduce a quota below current usage, existing resources keep running, but requests that consume the over-quota resource are rejected until usage drops below the new limit.
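
To make the arithmetic concrete, here is a sketch of a Pod that would fit under the quota in the error above: with 19 CPUs already requested and a hard limit of 20, a request of 1 CPU brings usage to exactly 20, which is still allowed. The Pod name, image, and memory values are placeholders, and the example assumes the other tracked resources (requests.memory, limits.cpu, limits.memory, pods) also have headroom.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-small              # hypothetical name
  namespace: team-frontend
spec:
  containers:
    - name: web
      image: nginx:1.27        # placeholder image
      resources:
        requests:
          cpu: "1"             # 19 used + 1 requested = 20, not above the hard limit of 20
          memory: "512Mi"
        limits:
          cpu: "2"
          memory: "1Gi"
```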

Quota Scopes

Target quotas to specific Pod types using scopes:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-quota
  namespace: team-frontend
spec:
  hard:
    pods: "10"
  scopes:
    - BestEffort    # Only applies to BestEffort QoS Pods

Available scopes:

| Scope | Targets |
|---|---|
| BestEffort | Pods with no resource requests/limits |
| NotBestEffort | Pods with at least one request/limit |
| Terminating | Pods with an activeDeadlineSeconds |
| NotTerminating | Pods without activeDeadlineSeconds |
| PriorityClass | Pods matching a specific priority class |
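
The PriorityClass scope is usually expressed with scopeSelector rather than the plain scopes list, because it needs the class names to match. A minimal sketch, assuming a PriorityClass named high exists in the cluster (the class name, quota name, and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota    # hypothetical name
  namespace: team-frontend
spec:
  hard:
    pods: "10"                 # Max Pods that use the "high" priority class
    requests.cpu: "8"          # Total CPU requests for those Pods
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass
        operator: In
        values: ["high"]       # assumed PriorityClass name
```

Pods that reference a different priority class, or none, are not counted against this quota.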

Pairing with LimitRange

When a compute quota is set, every container in a Pod must specify a request or limit for each resource the quota tracks; otherwise the Pod is rejected. To avoid requiring every developer to set these manually, create a LimitRange that provides defaults:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-frontend
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"

This ensures Pods without explicit resource specs still get reasonable defaults and pass quota validation.
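
As an illustration, a Pod submitted to team-frontend without a resources block would be admitted with the LimitRange defaults filled in. The Pod name and image are placeholders; the commented values show what the container is expected to end up with under the LimitRange above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-resources-demo      # hypothetical name
  namespace: team-frontend
spec:
  containers:
    - name: app
      image: nginx:1.27        # placeholder image
      # No resources block is set here. At admission, the LimitRange
      # is expected to fill in the equivalent of:
      #   resources:
      #     requests:
      #       cpu: "100m"      # from defaultRequest
      #       memory: "128Mi"  # from defaultRequest
      #     limits:
      #       cpu: "500m"      # from default
      #       memory: "256Mi"  # from default
```

The defaulted values then count against the namespace's compute quota like any explicitly set request or limit.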
