What Are Persistent Volumes in Kubernetes?

Tags: beginner | storage | devops | sre | CKA
TL;DR

A PersistentVolume (PV) is a cluster-level storage resource provisioned by an administrator or dynamically by a StorageClass. PVs decouple storage from Pod lifecycle, allowing data to persist across Pod restarts and rescheduling.

Detailed Answer

What Is a PersistentVolume?

A PersistentVolume (PV) represents a piece of storage in the cluster. It could be a disk on a cloud provider (AWS EBS, GCE PD, Azure Disk), a network filesystem (NFS, CephFS), or local storage on a node. PVs are cluster-scoped resources, meaning they are not tied to any namespace.

The key design principle is separation of concerns: cluster administrators provision storage (PVs), and developers request storage (PersistentVolumeClaims). This abstraction lets applications remain portable across environments.

Static Provisioning

In static provisioning, an administrator manually creates PV objects that describe available storage:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  nfs:
    server: 192.168.1.100
    path: /exports/data

# Create the PV
kubectl apply -f pv.yaml

# Check PV status
kubectl get pv
# NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      STORAGECLASS
# my-nfs-pv   10Gi       RWX            Retain           Available

A PV starts in the Available state. When a PersistentVolumeClaim (PVC) matches it, the PV moves to Bound.
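To bind the static PV above, a PVC requests a matching size and access mode. A minimal sketch (the PVC name is illustrative; note that storageClassName: "" must be set explicitly to match the PV, since leaving the field unset would instead select the cluster's default StorageClass):

```yaml
# PVC that can bind the my-nfs-pv defined above
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-claim
spec:
  accessModes:
    - ReadWriteMany        # must be satisfiable by the PV's access modes
  storageClassName: ""     # empty string, matching the PV's storageClassName
  resources:
    requests:
      storage: 10Gi        # must not exceed the PV's capacity
```

Once this claim is created, the control plane binds it to the matching PV and both objects report Bound.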

PV Lifecycle States

| State | Description |
|---|---|
| Available | Free, not yet bound to a PVC |
| Bound | Bound to a specific PVC |
| Released | PVC deleted, but PV not yet reclaimed |
| Failed | Automatic reclamation failed |

Reclaim Policies

The persistentVolumeReclaimPolicy determines what happens when the PVC is deleted:

  • Retain: The PV and its data remain. An admin must manually clean up and re-create the PV. Best for critical data.
  • Delete: The PV and the underlying storage (e.g., the EBS volume) are automatically deleted. Common with dynamic provisioning.
  • Recycle (deprecated): The volume is scrubbed (rm -rf /volume/*) and made available again.

# PV with Delete policy (typical for cloud storage)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cloud-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gp3
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0abc123def456
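The reclaim policy can also be changed on an existing PV. For example, before deleting a PVC whose PV uses Delete, an admin can switch the policy to Retain so the underlying storage survives (a sketch using the cloud-pv name from the example above):

```shell
# Change the reclaim policy of an existing PV from Delete to Retain
kubectl patch pv cloud-pv \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```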

PV with Different Storage Backends

AWS EBS via CSI

apiVersion: v1
kind: PersistentVolume
metadata:
  name: ebs-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0abc123def456789
    fsType: ext4
  persistentVolumeReclaimPolicy: Delete

Local Storage

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1

Local PVs require a nodeAffinity field because the storage is physically attached to a specific node.
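Local PVs are typically paired with a StorageClass that delays binding until a Pod is scheduled, so the scheduler can pick a node that actually has the disk rather than binding to a volume on the wrong node. A sketch (the name local-storage matches the PV above):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner  # local disks cannot be dynamically provisioned
volumeBindingMode: WaitForFirstConsumer    # delay binding until a consuming Pod is scheduled
```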

Inspecting PVs

# List all PVs
kubectl get pv

# Detailed view
kubectl describe pv my-nfs-pv

# View PV in YAML
kubectl get pv my-nfs-pv -o yaml

# Check which PVC is bound to a PV
kubectl get pv -o custom-columns=NAME:.metadata.name,CLAIM:.spec.claimRef.name,STATUS:.status.phase

Dynamic vs. Static Provisioning

While static provisioning works, it requires manual effort for every volume. In production, dynamic provisioning using StorageClasses is strongly preferred. When a PVC references a StorageClass, Kubernetes automatically creates a PV and the underlying storage:
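A StorageClass that could back such a claim might look like this. This is a sketch assuming the AWS EBS CSI driver; the parameters follow that driver's conventions:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com           # CSI driver that creates the EBS volume
parameters:
  type: gp3                            # EBS volume type
reclaimPolicy: Delete                  # PVs created from this class are deleted with the PVC
volumeBindingMode: WaitForFirstConsumer
```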

# A PVC that triggers dynamic provisioning
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: gp3  # StorageClass handles PV creation

With dynamic provisioning, you rarely create PV objects manually. The StorageClass and CSI driver handle everything automatically.
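To close the loop, a Pod mounts the claim rather than the PV directly. A minimal sketch referencing the my-claim PVC above (Pod and container names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html  # where the volume appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-claim  # references the PVC, which is bound to a PV
```

Because the Pod references only the claim, the same manifest works in any cluster where a PVC named my-claim can be satisfied, regardless of the storage backend.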

Common Pitfalls

A PVC may remain in Pending state if no PV matches its requirements (size, access mode, storage class). Always check kubectl describe pvc <name> for events explaining why binding failed. Another common issue is forgetting the nodeAffinity on local PVs, which causes scheduling failures when the Pod lands on a node without the expected disk.

Why Interviewers Ask This

Interviewers want to confirm you understand how Kubernetes manages stateful storage and can design solutions for applications that need durable data.

Common Follow-Up Questions

What is the difference between a PV and a regular volume?
Regular volumes are tied to a Pod's lifecycle and are deleted when the Pod is removed. PVs exist independently and can be reused.

What happens to a PV when the PVC bound to it is deleted?
It depends on the reclaim policy: Retain keeps the data, Delete removes the PV and underlying storage, Recycle (deprecated) scrubs the data.

Can multiple Pods use the same PV?
Only if the PV supports ReadWriteMany (RWX) access mode. Most block storage only supports ReadWriteOnce (RWO).

Key Takeaways

  • PVs are cluster-scoped resources that exist independently of Pods
  • The reclaim policy controls what happens to storage when a PVC is deleted
  • Dynamic provisioning via StorageClasses is preferred over static PV creation