What is CSI (Container Storage Interface) and how does Kubernetes use it?

intermediate | architecture, devops, sre, cloud architect, CKA
TL;DR

CSI is a standard interface that allows Kubernetes to work with any storage system without requiring storage-specific code in the Kubernetes codebase. CSI drivers run as pods in the cluster and handle volume provisioning, attaching, mounting, and snapshotting through a well-defined gRPC API.

Detailed Answer

The Container Storage Interface (CSI) is an industry standard that defines how container orchestrators like Kubernetes communicate with storage providers. Before CSI, Kubernetes had "in-tree" storage plugins compiled directly into its codebase. This meant adding support for a new storage system required modifying Kubernetes itself. CSI solves this by defining a gRPC interface that any storage vendor can implement independently.
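The interface itself is split into three gRPC services. An abridged sketch is below; the RPC names come from the CSI specification's csi.proto, but the request/response message fields are omitted here for brevity:

```protobuf
// Abridged from the CSI spec's csi.proto; message fields omitted.
service Identity {
  rpc GetPluginInfo(GetPluginInfoRequest) returns (GetPluginInfoResponse) {}
  rpc Probe(ProbeRequest) returns (ProbeResponse) {}   // driver health check
}

service Controller {
  rpc CreateVolume(CreateVolumeRequest) returns (CreateVolumeResponse) {}
  rpc DeleteVolume(DeleteVolumeRequest) returns (DeleteVolumeResponse) {}
  rpc ControllerPublishVolume(ControllerPublishVolumeRequest) returns (ControllerPublishVolumeResponse) {}
}

service Node {
  rpc NodeStageVolume(NodeStageVolumeRequest) returns (NodeStageVolumeResponse) {}
  rpc NodePublishVolume(NodePublishVolumeRequest) returns (NodePublishVolumeResponse) {}
  rpc NodeGetInfo(NodeGetInfoRequest) returns (NodeGetInfoResponse) {}
}
```

Any vendor that implements these services can plug into Kubernetes (or any other CSI-aware orchestrator) without touching the orchestrator's code.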

CSI Architecture

A CSI driver in Kubernetes consists of two main components:

Controller Plugin (runs as a Deployment) -- Handles cluster-level operations that do not need to run on a specific node:

  • CreateVolume / DeleteVolume -- Provision and deprovision storage
  • ControllerPublishVolume / ControllerUnpublishVolume -- Attach/detach volumes to nodes
  • CreateSnapshot / DeleteSnapshot -- Manage volume snapshots
  • ControllerExpandVolume -- Resize volumes

Node Plugin (runs as a DaemonSet) -- Handles node-level operations on every node:

  • NodeStageVolume -- Format and mount the volume to a staging path
  • NodePublishVolume -- Bind-mount from staging path to the pod's volume path
  • NodeUnpublishVolume / NodeUnstageVolume -- Reverse the above operations
  • NodeGetInfo -- Report node topology information
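Conceptually, the two node-plugin mount calls map to two mounts. A simplified sketch of what a typical block-volume driver does under the hood (the device name and paths are illustrative; real drivers receive the staging and target paths from kubelet, under /var/lib/kubelet):

```shell
# Illustrative only -- this work is performed by the CSI node plugin, not by hand.

# NodeStageVolume: format (first use only) and mount the device once per node
mkfs.ext4 /dev/nvme1n1                                 # skipped if a filesystem already exists
mount /dev/nvme1n1 /var/lib/kubelet/plugins/.../globalmount

# NodePublishVolume: bind-mount the staged volume into each pod's directory
mount --bind /var/lib/kubelet/plugins/.../globalmount \
             /var/lib/kubelet/pods/<pod-uid>/volumes/.../mount
```

This two-step split is what lets a single expensive device mount be shared by multiple pods on the same node.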

The Volume Lifecycle

Here is the complete flow when a pod requests a persistent volume:

1. User creates a PVC referencing a StorageClass
2. external-provisioner sidecar detects unbound PVC
3. external-provisioner calls CSI Controller CreateVolume
4. Storage backend provisions the volume
5. external-provisioner creates a PV and binds it to the PVC
6. Pod is scheduled to a node
7. external-attacher calls CSI Controller ControllerPublishVolume
8. Volume is attached to the node (e.g., EBS volume attached to EC2)
9. kubelet calls CSI Node NodeStageVolume (format + mount to staging)
10. kubelet calls CSI Node NodePublishVolume (bind-mount to pod path)
11. Pod starts with the volume mounted
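Each step leaves an observable trace. For a PVC named data-pvc (as in the example below), you can follow the flow with:

```shell
# Events on the PVC show provisioning progress and failures
kubectl describe pvc data-pvc

# Events emitted by the provisioner/attacher for this object
kubectl get events --field-selector involvedObject.name=data-pvc

# A VolumeAttachment object appears once ControllerPublishVolume succeeds
kubectl get volumeattachments
```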

Practical Example

StorageClass definition:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com    # CSI driver name
parameters:
  type: gp3
  iops: "5000"
  throughput: "250"
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/abc-def"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi

Pod using the PVC:

apiVersion: v1
kind: Pod
metadata:
  name: database
spec:
  containers:
  - name: postgres
    image: postgres:16
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc

Installing a CSI Driver

CSI drivers are typically installed via Helm charts or YAML manifests:

# Example: Install AWS EBS CSI driver
helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
  --namespace kube-system

# Verify CSI driver pods are running
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver

# Check CSIDriver object
kubectl get csidriver
# NAME              ATTACHREQUIRED   PODINFOONMOUNT   MODES
# ebs.csi.aws.com   true             false            Persistent

# Check CSINode objects (one per node)
kubectl get csinodes

Volume Snapshots

CSI enables volume snapshots for backup and cloning:

# Create a VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapclass
driver: ebs.csi.aws.com
deletionPolicy: Delete

---
# Take a snapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snapshot
spec:
  volumeSnapshotClassName: ebs-snapclass
  source:
    persistentVolumeClaimName: data-pvc

---
# Restore from a snapshot to a new PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-restored
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi
  dataSource:
    name: data-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
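Beyond snapshots, CSI drivers that support volume cloning can populate a new PVC directly from an existing one by using the PVC itself as the dataSource (driver support varies; the names here reuse the earlier examples):

```yaml
# Clone an existing PVC (requires a CSI driver with cloning support)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-clone
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi
  dataSource:
    name: data-pvc
    kind: PersistentVolumeClaim
```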

Volume Expansion

CSI supports online volume expansion when the StorageClass has allowVolumeExpansion: true:

# Expand a PVC from 50Gi to 100Gi
kubectl patch pvc data-pvc -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'

# Monitor the resize
kubectl get pvc data-pvc -w
# The PVC stays Bound throughout; a FileSystemResizePending condition appears
# until kubelet expands the filesystem, then CAPACITY reflects the new size

Troubleshooting CSI

# Check PVC status
kubectl get pvc
kubectl describe pvc data-pvc

# Check PV status and CSI details
kubectl get pv
kubectl describe pv <pv-name>

# Check CSI driver pods
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver

# View CSI driver logs (controller)
kubectl logs -n kube-system deployment/ebs-csi-controller -c csi-provisioner
kubectl logs -n kube-system deployment/ebs-csi-controller -c ebs-plugin

# View CSI driver logs (node)
kubectl logs -n kube-system daemonset/ebs-csi-node -c ebs-plugin

# Check volume attachments
kubectl get volumeattachments

Common issues include missing IAM permissions for the CSI driver, incorrect StorageClass parameters, volume availability zone mismatches with the node, and exhausted storage quotas.
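For the first two failure modes, the fastest signal is usually the event stream. The reasons below are standard events emitted by the external-provisioner and the attach/detach controller:

```shell
# Surface provisioning and attach failures cluster-wide
kubectl get events -A --field-selector reason=ProvisioningFailed
kubectl get events -A --field-selector reason=FailedAttachVolume
```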

Why Interviewers Ask This

Interviewers ask about CSI to determine whether a candidate understands how persistent storage integrates with Kubernetes. It reveals knowledge of the plugin architecture, dynamic provisioning workflow, and the practical considerations of managing stateful workloads.

Common Follow-Up Questions

What is the difference between in-tree and CSI volume plugins?
In-tree plugins were compiled directly into Kubernetes binaries. CSI plugins run as external components (DaemonSets and Deployments). Kubernetes has migrated the in-tree plugins to CSI drivers (the CSIMigration effort) for maintainability and extensibility, and most in-tree cloud storage plugins have since been removed from the codebase.
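The difference is visible in the provisioner field of a StorageClass; the first name below is the legacy in-tree AWS EBS plugin, shown for contrast:

```yaml
# Legacy in-tree plugin name (requests are translated to CSI by the migration shims)
provisioner: kubernetes.io/aws-ebs
---
# Native CSI driver name
provisioner: ebs.csi.aws.com
```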
How does dynamic provisioning work with CSI?
A user creates a PVC referencing a StorageClass. The external-provisioner sidecar detects the PVC, calls the CSI driver's CreateVolume RPC, and creates a PV bound to the PVC. The volume is provisioned on-demand without admin intervention.
What are CSI snapshots and how do they work?
CSI supports volume snapshots through VolumeSnapshot, VolumeSnapshotContent, and VolumeSnapshotClass resources. The snapshot controller and CSI driver coordinate to create point-in-time copies of volumes that can be used to restore or clone data.
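The pairing between the user-facing and cluster-facing snapshot objects mirrors the PVC/PV relationship and can be inspected directly:

```shell
# VolumeSnapshot is namespaced and user-facing; VolumeSnapshotContent is cluster-scoped
kubectl get volumesnapshot
kubectl get volumesnapshotcontent

# READYTOUSE turns true once the backend snapshot completes
kubectl describe volumesnapshot data-snapshot
```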

Key Takeaways

  • CSI decouples storage providers from the Kubernetes core through a standardized gRPC plugin interface
  • CSI drivers are deployed as DaemonSets (node plugin) and Deployments (controller plugin) within the cluster
  • Dynamic provisioning, snapshots, cloning, and volume expansion are all handled through CSI