Kubernetes VolumeResizeFailed
Causes and Fixes
VolumeResizeFailed occurs when Kubernetes cannot expand a PersistentVolumeClaim to the requested larger size. This can happen at the storage backend level (the underlying volume cannot be resized), at the filesystem level (the filesystem on the volume cannot be expanded), or because volume expansion is not enabled for the StorageClass.
Symptoms
- PVC events show 'VolumeResizeFailed' or 'FailedResizeVolume'
- PVC status shows Resizing condition with errors
- PVC capacity does not update to the requested size
- FileSystemResizePending condition remains on the PVC
- Pod events reference filesystem expansion failures
Common Causes
- allowVolumeExpansion is not set to true on the StorageClass
- The CSI driver does not support volume expansion
- The storage backend rejects the resize (provider cooldowns, size limits)
- The block device was resized but filesystem expansion failed on the node
- The backend requires offline expansion while the volume is still mounted
Step-by-Step Troubleshooting
Volume resize failures leave the PVC in an inconsistent state: the requested size has been increased but the actual capacity has not caught up. This guide walks through identifying where the resize failed and completing the expansion.
1. Check PVC Status and Events
Start by examining the PVC to understand the resize failure.
kubectl describe pvc <pvc-name>
Look at:
- Capacity: The current actual size
- Conditions: Look for Resizing or FileSystemResizePending conditions
- Events: The specific error message from the resize attempt
# Get PVC conditions
kubectl get pvc <pvc-name> -o jsonpath='{.status.conditions}' | jq .
# Compare requested vs actual capacity
kubectl get pvc <pvc-name> -o jsonpath='Requested: {.spec.resources.requests.storage}, Actual: {.status.capacity.storage}'
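Comparing the two values numerically requires parsing Kubernetes quantity strings such as 10Gi. The sketch below is illustrative (it is not part of kubectl, and it handles only the binary suffixes Ki/Mi/Gi/Ti plus bare byte counts):

```python
# Parse Kubernetes binary-suffix quantities (e.g. "10Gi") into bytes and
# compare a PVC's requested size against its actual capacity.
SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}

def parse_quantity(q: str) -> int:
    """Convert a quantity like '10Gi' to bytes; bare numbers are plain bytes."""
    for suffix, factor in SUFFIXES.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)

def resize_pending(requested: str, actual: str) -> bool:
    """True if the PVC's actual capacity still lags the requested size."""
    return parse_quantity(actual) < parse_quantity(requested)

if __name__ == "__main__":
    # Example: a resize from 10Gi to 20Gi that has not completed yet
    print(resize_pending("20Gi", "10Gi"))  # True
```

Feed it the two jsonpath values from the command above; a True result means the resize has not finished.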
2. Verify StorageClass Allows Expansion
Check if the StorageClass permits volume expansion.
kubectl get storageclass <storage-class-name> -o jsonpath='{.allowVolumeExpansion}'
If this returns false or empty, volume expansion is not enabled. Update the StorageClass to allow it.
kubectl patch storageclass <storage-class-name> -p '{"allowVolumeExpansion": true}'
Note: Changing the StorageClass to allow expansion does not retroactively fix already-failed resize operations. You may need to retry the resize.
3. Check CSI Driver Expansion Capability
Verify the CSI driver supports volume expansion.
# Check CSI driver capabilities
kubectl get csidriver <driver-name> -o yaml
# The CSIDriver object does not report expansion capability directly; a practical
# check is whether the driver's controller pods run the external-resizer sidecar
kubectl get pods -A -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' | grep csi-resizer
If the CSI driver does not support volume expansion, you cannot resize volumes managed by this driver. You would need to create a new, larger PVC and migrate data manually.
4. Check if Online Expansion Is Supported
Some storage backends require the volume to be unmounted before resizing.
Whether a driver supports online (in-use) expansion is not exposed through the Kubernetes API; consult the CSI driver's documentation. A practical signal: if the resize only completes after the pod using the volume is stopped, the backend likely requires offline expansion.
For backends that require offline expansion, you need to stop the pod using the volume first.
# Scale down the workload to detach the volume
kubectl scale deployment <deployment-name> --replicas=0
# Wait for the pod to terminate
kubectl get pods -l <selector> -w
# Retry the resize by re-applying the PVC with the new size
kubectl apply -f pvc.yaml
# Wait for resize to complete
kubectl get pvc <pvc-name> -w
# Scale back up
kubectl scale deployment <deployment-name> --replicas=<original-count>
5. Check Cloud Provider Limitations
Cloud providers have specific constraints on volume resizing.
# Get the volume ID
kubectl get pv <pv-name> -o jsonpath='{.spec.csi.volumeHandle}'
Provider-specific checks:
AWS EBS: Volumes can only be modified once every 6 hours. Check the modification state.
aws ec2 describe-volumes-modifications --volume-ids <vol-id>
GCP PD: Check if the disk is in a modifying state.
gcloud compute disks describe <disk-name> --zone=<zone> --format='get(status)'
Azure Disk: Check disk size limits for the disk SKU.
az disk show --resource-group <rg> --name <disk-name> --query '[diskSizeGb, diskState]'
6. Check Filesystem Expansion
If the block device was resized but the filesystem was not expanded, the PVC will show a FileSystemResizePending condition.
# Check for FileSystemResizePending condition
kubectl get pvc <pvc-name> -o jsonpath='{.status.conditions[?(@.type=="FileSystemResizePending")]}'
The filesystem expansion happens on the node when the volume is mounted. Check kubelet logs for filesystem expansion errors.
# On the node
journalctl -u kubelet --no-pager --since "30 minutes ago" | grep -i "resize\|expand\|filesystem"
You may need to restart the pod to trigger the kubelet to retry filesystem expansion.
kubectl delete pod <pod-name>
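The two resize phases can be told apart programmatically from the PVC's status conditions. A minimal sketch (the condition type names come from the Kubernetes API; the function and sample data are illustrative):

```python
def resize_phase(pvc_status: dict) -> str:
    """Classify where a PVC resize stands, based on its status conditions.

    - "Resizing": controller-side (backend) expansion still in progress
    - "FileSystemResizePending": backend done, node-side expansion pending
    - "Complete": no resize conditions remain
    """
    types = {c.get("type") for c in pvc_status.get("conditions", [])}
    if "Resizing" in types:
        return "Resizing"
    if "FileSystemResizePending" in types:
        return "FileSystemResizePending"
    return "Complete"

# Example: backend resize finished, waiting on kubelet to grow the filesystem
status = {"conditions": [{"type": "FileSystemResizePending", "status": "True"}]}
print(resize_phase(status))  # FileSystemResizePending
```

Pass it the `.status` object from `kubectl get pvc <pvc-name> -o json`; "FileSystemResizePending" means the fix belongs on the node, not the backend.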
7. Manual Filesystem Expansion
If automatic filesystem expansion fails, you can expand the filesystem manually.
# Stop the pod
kubectl scale deployment <deployment-name> --replicas=0
# Debug the node where the volume is attached; the debug pod may need privileged
# access to operate on the node's block devices
kubectl debug node/<node-name> -it --image=ubuntu -- bash
# Find the volume device
lsblk
# For ext4 filesystems
resize2fs /dev/<device>
# For XFS filesystems
xfs_growfs /dev/<device>
# Scale back up
kubectl scale deployment <deployment-name> --replicas=<count>
8. Handle Failed Resize Recovery
If the resize is stuck in a failed state, you may need to reset it.
# Check the current PVC conditions
kubectl get pvc <pvc-name> -o yaml | grep -A10 conditions
# If the resize is stuck, you can try editing the PVC back to the original size
# then requesting the resize again
kubectl patch pvc <pvc-name> -p '{"spec":{"resources":{"requests":{"storage":"<original-size>"}}}}'
# Then request the new size again
kubectl patch pvc <pvc-name> -p '{"spec":{"resources":{"requests":{"storage":"<new-size>"}}}}'
Note: Reducing a PVC's requested size is rejected by the API server unless the cluster enables the RecoverVolumeExpansionFailure feature gate, and even then only down to the current actual capacity. If reverting is not allowed, you may need to work with the storage backend directly to complete the resize.
9. Migrate Data to a New Volume
If resize is not possible (driver limitation, corruption, etc.), create a new larger PVC and migrate data. Scale down workloads using the old PVC first; a ReadWriteOnce volume cannot be attached to the migration job while a pod on another node holds it.
# Create a new PVC with the desired size
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <pvc-name>-new
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: <storage-class>
  resources:
    requests:
      storage: <new-size>
EOF
# Wait for the new PVC to be bound
kubectl get pvc <pvc-name>-new -w
# Use a job to copy data from old PVC to new PVC
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: data-migration
spec:
  template:
    spec:
      containers:
        - name: migrate
          image: ubuntu
          command: ["bash", "-c", "cp -a /old/. /new/"]
          volumeMounts:
            - name: old
              mountPath: /old
            - name: new
              mountPath: /new
      volumes:
        - name: old
          persistentVolumeClaim:
            claimName: <pvc-name>
        - name: new
          persistentVolumeClaim:
            claimName: <pvc-name>-new
      restartPolicy: Never
EOF
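Before deleting the old PVC, it is worth verifying the copy. One way is to compare the two directory trees by relative path and content hash, from a pod that mounts both PVCs. A minimal, illustrative Python sketch (the /old and /new mount paths are placeholders matching the job above):

```python
import hashlib
from pathlib import Path

def tree_digest(root: str) -> dict:
    """Map each file's path relative to root to a SHA-256 of its contents."""
    base = Path(root)
    return {
        str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }

def trees_match(old_root: str, new_root: str) -> bool:
    """True when both trees contain the same files with identical contents."""
    return tree_digest(old_root) == tree_digest(new_root)
```

For example, `trees_match("/old", "/new")` should return True once the migration job has finished. (This reads every file; for large volumes a size-and-count comparison may be a more practical first pass.)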
10. Verify the Resize Completed
After resolution, confirm the PVC has the correct capacity.
# Check PVC capacity matches requested size
kubectl get pvc <pvc-name> -o custom-columns=NAME:.metadata.name,REQUESTED:.spec.resources.requests.storage,ACTUAL:.status.capacity.storage
# Verify no resize conditions remain
kubectl get pvc <pvc-name> -o jsonpath='{.status.conditions}'
# Verify filesystem size inside the pod
kubectl exec <pod-name> -- df -h <mount-path>
The resize is complete when the PVC's actual capacity matches the requested size, no Resizing or FileSystemResizePending conditions remain, and the filesystem inside the pod reflects the new size.
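These completion criteria can be folded into one programmatic check over the PVC object as returned by kubectl get pvc -o json. An illustrative sketch (the field paths come from the Kubernetes API; the function name is made up, and plain string equality assumes requested and actual sizes use the same unit):

```python
def resize_complete(pvc: dict) -> bool:
    """True when the requested size equals the actual capacity and no
    Resizing / FileSystemResizePending conditions remain on the PVC."""
    requested = pvc["spec"]["resources"]["requests"]["storage"]
    actual = pvc.get("status", {}).get("capacity", {}).get("storage")
    conditions = {c.get("type") for c in pvc.get("status", {}).get("conditions", [])}
    return requested == actual and not conditions & {"Resizing", "FileSystemResizePending"}
```

If the two sizes might be expressed in different units (e.g. 20Gi vs 20480Mi), normalize them to bytes before comparing rather than relying on string equality.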
How to Explain This in an Interview
I would explain the volume expansion workflow: when a PVC's requested size is increased, the resize controller (for CSI volumes, the external-resizer sidecar) detects the change and triggers the storage backend to resize the underlying volume. Once the backend reports success, the kubelet expands the filesystem when the volume is next mounted (or online if supported). I'd discuss the two-phase process — controller-side resize and node-side filesystem expansion — and how each can fail independently. I'd cover the allowVolumeExpansion field on StorageClass, the CSI driver's volume expansion capability, and the differences between offline and online expansion. I'd also mention that shrinking is never supported and that some providers have cooldown periods between resize operations.
Prevention
- Enable allowVolumeExpansion on StorageClasses that may need resizing
- Monitor PVC usage and expand proactively before hitting capacity
- Verify the CSI driver supports volume expansion before relying on it
- Test volume expansion procedures in non-production environments
- Document any provider-specific resize limitations (cooldowns, max sizes)