Kubernetes ErrImagePull
Causes and Fixes
ErrImagePull indicates the kubelet's first attempt to pull a container image has failed. If the pull continues to fail, Kubernetes transitions the pod to ImagePullBackOff. This error surfaces immediately and points to issues with the image reference, registry credentials, or network access.
Symptoms
- Pod status shows ErrImagePull in kubectl get pods output
- kubectl describe pod shows 'Failed to pull image' in events
- Pod quickly transitions from ErrImagePull to ImagePullBackOff on repeated failures
- Container never starts and restart count stays at zero
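A quick way to surface every affected pod is to filter the STATUS column of `kubectl get pods`. The sketch below runs that filter over sample output (the pod names and output are assumed for illustration); on a real cluster you would pipe the live `kubectl get pods -A` output into the same awk filter:

```shell
# Sample `kubectl get pods -A` output, assumed for illustration
sample='NAMESPACE   NAME       READY   STATUS             RESTARTS   AGE
default     web-7f9c   0/1     ErrImagePull       0          30s
default     api-5d2b   1/1     Running            0          5m
batch       job-9k1x   0/1     ImagePullBackOff   0          2m'

# Keep only pods whose STATUS is a pull failure, printed as namespace/name
printf '%s\n' "$sample" | awk '$4 ~ /^(ErrImagePull|ImagePullBackOff)$/ {print $1 "/" $2}'
```

Matching both states in one pass matters because a pod flaps between them as the kubelet retries.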
Common Causes
- Typo in the image name, tag, or registry URL
- Missing or invalid imagePullSecrets for a private registry
- Network, TLS, or firewall issues between the node and the registry
- Image not published for the node's CPU architecture
Step-by-Step Troubleshooting
1. Get the Error Details
Start by identifying which pod is affected and what image it is trying to pull.
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
In the Events section, look for the specific error message. Common patterns include:
Failed to pull image "nginx:latst": rpc error: code = NotFound desc = failed to pull and unpack image
Failed to pull image "private.registry.io/app:v1": unexpected status code 401 Unauthorized
The error message is your most important clue for identifying the root cause.
2. Validate the Image Reference
Check for typos in the image name, tag, or registry URL.
# See the exact image reference
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].image}'
# Verify it exists (from a machine with registry access)
docker manifest inspect nginx:latst
crane digest private.registry.io/app:v1
Common image reference mistakes:
- Typos in the image name or tag (nginx:latst instead of nginx:latest)
- Missing registry prefix for private images
- Wrong port for a private registry (registry:5000 vs registry:443)
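To spot these mistakes quickly, it helps to split the reference into its parts and eyeball each one. The helper below is a minimal sketch, not a full OCI reference parser; the defaults mirror what the container runtime assumes when a part is omitted:

```shell
# Split an image reference into registry, repository, and tag (sketch only)
parse_image_ref() {
  ref="$1"; registry="docker.io"; rest="$ref"; tag="latest"
  case "$ref" in
    */*)
      # The first path component is a registry only if it looks like a host
      case "${ref%%/*}" in
        *.*|*:*|localhost)
          registry="${ref%%/*}"
          rest="${ref#*/}" ;;
      esac ;;
  esac
  case "$rest" in
    *:*) tag="${rest##*:}"; rest="${rest%:*}" ;;
  esac
  printf 'registry=%s repository=%s tag=%s\n' "$registry" "$rest" "$tag"
}

parse_image_ref "nginx:latst"
parse_image_ref "private.registry.io:5000/app:v1"
```

Seeing `tag=latst` printed on its own line makes the typo much harder to miss than it is inside a long pod spec.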
3. Test Registry Authentication
If the image is in a private registry, verify authentication.
# Check if imagePullSecrets are configured
kubectl get pod <pod-name> -o jsonpath='{.spec.imagePullSecrets[*].name}'
# Verify the secret exists and is valid
kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq .
# Test authentication from your machine
docker login private.registry.io
docker pull private.registry.io/app:v1
If the secret is missing or expired, create a new one:
kubectl create secret docker-registry my-registry-cred \
--docker-server=private.registry.io \
--docker-username=<user> \
--docker-password=<token> \
-n <namespace>
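The command above stores a JSON document under the secret's .dockerconfigjson key. A sketch of what that document contains, built with assumed demo credentials, is useful when you decode a secret and need to judge whether it looks right:

```shell
# Demo credentials (assumed); real values come from your registry provider
user="demo-user"; token="demo-token"

# kubectl stores a base64(user:password) pair per registry under .auths
auth=$(printf '%s:%s' "$user" "$token" | base64)
printf '{"auths":{"private.registry.io":{"auth":"%s"}}}\n' "$auth"

# Decoding the auth field recovers the original user:password pair
printf '%s' "$auth" | base64 -d; echo
```

If the decoded auth field shows a stale token or the wrong registry hostname as the key, that is your 401.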
4. Check TLS and Certificate Issues
If your registry uses self-signed certificates, the container runtime may reject the connection.
# Debug from a node
kubectl debug node/<node-name> -it --image=alpine -- sh
apk add curl
curl -v https://private.registry.io/v2/
For containerd, add the registry's CA certificate to the node's trust store or configure it in /etc/containerd/certs.d/:
# /etc/containerd/certs.d/private.registry.io/hosts.toml
[host."https://private.registry.io"]
ca = "/etc/containerd/certs.d/private.registry.io/ca.crt"
5. Verify Network Connectivity from the Node
The node running the pod must be able to reach the registry.
# Check which node the pod is scheduled on
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeName}'
# Debug from that node
kubectl debug node/<node-name> -it --image=busybox -- sh
# Test DNS
nslookup private.registry.io
# Test HTTPS connectivity
wget -O /dev/null https://private.registry.io/v2/
Check for:
- NetworkPolicies blocking egress from the node
- Firewall rules preventing outbound HTTPS (port 443)
- Proxy settings required but not configured on the container runtime
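When probing with curl from a debug pod, the exit code alone narrows the failure class before you read any output. A small helper makes the triage explicit; the mappings below come from curl's documented EXIT CODES list:

```shell
# Map common curl exit codes to the failure class they usually indicate
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "DNS resolution failed: check node DNS config" ;;
    7)  echo "connection refused: check firewall rules and registry port" ;;
    28) echo "timeout: check NetworkPolicies, proxies, and routes" ;;
    60) echo "TLS certificate not trusted: see the certificate step above" ;;
    *)  echo "curl exited $1: see the EXIT CODES section of man curl" ;;
  esac
}

# Usage: curl -sS -o /dev/null https://private.registry.io/v2/; explain_curl_exit $?
explain_curl_exit 6
```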
6. Check for Architecture Mismatches
If you run a mixed-architecture cluster (amd64 and arm64 nodes), ensure the image supports the target platform.
# Check the node architecture
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.architecture}'
# Check image platforms
docker manifest inspect --verbose nginx:latest | jq '.[].Descriptor.platform'
crane manifest nginx:latest | jq '.manifests[].platform'
If the image does not support the node architecture, either build a multi-arch image or use a nodeSelector to pin the pod to a compatible node.
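If jq is unavailable on the machine you are working from, the same check can be approximated with grep over the manifest list. The JSON below is an assumed sample of what `crane manifest` returns for a multi-arch image, included so the pattern is reproducible:

```shell
# Assumed sample of a multi-arch manifest list (crane manifest <image>)
manifest='{"manifests":[
  {"platform":{"architecture":"amd64","os":"linux"}},
  {"platform":{"architecture":"arm64","os":"linux"}}]}'

node_arch="arm64"   # from .status.nodeInfo.architecture

if printf '%s' "$manifest" | grep -q "\"architecture\":\"$node_arch\""; then
  echo "image supports $node_arch"
else
  echo "no $node_arch variant: build multi-arch or pin with a nodeSelector"
fi
```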
7. Check Container Runtime Logs
If the above steps do not reveal the issue, check the container runtime logs on the node.
# For containerd
journalctl -u containerd --since "10 minutes ago" | grep -i "pull\|error"
# For CRI-O
journalctl -u crio --since "10 minutes ago" | grep -i "pull\|error"
8. Apply the Fix
Once you identify the root cause, apply the appropriate fix:
# Fix image reference
kubectl set image deployment/<name> <container>=correct-image:tag
# Add imagePullSecrets (most pod spec fields are immutable on a running pod,
# so patch the controller's template rather than the pod itself)
kubectl patch deployment <name> -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"my-cred"}]}}}}'
# For deployments, update the template
kubectl edit deployment <name>
9. Confirm Resolution
# Watch the pod transition to Running
kubectl get pods -w
# Verify the image was pulled
kubectl describe pod <pod-name> | grep "Successfully pulled"
The pod events should show Successfully pulled image and the status should move to Running.
How to Explain This in an Interview
I would explain that ErrImagePull is the initial failure state before ImagePullBackOff kicks in. They represent the same underlying problem — a failed image pull — but at different stages of the retry cycle. I would describe my approach: first verify the image reference and tag, then check credentials and network, and finally look at runtime-level issues like TLS trust. In production, I prevent this by pinning digests, using admission webhooks to validate images, and pre-pulling critical images.
Prevention
- Validate image references in CI/CD pipelines before deploying
- Use image digest pinning instead of mutable tags
- Configure imagePullSecrets on ServiceAccounts for private registries
- Set up registry mirrors or caches for reliability
- Use admission controllers like Kyverno to enforce image policies
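For digest pinning, the mutable tag is replaced with the digest the registry reports, so a later push to the same tag cannot change what nodes pull. The sketch below uses a placeholder digest; in practice you would resolve it with `crane digest`:

```shell
image="private.registry.io/app:v1"
# Placeholder digest; in practice: digest=$(crane digest "$image")
digest="sha256:0000000000000000000000000000000000000000000000000000000000000000"

# Drop the tag (text after the last colon) and append @digest.
# Caveat: this naive strip misfires on refs with a registry port but no tag.
pinned="${image%:*}@${digest}"
echo "$pinned"
```

The resulting `repo@sha256:...` form is what you commit to your manifests; the tag stays useful for humans in documentation.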