Kubernetes NetworkUnavailable

Causes and Fixes

NetworkUnavailable is a node condition indicating that the network for the node is not correctly configured. This typically means the CNI plugin has not yet set up networking on the node, or the network plugin has failed. Pods scheduled on the node cannot communicate with the cluster network.

Symptoms

  • Node condition NetworkUnavailable shows as True in kubectl describe node
  • Pods on the node are stuck in ContainerCreating state
  • Pod events show 'network not ready: NetworkReady=false reason:NetworkPluginNotReady'
  • Inter-pod communication fails for pods on the affected node
  • Newly joined nodes cannot run workloads despite being in Ready state
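A quick way to surface the second symptom is to filter kubectl get pods output for pods stuck in ContainerCreating. A minimal sketch (the helper name is hypothetical; the awk filter assumes the default column order of kubectl get pods -A --no-headers, where the fourth field is STATUS):

```shell
# Hypothetical helper: print namespace/pod for pods stuck in ContainerCreating.
# Assumes the default `kubectl get pods -A --no-headers` column order:
# NAMESPACE NAME READY STATUS RESTARTS AGE
stuck_pods() {
  awk '$4 == "ContainerCreating" {print $1 "/" $2}'
}

# Usage against a live cluster:
#   kubectl get pods -A --no-headers | stuck_pods
```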

Common Causes

1. CNI plugin not installed: The cluster was set up without a CNI plugin such as Calico, Flannel, or Cilium. Kubernetes requires a CNI plugin to configure pod networking.
2. CNI plugin crashed or failed: The CNI plugin DaemonSet pod on the affected node has crashed, been evicted, or is in an error state, leaving the node without a functioning network layer.
3. CNI configuration missing or corrupt: The CNI configuration files in /etc/cni/net.d/ are missing, malformed, or reference a binary that does not exist in /opt/cni/bin/.
4. Node network interface issue: The underlying host network interface is down, misconfigured, or has lost connectivity to the cluster's network fabric.
5. Cloud provider route configuration failure: In cloud environments, the cloud controller manager failed to program routes for the node's pod CIDR, preventing cross-node pod communication.

Step-by-Step Troubleshooting

The NetworkUnavailable condition means that a node's pod network has not been properly initialized. This is almost always related to the Container Network Interface (CNI) plugin. This guide walks through systematic diagnosis from the node condition down to the CNI configuration.

1. Identify Affected Nodes

Start by listing all nodes and their network status.

kubectl get nodes -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,NETWORK:.status.conditions[?(@.type=="NetworkUnavailable")].status'

If the NETWORK column shows True for any node, that node has the NetworkUnavailable condition. Describe the node for more detail.

kubectl describe node <node-name>

Look at the Conditions section for the NetworkUnavailable entry and its message, which often contains a clue like "NetworkPluginNotReady" or "no CNI configuration."

2. Check if a CNI Plugin Is Installed

The most common cause of NetworkUnavailable is simply not having a CNI plugin deployed.

# Check for common CNI DaemonSets
kubectl get daemonset -n kube-system

# Look for pods like calico-node, kube-flannel, cilium, weave-net
kubectl get pods -n kube-system -l k8s-app=calico-node
kubectl get pods -n kube-system -l app=flannel
kubectl get pods -n kube-system -l k8s-app=cilium

If no CNI plugin is found, you need to install one. For example, to install Calico:

kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml

3. Verify CNI Plugin Pod Health on the Affected Node

If a CNI plugin is installed, check whether its pod is running on the affected node.

# Find CNI pods on the specific node
kubectl get pods -n kube-system --field-selector spec.nodeName=<node-name> | grep -E 'calico|flannel|cilium|weave'

# Check the CNI pod's logs
kubectl logs -n kube-system <cni-pod-name> --tail=100

# If the CNI pod has multiple containers, check all of them
kubectl logs -n kube-system <cni-pod-name> -c install-cni --tail=50
kubectl logs -n kube-system <cni-pod-name> -c calico-node --tail=50

Common issues include the CNI pod being in CrashLoopBackOff or ImagePullBackOff, or stuck in an Init phase. The logs will reveal the specific failure.

4. Inspect CNI Configuration on the Node

SSH into the affected node or use a debug pod to check the CNI configuration files.

kubectl debug node/<node-name> -it --image=ubuntu -- bash

# Check CNI configuration directory
ls -la /host/etc/cni/net.d/

# Read the CNI configuration
cat /host/etc/cni/net.d/10-calico.conflist
# or
cat /host/etc/cni/net.d/10-flannel.conflist

If the directory is empty, the CNI plugin has not written its configuration yet. This usually means the CNI install init container has not completed.
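If files are present but you suspect corruption, a rough structural sanity check can help before reading the JSON by eye. A sketch, assuming a standard conflist layout with cniVersion, name, and a plugins array (this is a grep-level check only, not full JSON validation):

```shell
# Rough sanity check: a well-formed .conflist should contain these keys.
check_cni_conf() {
  grep -q '"cniVersion"' "$1" &&
  grep -q '"name"' "$1" &&
  grep -q '"plugins"' "$1"
}

# Usage from the debug pod:
#   check_cni_conf /host/etc/cni/net.d/10-calico.conflist && echo "looks sane"
```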

5. Verify CNI Binaries

The CNI configuration references binary plugins that must be present on the node.

# From the debug pod
ls -la /host/opt/cni/bin/

# Verify the binaries referenced in the CNI config exist
# Common required binaries: bridge, host-local, loopback, portmap, bandwidth

If binaries are missing, the CNI plugin's init container may have failed to copy them. Check the init container logs of the CNI DaemonSet pod.
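To cross-check the configuration against the binaries mechanically, you can extract every plugin "type" field from the conflist and test for a matching executable. A grep-based sketch (the helper name is hypothetical, and it assumes standard conflist JSON where each plugin declares a "type" key):

```shell
# Hypothetical checker: print plugin types from a conflist that have no
# matching executable in the CNI bin directory.
missing_cni_binaries() {
  conf="$1"; bindir="$2"
  grep -o '"type"[[:space:]]*:[[:space:]]*"[^"]*"' "$conf" |
    sed 's/.*"\([^"]*\)"$/\1/' |
    while read -r plugin; do
      [ -x "$bindir/$plugin" ] || echo "$plugin"
    done
}

# Usage from the debug pod:
#   missing_cni_binaries /host/etc/cni/net.d/10-calico.conflist /host/opt/cni/bin
```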

6. Check Kubelet Logs for CNI Errors

The kubelet is responsible for invoking the CNI plugin when creating pod sandboxes.

# Via SSH, or from a kubectl debug pod after running `chroot /host`
journalctl -u kubelet --no-pager --since "30 minutes ago" | grep -i cni

# Look for messages like:
# "Unable to update cni config"
# "Network plugin not ready"
# "cni config uninitialized"

The kubelet logs will show exactly which step of CNI initialization is failing and often include the specific error from the CNI binary.

7. Check Cloud Controller Manager (Cloud Environments)

In cloud-managed clusters (AWS, GCP, Azure), the cloud controller manager is responsible for programming routes so that each node's pod CIDR is routable.

# Check cloud controller manager pods
kubectl get pods -n kube-system | grep cloud-controller

# Check its logs for route programming errors
kubectl logs -n kube-system <cloud-controller-pod> --tail=100 | grep -i route

# Verify node's pod CIDR assignment
kubectl get node <node-name> -o jsonpath='{.spec.podCIDR}'

If the podCIDR is empty, the controller manager has not assigned a CIDR to the node. This can happen if the cluster's pod CIDR range is exhausted or the controller manager does not have proper cloud API permissions.
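To see CIDR assignment across the whole cluster at once, you can print every node's podCIDR and flag the empty ones. A sketch (the helper name is hypothetical; the awk filter assumes tab-separated name/CIDR pairs, as produced by the jsonpath range shown in the usage comment):

```shell
# Hypothetical helper: print nodes whose podCIDR is unset.
missing_podcidr() {
  awk -F'\t' 'NF < 2 || $2 == "" {print $1}'
}

# Usage against a live cluster:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}' | missing_podcidr
```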

8. Verify Underlying Host Networking

Sometimes the issue is with the host's network interface itself.

# From the debug pod (it shares the host network namespace; run chroot /host if the image lacks these tools) or via SSH
ip addr show
ip route show

# Check if the node can reach other nodes
ping <other-node-ip>

# Check if the node can reach the API server
curl -k https://<api-server-ip>:6443/healthz

# Check for network interface errors
ip -s link show

A downed interface, missing default route, or firewall rules blocking overlay traffic (typically UDP port 4789 for Calico's VXLAN, UDP port 8472 for flannel's VXLAN, or IP protocol 4 for IPIP) can prevent CNI plugins from establishing the overlay network.
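One quick check is whether the overlay's UDP port is actually bound on the node. A sketch that filters ss -lun output for a given port (the helper name is hypothetical; 4789 is Calico's VXLAN default and 8472 is flannel's, so adjust for your plugin):

```shell
# Hypothetical filter: succeed if the given UDP port appears as a bound
# local socket in `ss -lun` output.
port_bound() {
  grep -qE "[:.]$1([[:space:]]|$)"
}

# Usage on the node (or after chroot /host in a debug pod):
#   ss -lun | port_bound 8472 && echo "flannel VXLAN port bound"
```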

9. Restart the CNI Plugin on the Affected Node

If the CNI configuration and binaries look correct but the plugin is not functioning, restart the CNI pod on the affected node.

# Delete the CNI pod on the affected node — the DaemonSet will recreate it
kubectl delete pod -n kube-system <cni-pod-name>

# Watch for the new pod to come up
kubectl get pods -n kube-system --field-selector spec.nodeName=<node-name> -w

After the CNI pod restarts and completes its initialization, it should clear the NetworkUnavailable condition on the node.

10. Manually Clear the Condition (Last Resort)

In rare cases where the CNI plugin is working but the condition was not properly cleared, you can patch it manually. Do this only after verifying that networking is actually functional. Node status is a subresource, so the patch must target it explicitly with --subresource=status (kubectl v1.24+); without that flag, changes to .status are silently dropped by the API server. Use the default strategic merge patch so the conditions list is merged by its type key instead of replaced wholesale.

kubectl patch node <node-name> --subresource=status --type=strategic -p '
{
  "status": {
    "conditions": [
      {
        "type": "NetworkUnavailable",
        "status": "False",
        "reason": "ManuallyCleared",
        "message": "Network verified as functional"
      }
    ]
  }
}'

Note that this is a temporary workaround. If the underlying issue persists, the condition may be set back to True by the CNI plugin or cloud controller.
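If you script remediation, it helps to wait for the condition to flip rather than checking once. A minimal polling sketch (the function name, attempt count, and interval are placeholders; assumes kubectl access to the cluster):

```shell
# Hypothetical wait loop: return 0 once NetworkUnavailable reads False,
# return 1 after ~30 attempts at 5-second intervals.
wait_network_ready() {
  node="$1"
  i=0
  while [ "$i" -lt 30 ]; do
    status=$(kubectl get node "$node" \
      -o jsonpath='{.status.conditions[?(@.type=="NetworkUnavailable")].status}')
    [ "$status" = "False" ] && return 0
    i=$((i + 1))
    sleep 5
  done
  return 1
}

# Usage:
#   wait_network_ready <node-name> && echo "network condition cleared"
```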

11. Verify Resolution

After remediation, confirm the node's network is functional.

# Check the node condition
kubectl get node <node-name> -o jsonpath='{.status.conditions[?(@.type=="NetworkUnavailable")]}'

# Schedule a test pod on the node
kubectl run net-test --image=busybox --overrides='{"apiVersion":"v1","spec":{"nodeName":"<node-name>"}}' --restart=Never -- sleep 3600

# Verify the pod gets an IP and can communicate
kubectl get pod net-test -o wide
kubectl exec net-test -- nslookup kubernetes.default.svc.cluster.local
# The API service listens on HTTPS port 443; even a TLS error or 403 response
# proves network reachability
kubectl exec net-test -- wget -qO- --no-check-certificate https://kubernetes.default.svc.cluster.local/healthz

# Clean up the test pod
kubectl delete pod net-test

If the test pod receives an IP, can resolve DNS, and can reach the Kubernetes API service, the node's network is fully functional and the issue is resolved.

How to Explain This in an Interview

I would explain that NetworkUnavailable is set and cleared by the network plugin or the cloud controller manager's route controller, not by the kubelet itself. When a node joins the cluster, the condition remains True until a network plugin (or, in cloud environments, the route controller) clears it by setting up the node's network. I'd discuss the CNI specification, how the kubelet invokes CNI plugins during pod sandbox creation, and the role of the cloud-controller-manager in programming routes. For troubleshooting, I'd check the CNI DaemonSet pods, inspect /etc/cni/net.d/ for configuration, verify CNI binaries, and review kubelet and CNI logs.

Prevention

  • Always install a CNI plugin before joining nodes to the cluster
  • Monitor CNI DaemonSet health and alert on unavailable or crash-looping pods
  • Rely on the node.kubernetes.io/network-unavailable taint to keep workloads off nodes until networking is confirmed
  • Automate CNI deployment as part of cluster bootstrap
  • Test network plugin upgrades in staging before applying to production

Related Errors