Kubernetes Connection Refused
Causes and Fixes
Connection Refused errors in Kubernetes occur when a TCP connection attempt to a service, pod, or API endpoint is actively rejected. Unlike a timeout, where the connection attempt hangs, a refused connection means the target host is reachable but nothing is listening on the specified port (or a firewall rule is explicitly rejecting the traffic).
Symptoms
- Error messages showing 'connection refused' or 'ECONNREFUSED'
- curl or wget returns 'Connection refused' with the target IP and port
- Application logs show failed connections to dependent services
- Readiness or liveness probes fail with connection refused
- kubectl commands fail with 'connection refused' to the API server
Step-by-Step Troubleshooting
Connection refused is one of the most specific network errors — it tells you that the target IP is reachable but nothing is accepting connections on the specified port. This narrows the debugging considerably compared to timeouts or generic network errors.
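The distinction can be reproduced with a short, self-contained sketch (port 39999 is a hypothetical port assumed to be unused on the machine running it):

```python
import socket

PORT = 39999  # hypothetical port, assumed unused on this machine

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(2)
try:
    s.connect(("127.0.0.1", PORT))  # reachable host, nothing listening
    outcome = "connected"
except ConnectionRefusedError:      # the kernel answered the SYN with a RST
    outcome = "refused"
except socket.timeout:              # a timeout would mean dropped packets, no RST
    outcome = "timed out"
finally:
    s.close()

print(outcome)
```

The refusal is immediate because the target's kernel sends a RST in response to the SYN; a timeout, by contrast, only fires after the full timeout period with no response at all.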
1. Identify What Is Being Connected To
First, determine the exact target of the failed connection.
# If the error comes from a pod, check its logs
kubectl logs <pod-name> --tail=50
# Look for the specific host:port in error messages
# Examples:
# "dial tcp 10.96.45.12:5432: connect: connection refused"
# "ECONNREFUSED 10.244.1.5:8080"
Note the IP address and port. Determine whether the IP is a Service ClusterIP, a pod IP, or a node IP.
# Check if it is a Service ClusterIP
kubectl get services --all-namespaces | grep <ip-address>
# Check if it is a pod IP
kubectl get pods --all-namespaces -o wide | grep <ip-address>
2. Test Direct Pod Connectivity
If the target is a pod, verify whether the application is listening on the expected port.
# Get the pod IP
kubectl get pod <target-pod> -o wide
# Exec into the target pod and check listening ports
kubectl exec <target-pod> -- ss -tlnp
# or
kubectl exec <target-pod> -- netstat -tlnp
Look at the output to confirm:
- The application is listening on the expected port
- The bind address is 0.0.0.0 (all interfaces) or :: (all IPv6 interfaces), NOT 127.0.0.1
If the application binds to 127.0.0.1, it is only reachable over the pod's loopback interface. Other pods and Services cannot reach it.
# Example: application listening on localhost only
# Proto Recv-Q Send-Q Local Address Foreign Address State
# tcp 0 0 127.0.0.1:8080 0.0.0.0:* LISTEN <-- Problem!
# What you need:
# tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN <-- Correct
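The difference between the two listing addresses comes down to the address passed to bind(). A minimal sketch (port 0 asks the OS for any free port, so this is safe to run anywhere):

```python
import socket

# Bound to loopback: only reachable from inside this network namespace
loopback_only = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
loopback_only.bind(("127.0.0.1", 0))
loopback_only.listen(1)

# Bound to all interfaces: reachable on every interface, including the pod IP
all_interfaces = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
all_interfaces.bind(("0.0.0.0", 0))
all_interfaces.listen(1)

# These addresses are what `ss -tlnp` shows in its Local Address column:
lb_addr = loopback_only.getsockname()[0]    # '127.0.0.1'  <-- other pods get refused
any_addr = all_interfaces.getsockname()[0]  # '0.0.0.0'    <-- correct for Kubernetes
print(lb_addr, any_addr)

loopback_only.close()
all_interfaces.close()
```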
3. Check Container Port Configuration
Verify the container's port specification matches the application.
kubectl get pod <target-pod> -o jsonpath='{range .spec.containers[*]}{.name}: {range .ports[*]}{.containerPort}/{.protocol} {end}{"\n"}{end}'
The containerPort field in the pod spec is informational: it does not restrict which ports the container can listen on. It should still match the application's actual listening port, both for clarity and so that a Service targetPort can reference the port by name.
4. Verify Service Port Mapping
If the connection goes through a Service, check the port mapping chain.
kubectl describe service <service-name>
The chain is: client connects to Service Port -> forwarded to targetPort on the pod. If targetPort is wrong, the connection reaches the pod but hits a port with nothing listening.
# Verify the full chain
kubectl get service <service-name> -o jsonpath='Port: {.spec.ports[0].port} -> TargetPort: {.spec.ports[0].targetPort}'
# Verify the pod is actually listening on the targetPort
kubectl exec <pod-name> -- ss -tlnp | grep <target-port>
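One way to keep this chain consistent is to name the container port and reference it by name from the Service, so the mapping survives a port renumbering. A sketch (names and port numbers are illustrative):

```yaml
# Pod/Deployment side: name the port the application actually listens on
containers:
  - name: api
    ports:
      - name: http          # referenced by the Service below
        containerPort: 8080
---
# Service side: targetPort by name instead of by number
apiVersion: v1
kind: Service
spec:
  ports:
    - port: 80          # what clients connect to
      targetPort: http  # resolves to containerPort 8080 on the pod
```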
5. Check if the Application Has Started
Connection refused during the initial startup period is normal if the application takes time to initialize.
# Check pod age and restart count
kubectl get pod <target-pod>
# Check container state
kubectl get pod <target-pod> -o jsonpath='{.status.containerStatuses[0].state}' | jq .
# Check logs for startup progress
kubectl logs <target-pod> --tail=30
If the application needs significant startup time, configure a startup probe to prevent premature traffic routing and liveness kills.
startupProbe:
httpGet:
path: /healthz
port: 8080
failureThreshold: 30
periodSeconds: 10
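Why startup-time refusals happen can be seen at the socket level: until the application actually calls listen(), the kernel answers SYNs on that port with a RST. A self-contained sketch simulating a slow-starting server:

```python
import socket
import threading
import time

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port reserved, but not listening yet
port = server.getsockname()[1]

def slow_start():
    time.sleep(1.0)             # simulated initialization work
    server.listen(1)            # only now are connections accepted

threading.Thread(target=slow_start, daemon=True).start()

def probe():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(2)
    try:
        s.connect(("127.0.0.1", port))
        return "ok"
    except ConnectionRefusedError:
        return "refused"
    finally:
        s.close()

early = probe()      # during startup: connection refused
time.sleep(1.5)
late = probe()       # after listen(): accepted
print(early, late)
```

A startup probe gives Kubernetes the same patience this sketch shows: keep probing, and only route traffic (or count liveness failures) once the process is actually listening.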
6. Check for Application Crashes
The application may have crashed, leaving a brief window where the container is running but the process is not listening.
# Check restart count
kubectl get pod <target-pod> -o jsonpath='{.status.containerStatuses[0].restartCount}'
# Check if the container recently restarted
kubectl describe pod <target-pod> | grep -A5 "Last State"
# Check previous logs
kubectl logs <target-pod> --previous
Frequent restarts indicate the application is crashing repeatedly. Investigate the crash cause from the logs.
7. Debug API Server Connection Refused
If kubectl itself returns connection refused, the issue is with the API server.
# Check the API server endpoint in kubeconfig
kubectl config view --minify | grep server
# Test direct connectivity
curl -k https://<api-server-ip>:6443/healthz
# If running inside the cluster, check the kubernetes service
kubectl get service kubernetes -n default
For self-managed clusters, check the API server pod or process.
# Check API server pod
kubectl get pods -n kube-system | grep apiserver
# Check API server systemd service (if running as a service)
# SSH to the control plane node
systemctl status kube-apiserver
journalctl -u kube-apiserver -n 50
8. Fix the Binding Address
If the application is binding to localhost, update the application configuration or container command to bind to all interfaces.
# Common fix patterns for different applications:
# For Node.js: change server.listen(8080, '127.0.0.1') to server.listen(8080, '0.0.0.0')
# For Python Flask: change app.run(host='127.0.0.1') to app.run(host='0.0.0.0')
# For Java: change ServerSocket(8080, 50, InetAddress.getByName("localhost")) to ServerSocket(8080)
# You can also override via environment variable or command args in the pod spec
kubectl set env deployment/<deployment-name> HOST=0.0.0.0
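The env-var override pattern looks like this on the application side: the server reads the bind address from the environment and defaults to all interfaces, so the bind can be changed without rebuilding the image. HOST is a hypothetical variable name; match whatever your application actually reads.

```python
import os
import socket

# HOST is a hypothetical env var; default to all interfaces if unset.
host = os.environ.get("HOST", "0.0.0.0")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((host, 0))   # port 0 only to keep the sketch runnable anywhere
server.listen(1)
bound = server.getsockname()[0]
print("listening on", bound)
server.close()
```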
9. Test Connectivity After Fixes
Verify the connection works end-to-end.
# Test direct pod connectivity
kubectl run conn-test --image=busybox --restart=Never --rm -it -- wget -qO- --timeout=5 http://<pod-ip>:<port>/
# Test through the Service
kubectl run conn-test --image=busybox --restart=Never --rm -it -- wget -qO- --timeout=5 http://<service-name>:<service-port>/
10. Verify Resolution
Confirm the error is resolved and connections are stable.
# Check pod is running and ready
kubectl get pod <target-pod> -o wide
# Verify the application is listening correctly
kubectl exec <target-pod> -- ss -tlnp
# Check service endpoints are populated
kubectl get endpoints <service-name>
# Run a sustained connectivity test
kubectl run conn-verify --image=busybox --restart=Never --rm -it -- sh -c 'for i in $(seq 1 10); do wget -qO- --timeout=2 http://<service-name>:<port>/ && echo "OK $i" || echo "FAIL $i"; sleep 1; done'
Consistent successful connections confirm the issue is resolved. If connections fail intermittently, some pods may be healthy while others are not — check readiness across all replicas.
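The sustained check can also be scripted as a function that counts successful TCP connects. The sketch below runs against a local listener so it is self-contained; in a real cluster you would point it at the Service IP and port instead:

```python
import socket

def sustained_check(host, port, attempts=10):
    """Return how many of `attempts` TCP connects to host:port succeed."""
    ok = 0
    for _ in range(attempts):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(2)
        try:
            s.connect((host, port))
            ok += 1
        except OSError:          # covers refused, timeout, unreachable
            pass
        finally:
            s.close()
    return ok

# Local stand-in for the Service endpoint:
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(16)
port = listener.getsockname()[1]

result = sustained_check("127.0.0.1", port)
print(f"{result}/10 connections succeeded")
listener.close()
```

Anything less than a perfect score against a Service suggests some endpoints are unhealthy even if others answer.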
How to Explain This in an Interview
I would explain the difference between connection refused and connection timeout — refused means a TCP RST was received, indicating the target is reachable but not accepting connections on that port, while timeout means the target is unreachable or packets are being dropped. I'd walk through the layers where this can happen: the application binding address, the container port configuration, the Service targetPort, and network-level reachability. I'd emphasize checking what address and port the application binds to inside the container, using netstat or ss, and verifying that the container port matches the Service targetPort.
Prevention
- Configure applications to listen on 0.0.0.0 rather than 127.0.0.1
- Use startup probes for applications with long initialization times
- Document and verify port conventions in container specifications
- Test connectivity as part of CI/CD deployment pipelines
- Match containerPort declarations with actual application binding ports