Kubernetes Exit Code 1

Causes and Fixes

Exit code 1 is the most common non-zero exit code and indicates a general application error. The container's main process encountered an unhandled exception, failed assertion, or explicit error exit. This is a catch-all code that applications use when something goes wrong during execution.
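For instance, in Python an uncaught exception causes the interpreter itself to exit with code 1, which is exactly what Kubernetes then records as the container's exit code. A minimal demonstration:

```python
import subprocess
import sys

# Run a tiny script whose exception is never caught;
# CPython prints the traceback to stderr and exits with code 1.
result = subprocess.run(
    [sys.executable, "-c", "raise RuntimeError('startup failed')"],
    capture_output=True,
    text=True,
)
print(result.returncode)                 # 1
print("RuntimeError" in result.stderr)   # True
```

The traceback printed to stderr is what you would later recover with kubectl logs --previous.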

Symptoms

  • Pod status shows CrashLoopBackOff or Error
  • Container exit code is 1 in kubectl describe pod output
  • Application logs show errors, exceptions, or stack traces
  • Container restarts repeatedly with increasing backoff delay

Common Causes

1. Unhandled exception in application code: the application threw an exception that was not caught, causing the runtime to exit with code 1. Check the application logs for the stack trace.
2. Missing or invalid configuration: the application cannot find required configuration files or environment variables, or has invalid settings. Verify ConfigMaps and Secrets.
3. Failed dependency connection: the application tries to connect to a database, message queue, or external service on startup and exits if the connection fails.
4. Invalid command-line arguments: the application was started with invalid flags or arguments. Check the command and args in the pod spec.
5. File not found or permission denied: the application tries to read a file that does not exist or cannot be accessed with the container's user permissions.
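Several of these causes, missing configuration in particular, come down to the application deliberately calling exit(1) after a failed startup check. A sketch of a fail-fast configuration check in Python, with hypothetical variable names (DATABASE_URL, APP_CONFIG):

```python
import os
import sys

# Hypothetical required settings for illustration.
REQUIRED_VARS = ["DATABASE_URL", "APP_CONFIG"]

def missing_settings(environ=os.environ):
    """Return the names of required settings that are absent."""
    return [name for name in REQUIRED_VARS if name not in environ]

def validate_or_exit():
    """Exit with code 1 and a clear message if required settings are missing."""
    missing = missing_settings()
    if missing:
        # A descriptive message here is what you hope to find in
        # `kubectl logs --previous` when debugging exit code 1.
        print(f"FATAL: missing required settings: {', '.join(missing)}",
              file=sys.stderr)
        sys.exit(1)

# Demonstrate against an empty environment rather than the real one:
print(missing_settings(environ={}))  # ['DATABASE_URL', 'APP_CONFIG']
```

Failing fast with a named, loggable reason turns a vague exit code 1 into a one-line diagnosis.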

Step-by-Step Troubleshooting

1. Confirm the Exit Code

kubectl describe pod <pod-name>

Look for:

Last State:     Terminated
  Reason:       Error
  Exit Code:    1

2. Read the Application Logs

This is the most important step: the application should have logged why it exited.

# Get logs from the crashed container
kubectl logs <pod-name> --previous

# For multi-container pods
kubectl logs <pod-name> -c <container-name> --previous

# Get logs with timestamps to see the sequence of events
kubectl logs <pod-name> --previous --timestamps

# If you need more context, get all available logs
kubectl logs <pod-name> --previous --tail=-1

Common error patterns:

# Python
Traceback (most recent call last):
  File "app.py", line 42, in main
    connect_db()
ConnectionRefusedError: [Errno 111] Connection refused

# Java
Exception in thread "main" java.lang.RuntimeException: Failed to initialize
    at com.example.App.main(App.java:15)

# Node.js
Error: Cannot find module '/app/config.json'
    at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)

# Go
panic: runtime error: invalid memory address or nil pointer dereference

3. Check Environment Variables and Configuration

# List environment variables available to the container
kubectl exec <pod-name> -- env 2>/dev/null || kubectl set env pod/<pod-name> --list

# Check if referenced ConfigMaps exist
kubectl get configmap -n <namespace>

# Check if referenced Secrets exist
kubectl get secrets -n <namespace>

# Verify specific ConfigMap contents
kubectl get configmap <name> -o yaml

4. Check Dependency Connectivity

If the logs show connection failures, verify the dependencies are accessible.

# Run a debug pod in the same namespace
kubectl run debug --image=busybox --restart=Never --command -- sleep 3600

# Test database connectivity
kubectl exec -it debug -- nc -zv <db-host> <db-port>

# Test service DNS resolution
kubectl exec -it debug -- nslookup <service-name>.<namespace>.svc.cluster.local

# Test HTTP endpoint
kubectl exec -it debug -- wget -q -O - http://<service>:<port>/health

# Clean up
kubectl delete pod debug
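On the application side, the usual fix for a failed dependency connection is to retry with backoff rather than exit on the first failure, which would otherwise turn normal startup ordering into a crash loop. A sketch using Python's standard library (the host and port values are illustrative):

```python
import socket
import time

def wait_for_dependency(host, port, attempts=5, base_delay=0.1):
    """Retry a TCP connection with exponential backoff; True on success."""
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            # Back off: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
    return False

# Example: nothing listens on port 1, so every attempt fails.
print(wait_for_dependency("127.0.0.1", 1, attempts=2, base_delay=0.01))  # False
```

If the dependency never comes up, exiting with code 1 after the retries is still correct; the retries just give it a fair chance first.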

5. Check File Mounts and Permissions

If the application expects files that may not be present:

# Check volume mounts
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[0].volumeMounts}' | jq .

# Exec into the container (if it stays up long enough)
kubectl exec <pod-name> -- ls -la /path/to/expected/file

# Check with a debug container
kubectl debug <pod-name> -it --image=busybox --target=<container-name> -- sh
ls -la /proc/1/root/path/to/expected/file

6. Reproduce Locally

Run the same image with the same configuration locally to get faster feedback.

# Pull the image
docker pull <image>

# Run with similar environment variables
docker run --rm \
  -e DATABASE_URL=postgres://localhost:5432/db \
  -e APP_CONFIG=/etc/app/config.yaml \
  <image>

This gives you full control to iterate quickly on the fix.

7. Debug with a Modified Pod

Override the container command to keep it alive for investigation.

# Run the image with sleep to inspect the environment
kubectl run debug-app --image=<image> --restart=Never --command -- sleep 3600

# Exec in and try running the application manually
kubectl exec -it debug-app -- sh

# Inside the container:
# Check files, environment, connectivity
env
ls -la /app/
cat /etc/app/config.yaml

# Try running the app
/app/start.sh
# Observe the error output

8. Check for Init Container Failures

If the main container depends on setup done by init containers, check if they completed successfully.

# Check init container statuses
kubectl get pod <pod-name> -o jsonpath='{.status.initContainerStatuses}' | jq .

# Read logs from a failed init container
kubectl logs <pod-name> -c <init-container-name>

If an init container failed or created incomplete data, the main container may exit with code 1.

9. Apply the Fix

Based on your findings:

# Fix configuration
kubectl edit configmap <config-name>

# Fix environment variables
kubectl set env deployment/<deploy-name> DATABASE_URL=postgres://db:5432/mydb

# Fix the image (if it is a code bug)
kubectl set image deployment/<deploy-name> <container>=<fixed-image>

# Restart the deployment
kubectl rollout restart deployment/<deploy-name>

10. Verify the Fix

# Watch the pod
kubectl get pods -w

# Check logs for successful startup
kubectl logs <pod-name>

# Verify restart count is stable
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].restartCount}'

The container should start, log a successful initialization, and remain running.

How to Explain This in an Interview

I would explain that exit code 1 is an application-level error — the most generic error code that programs use. The debugging approach starts with reading the container logs (kubectl logs --previous), which usually contain the error message or stack trace. Unlike signal-based exit codes (137, 139, 143), exit code 1 is set by the application itself, so the fix is almost always in the application code or its configuration. I would emphasize the importance of structured logging and health check endpoints for faster diagnosis.

Prevention

  • Implement comprehensive error handling and logging in applications
  • Validate configuration at startup and provide clear error messages
  • Use init containers to verify dependencies before the main container starts
  • Test configuration in staging environments before production deployment
  • Use structured logging to make errors easy to search and alert on
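As one example of the structured-logging point, emitting fatal errors as single JSON lines makes them trivially searchable and alertable in a log aggregator. A minimal sketch (the field names are illustrative):

```python
import json
import sys

def fatal_record(message, **fields):
    """Build a structured log record for a fatal error as one JSON line."""
    return json.dumps({"level": "fatal", "message": message, **fields})

# In a real application you would call sys.exit(1) right after emitting
# this line; it is then the last thing `kubectl logs --previous` shows.
print(fatal_record("cannot read config file", path="/etc/app/config.yaml"),
      file=sys.stderr)
```

Searching logs for level="fatal" then surfaces the exit reason directly, instead of forcing you to parse free-form tracebacks.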

Related Errors