Kubernetes Exit Code 139

Causes and Fixes

Exit code 139 means the container process received SIGSEGV (signal 11, segmentation fault). The formula is 128 + 11 = 139. A segmentation fault occurs when the process tries to access memory it is not allowed to, typically due to a bug in the application code, a corrupt binary, or incompatible native libraries.
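The arithmetic is easy to verify locally without a cluster: a process killed by SIGSEGV is reported with status 128 + 11. A minimal shell sketch:

```shell
# Send SIGSEGV to a child shell and capture its exit status
sh -c 'kill -s SEGV $$' || status=$?
echo "$status"   # prints 139 (128 + signal 11)
```

The same convention applies to other fatal signals: SIGKILL (9) yields 137, SIGABRT (6) yields 134.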

Symptoms

  • Pod shows Error or CrashLoopBackOff with exit code 139
  • kubectl describe pod shows exit code 139 in terminated state
  • Container logs may be empty or show partial output before the crash
  • Core dump may be generated if configured
  • The crash may be intermittent and hard to reproduce

Common Causes

1. Bug in native code
   A null pointer dereference, buffer overflow, or use-after-free in C/C++ code or a native extension causes a segfault.

2. Incompatible shared libraries
   The binary was compiled against a different version of a shared library than what is installed in the container. Common with glibc vs musl (Alpine).

3. Corrupt binary or memory
   The binary file is corrupt (bad download, truncated image layer) or the node has faulty RAM.

4. Stack overflow
   Deep recursion or large stack allocations cause the process to exceed its stack size limit.

5. Architecture mismatch
   Running an x86 binary on ARM or vice versa can cause segfaults in addition to exec format errors.

Step-by-Step Troubleshooting

1. Confirm the Exit Code

kubectl describe pod <pod-name>

Look for:

Last State:     Terminated
  Reason:       Error
  Exit Code:    139

2. Check Container Logs

Segfaults often produce no application-level logs because the crash is abrupt.

kubectl logs <pod-name> --previous --timestamps

You might see partial output that ends abruptly, or you might see a message like:

Segmentation fault (core dumped)
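The abrupt cut-off is reproducible locally: anything the process would have printed after the fatal signal never appears, which is why the last log line often gives no hint of the crash site. A minimal sketch:

```shell
# Output stops at the point of the crash; the final printf never runs
out=$(sh -c 'printf "before crash\n"; kill -s SEGV $$; printf "after crash\n"') || true
echo "$out"   # prints only "before crash"
```

Buffered stdout makes this worse in real applications: lines written shortly before the crash may still be sitting in a userspace buffer and get lost too.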

3. Check for Library Compatibility Issues

This is the most common cause of segfaults in containerized applications.

# Run the image with a shell
kubectl run debug --image=<image> --restart=Never --command -- sleep 3600
kubectl exec -it debug -- sh

# Check shared library dependencies
ldd /app/binary

# Look for "not found" libraries
ldd /app/binary | grep "not found"

# Check the base image C library
ls -la /lib/ld-*
# Alpine uses musl: /lib/ld-musl-x86_64.so.1
# Debian/Ubuntu uses glibc: /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2

If the binary was built on a glibc-based system but the runtime image uses Alpine (musl), segfaults are likely.
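The loader check above can be scripted. This sketch classifies the C library of whatever filesystem it runs on by probing the usual loader paths; the glob patterns are assumptions covering common architectures:

```shell
# Probe well-known dynamic loader paths to classify the C library
libc=unknown
if ls /lib/ld-musl-*.so.1 >/dev/null 2>&1; then
  libc=musl
elif ls /lib/*/ld-linux-*.so.* >/dev/null 2>&1 || ls /lib64/ld-linux-*.so.* >/dev/null 2>&1; then
  libc=glibc
fi
echo "$libc"
```

Run it inside the debug pod from the previous snippet to confirm which family the runtime image belongs to.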

Fix: Use the same base image family

# Build and run on the same family
FROM debian:bookworm-slim AS builder
# ... build here ...

FROM debian:bookworm-slim
COPY --from=builder /app/binary /app/binary

Fix: Static linking

# For Go
CGO_ENABLED=0 go build -o /app/binary .

# For C/C++
gcc -static -o /app/binary main.c

4. Check for Architecture Mismatch

# Check node architecture
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.architecture}'

# Check binary architecture
kubectl exec -it debug -- file /app/binary

If file reports ELF 64-bit LSB executable, x86-64 but the node architecture is arm64, the mismatch can cause segfaults (typically when the binary runs under emulation; without emulation you usually see an exec format error instead).
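One wrinkle when comparing the two outputs: file and uname report machine names (x86_64, aarch64), while Kubernetes reports Go-style architecture names (amd64, arm64). A small translation sketch:

```shell
# Map the kernel's machine name to the architecture name Kubernetes reports
machine=$(uname -m)
case "$machine" in
  x86_64)        k8s_arch=amd64 ;;
  aarch64|arm64) k8s_arch=arm64 ;;
  armv7l)        k8s_arch=arm ;;
  *)             k8s_arch="$machine" ;;
esac
echo "$k8s_arch"
```

Comparing the translated value against the node's .status.nodeInfo.architecture avoids a false mismatch between x86_64 and amd64.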

5. Enable Core Dumps

Configure the container to generate core dumps for detailed analysis. Note that /proc/sys/kernel/core_pattern is a node-wide kernel setting: writing it from inside a container requires a privileged security context, and the change affects every pod on that node.

spec:
  containers:
    - name: app
      image: myapp:v1
      command: ["/bin/sh", "-c"]
      args:
        - |
          ulimit -c unlimited
          # Writing core_pattern requires a privileged container;
          # without privilege this line fails (read-only /proc/sys)
          echo '/tmp/core.%p' > /proc/sys/kernel/core_pattern || true
          exec /app/binary
      volumeMounts:
        - name: coredumps
          mountPath: /tmp
  volumes:
    - name: coredumps
      emptyDir:
        sizeLimit: "1Gi"
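Before relying on a file landing in /tmp, check what the node's core_pattern currently is; on many distros it pipes cores to a handler such as systemd-coredump instead of writing a plain file:

```shell
# core_pattern is node-wide; the same value is visible from inside any container
cat /proc/sys/kernel/core_pattern
```

If the value starts with a pipe character (|), cores go to the named handler and you must retrieve them with that tool (for example coredumpctl on systemd nodes) rather than from /tmp.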

After a crash, copy the core dump for analysis:

kubectl cp <pod-name>:/tmp/core.<pid> ./core.dump

# Analyze with gdb: you need a local copy of the same binary
# (ideally with debug symbols) alongside the core dump
kubectl cp <pod-name>:/app/binary ./binary

gdb ./binary ./core.dump
(gdb) bt  # Print backtrace

6. Check for Node Hardware Issues

If segfaults happen to multiple different applications on the same node, the node may have faulty RAM.

# Check which node
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeName}'

# Check kernel messages for hardware errors
kubectl debug node/<node-name> -it --image=ubuntu -- bash -c "dmesg | grep -iE 'hardware|mce|error|memory'"

# Check if other pods on the same node are also segfaulting
kubectl get pods --field-selector spec.nodeName=<node-name> -o wide

If the node has hardware issues, cordon and drain it:

kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

7. Debug with Address Sanitizer

If you can rebuild the application, use address sanitizer to detect the exact memory bug.

# For C/C++ (Dockerfile)
FROM gcc:13
COPY . /app
WORKDIR /app
RUN gcc -fsanitize=address -g -o binary main.c
CMD ["./binary"]

# For Go: the race detector catches data races, not memory bugs in general;
# segfault-causing bugs in cgo code can be caught with -asan (Go 1.18+)
CGO_ENABLED=1 go build -race -o /app/binary .

ASAN will print detailed information about the memory violation, including the source file and line number.

8. Check for Stack Overflow

If the application uses deep recursion:

# Check the current stack size limit (ulimit is a shell builtin,
# so it must be run through a shell)
kubectl exec <pod-name> -- sh -c 'ulimit -s'
# Default is usually 8192 (8MB)
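The ulimit mechanics can be checked locally: each process carries its own soft limit, and lowering it never needs privileges (raising it beyond the hard limit does). A quick sketch:

```shell
# Each shell gets its own soft stack limit, reported in KB
sh -c 'ulimit -s'                    # commonly 8192 (8 MB); varies by system
sh -c 'ulimit -s 2048; ulimit -s'    # prints 2048: the soft limit was lowered per-process
```

This is why the fix below sets the limit in the container's entrypoint shell just before exec'ing the application: the limit is inherited by the process that replaces the shell.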

# Increase the stack size (the soft limit can only be raised up to the
# hard limit configured by the container runtime)
spec:
  containers:
    - name: app
      command: ["/bin/sh", "-c"]
      args: ["ulimit -s unlimited && exec /app/binary"]

9. Test on a Different Node

Isolate whether the issue is node-specific or application-specific.

# Run on a specific different node
kubectl run test-pod --image=<image> --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"nodeName":"<different-node>"}}'

kubectl get pod test-pod -w

10. Verify the Fix

# Deploy the fixed image
kubectl set image deployment/<deploy-name> <container>=<fixed-image>

# Watch for stability
kubectl get pods -w

# Verify no segfaults
kubectl describe pod <pod-name> | grep "Exit Code"

# Check logs for successful operation
kubectl logs <pod-name> --tail=20

The application should run without segmentation faults.

How to Explain This in an Interview

I would explain that exit code 139 = 128 + 11 = SIGSEGV, which means a segmentation fault. This is typically a code-level bug in C/C++ or native library code, not a Kubernetes configuration issue. Debugging requires examining core dumps, checking library compatibility, and verifying the binary was built for the correct architecture. I would mention that segfaults in higher-level languages (Python, Java, Go) usually indicate a bug in native extensions or FFI calls rather than the managed code itself.

Prevention

  • Run address sanitizer (ASAN) in CI to catch memory bugs
  • Use the same base image for building and running native code
  • Pin shared library versions in the Dockerfile
  • Build multi-arch images for mixed-architecture clusters
  • Run memory testing tools (memtest86) on nodes with frequent segfaults

Related Errors