What Are Security Contexts in Kubernetes?

intermediate | security, devops, sre, CKA
TL;DR

Security contexts define privilege and access control settings for Pods and containers. They control the user and group IDs a process runs as, whether privilege escalation is allowed, Linux capabilities, read-only root filesystems, and seccomp/AppArmor profiles.

Detailed Answer

Pod-Level Security Context

The spec.securityContext field applies settings to all containers in the Pod:

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:latest

Key Pod-level fields:

| Field | Purpose |
|---|---|
| runAsUser | UID for all container processes |
| runAsGroup | Primary GID for all container processes |
| fsGroup | GID that owns mounted volumes which support ownership management; also added to the processes' supplementary groups |
| runAsNonRoot | Refuse to start any container that would run as UID 0 (root) |
| seccompProfile | Seccomp profile for all containers |
| supplementalGroups | Additional GIDs for the container processes |

Container-Level Security Context

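The securityContext field on an individual container provides finer control. Where a field exists at both levels (for example runAsUser), the container-level value takes precedence; capabilities, allowPrivilegeEscalation, and readOnlyRootFilesystem can only be set per container: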

apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
spec:
  securityContext:
    runAsNonRoot: true
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:latest
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
          add:
            - NET_BIND_SERVICE
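
To see which values sit at each level on a running Pod, both fields can be read back with kubectl's JSONPath output. The helper below is a minimal sketch rather than an official tool: the function name show_security_contexts is hypothetical, and it assumes kubectl is configured for the cluster and the Pod lives in the current namespace.

```shell
# Print the Pod-level securityContext, then each container's own securityContext.
# Assumption: recent kubectl versions print nested objects as JSON.
show_security_contexts() {
  pod="$1"
  echo "pod-level:"
  kubectl get pod "$pod" -o jsonpath='{.spec.securityContext}{"\n"}'
  echo "container-level:"
  kubectl get pod "$pod" \
    -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.securityContext}{"\n"}{end}'
}

# Usage: show_security_contexts hardened-pod
```

For hardened-pod, runAsNonRoot, fsGroup, and seccompProfile should appear under pod-level, while runAsUser, capabilities, and the other per-container fields appear next to the app container.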

Key Security Settings Explained

runAsNonRoot

Ensures the container does not run as UID 0 (root). The kubelet checks the effective UID when it starts the container, using runAsUser if set and otherwise the user declared in the image:

securityContext:
  runAsNonRoot: true
  runAsUser: 65534  # 'nobody' user

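If the image is configured to run as root and runAsNonRoot: true is set without specifying runAsUser, the container is never started and the Pod reports CreateContainerConfigError. The same happens when the image declares its user by name rather than by number, because the kubelet cannot verify that the name maps to a non-zero UID.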
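
When this happens, the cause is recorded in the container's waiting state. The following is a small sketch for surfacing it, assuming kubectl access to the cluster; the function name waiting_reasons is hypothetical. With current kubelets the message typically says that the container has runAsNonRoot and the image will run as root.

```shell
# Print name, waiting reason, and waiting message for each container in a Pod.
# The reason and message columns are empty for containers that are not waiting.
waiting_reasons() {
  kubectl get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state.waiting.reason}{"\t"}{.state.waiting.message}{"\n"}{end}'
}

# Usage: waiting_reasons secure-pod
```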

allowPrivilegeEscalation

Controls whether a process can gain more privileges than its parent:

securityContext:
  allowPrivilegeEscalation: false

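This sets the Linux no_new_privs flag on the container process, so setuid or setgid binaries and file capabilities cannot raise its privileges on execve. Kubernetes treats the setting as always true when the container is privileged or has CAP_SYS_ADMIN, so it is only effective alongside those restrictions. It is a core part of defense in depth.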
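
The flag can be observed from inside a running container, because the kernel exposes it as the NoNewPrivs field in /proc/<pid>/status (Linux 4.10 and later). The check below is a minimal sketch, assuming kubectl access and an image that ships cat; no_new_privs is a hypothetical helper name. A value of 1 means the flag is set.

```shell
# Show the no_new_privs flag of PID 1 in the Pod's default container.
# cat runs inside the container; grep runs locally, so the image only needs cat.
no_new_privs() {
  kubectl exec "$1" -- cat /proc/1/status | grep '^NoNewPrivs'
}

# Usage: no_new_privs production-app
```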

readOnlyRootFilesystem

Mounts the container's root filesystem read-only, so processes cannot create or modify files anywhere except in explicitly mounted writable volumes:

securityContext:
  readOnlyRootFilesystem: true

Applications that need to write temporary files should use emptyDir volumes:

containers:
  - name: app
    image: myapp:latest
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
      - name: tmp
        mountPath: /tmp
      - name: cache
        mountPath: /var/cache
volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}

Linux Capabilities

Linux capabilities break the monolithic root privilege into discrete units. By default, the container runtime grants each container a limited set of them rather than full root privilege. Best practice is to drop all of them and add back only what the workload needs:

securityContext:
  capabilities:
    drop:
      - ALL
    add:
      - NET_BIND_SERVICE  # Bind to ports < 1024

Common capabilities:

| Capability | Purpose |
|---|---|
| NET_BIND_SERVICE | Bind to privileged ports (< 1024) |
| NET_RAW | Use raw sockets (needed for ping) |
| SYS_PTRACE | Trace processes (debugging) |
| SYS_ADMIN | Broad system administration (avoid in production) |
| CHOWN | Change file ownership |
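
The kernel reports a process's capabilities as hexadecimal bitmasks (the CapPrm, CapEff, and CapBnd lines in /proc/<pid>/status), which are hard to read directly. The function below is a minimal sketch that decodes such a mask into names using the bit positions from linux/capability.h; the name decode_caps is an assumption, and capabilities beyond CHECKPOINT_RESTORE (bit 40) are not covered. On hosts with libcap installed, capsh --decode=<mask> does the same job.

```shell
# Decode a capability bitmask (hex digits, no 0x prefix) into capability names.
decode_caps() {
  mask=$((0x$1))
  bit=0
  out=""
  # Names are listed in bit order, starting at bit 0 (CAP_CHOWN).
  for name in CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID \
      SETUID SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN \
      NET_RAW IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE \
      SYS_PACCT SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME \
      SYS_TTY_CONFIG MKNOD LEASE AUDIT_WRITE AUDIT_CONTROL SETFCAP \
      MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM BLOCK_SUSPEND AUDIT_READ \
      PERFMON BPF CHECKPOINT_RESTORE; do
    # Append the name when the corresponding bit is set in the mask.
    if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
      out="${out:+$out }$name"
    fi
    bit=$((bit + 1))
  done
  echo "$out"
}

decode_caps 0000000000000400   # prints: NET_BIND_SERVICE
```

One thing to keep in mind when reading the output: for a process running as a non-root UID, CapEff is typically all zeros even when capabilities were added, and the added ones show up only in the bounding set (CapBnd).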

Seccomp Profiles

Seccomp (secure computing mode) restricts which system calls a container can make:

securityContext:
  seccompProfile:
    type: RuntimeDefault  # Use the container runtime's default profile

Options:

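  • RuntimeDefault: Uses the container runtime's default seccomp profile, which blocks a few dozen dangerous or rarely needed syscalls (the exact list depends on the runtime)
  • Localhost: Uses a custom profile file stored on the node; localhostProfile is a path relative to the kubelet's seccomp directory
  • Unconfined: No seccomp filtering (not recommended)

# Custom seccomp profile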
securityContext:
  seccompProfile:
    type: Localhost
    localhostProfile: profiles/my-app-seccomp.json
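
A Localhost profile is a JSON file in the format used by Docker and OCI runtimes. The sketch below writes a deliberately small, purely illustrative profile that allows everything except three syscalls. The function name write_seccomp_profile and the choice of denied syscalls are assumptions; a production profile would more often be an allowlist derived from the syscalls the application is observed to make. The file has to exist on every node that can run the Pod, under the kubelet's seccomp directory (/var/lib/kubelet/seccomp by default).

```shell
# Write an illustrative seccomp profile to <dir>/profiles/my-app-seccomp.json.
# <dir> stands in for the kubelet seccomp directory on a node.
write_seccomp_profile() {
  dir="$1"
  mkdir -p "$dir/profiles"
  cat > "$dir/profiles/my-app-seccomp.json" <<'EOF'
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["ptrace", "mount", "reboot"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
EOF
}

# Usage on a node (as root): write_seccomp_profile /var/lib/kubelet/seccomp
```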

A Fully Hardened Pod Example

apiVersion: v1
kind: Pod
metadata:
  name: production-app
spec:
  automountServiceAccountToken: false
  securityContext:
    runAsNonRoot: true
    runAsUser: 10000
    runAsGroup: 10000
    fsGroup: 10000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:v2.1.0
      ports:
        - containerPort: 8080
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 500m
          memory: 256Mi
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir:
        sizeLimit: 100Mi

Verifying Security Contexts

# Check what user a container is running as
kubectl exec production-app -- id
# uid=10000 gid=10000 groups=10000

# Check capabilities
kubectl exec production-app -- cat /proc/1/status | grep Cap

# Verify read-only filesystem
kubectl exec production-app -- touch /test-file
# touch: /test-file: Read-only file system

# Check seccomp status
kubectl exec production-app -- cat /proc/1/status | grep Seccomp
# Seccomp: 2 (filter mode)

Common Pitfalls

Many container images run as root by default, and with runAsNonRoot: true their containers fail to start. Either rebuild the image with a non-root user or set a non-zero runAsUser in the security context; in the latter case, check that the chosen UID can still read and write the files the application needs. Always test security context changes in a staging environment before applying them to production.

Why Interviewers Ask This

Interviewers ask this to determine if you can harden container workloads and apply the principle of least privilege in production Kubernetes environments.

Common Follow-Up Questions

What is the difference between Pod-level and container-level security contexts?
Pod-level settings apply to all containers and init containers. Container-level settings override Pod-level settings for that specific container.
What does allowPrivilegeEscalation: false do?
It prevents the container process from gaining more privileges than its parent. It sets the no_new_privs flag in the Linux kernel.
Why should you drop all capabilities and add only what is needed?
Containers inherit a default set of capabilities. Dropping all and adding specific ones reduces the attack surface to the minimum required.

Key Takeaways

  • Always run containers as non-root with allowPrivilegeEscalation: false
  • Drop ALL capabilities and add only what is needed
  • Use readOnlyRootFilesystem to prevent writes to the container's root filesystem, and mount emptyDir volumes for paths that must be writable