Kubernetes Security Best Practices — A Comprehensive Guide
Security Is Not Optional
Kubernetes clusters are high-value targets. They run your critical workloads, store your secrets, and often have broad network access. A single misconfigured RBAC binding or an overly permissive pod can give an attacker access to your entire infrastructure.
This guide covers the security controls that matter most, with configurations you can apply to real clusters.
RBAC Fundamentals
Role-Based Access Control is how you control who can do what in your cluster. RBAC has four building blocks:
- Role: Grants permissions within a single namespace
- ClusterRole: Grants permissions cluster-wide
- RoleBinding: Connects a Role to a user/group/service account within a namespace
- ClusterRoleBinding: Connects a ClusterRole to a user/group/service account cluster-wide
The Principle of Least Privilege in Practice
Start with zero permissions and add only what's needed. Here's an RBAC setup for a developer who needs to view and manage deployments in their team's namespace, but nothing else:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: team-alpha
rules:
# Can manage deployments and view their status
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
# Can view pods and their logs for debugging
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
# Can view services and configmaps
- apiGroups: [""]
  resources: ["services", "configmaps"]
  verbs: ["get", "list", "watch"]
# Cannot delete anything, cannot access secrets, cannot exec into pods
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
- kind: User
  name: alice@company.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io
Notice what's not included: no delete verb, no access to secrets, no pods/exec. Each of these would need to be explicitly granted.
Auditing RBAC
Regularly review who has access to what:
# What can a specific user do?
kubectl auth can-i --list --as=alice@company.com -n team-alpha
# Can this service account create pods? (common privilege escalation vector)
kubectl auth can-i create pods --as=system:serviceaccount:default:my-sa
# Find all ClusterRoleBindings — these are the most dangerous
kubectl get clusterrolebindings -o custom-columns=\
NAME:.metadata.name,\
ROLE:.roleRef.name,\
SUBJECTS:.subjects[*].name
# Find anything bound to cluster-admin (the superuser role)
kubectl get clusterrolebindings -o json | \
jq '.items[] | select(.roleRef.name=="cluster-admin") | .subjects[]'
Common RBAC Mistakes
Granting cluster-admin to service accounts: This is the most dangerous mistake. If a pod using that service account is compromised, the attacker owns the cluster.
Using the default service account: Every namespace has a default service account. If you don't specify one, pods use it. If you've granted permissions to default, every pod in that namespace inherits them.
# Always create dedicated service accounts
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false  # Don't mount token unless needed
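A workload then opts into the dedicated account explicitly; if you never reference it, pods silently fall back to default. A minimal sketch (pod and image names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: production
spec:
  serviceAccountName: my-app             # dedicated account, not "default"
  automountServiceAccountToken: false    # reaffirmed at the pod level
  containers:
  - name: app
    image: myregistry.com/myapp:v1.2.3
```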
Granting escalate or bind verbs: These allow a user to create role bindings with more permissions than they have. This is equivalent to admin access.
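To make the risk concrete, here is a sketch of a rule combination to watch for in audits. A subject holding both of these can bind any ClusterRole, including cluster-admin, to itself:

```yaml
# DANGEROUS: together these let the holder bind any role,
# including cluster-admin, to any subject (themselves included)
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["clusterrolebindings"]
  verbs: ["create"]
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["clusterroles"]
  verbs: ["bind"]
```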
Wildcards in RBAC rules: Never use * for verbs or resources in production:
# DANGEROUS — grants everything
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
Pod Security Standards
Pod Security Standards (PSS) replace the deprecated PodSecurityPolicy. They define three levels:
- Privileged: Unrestricted. For system-level workloads that need full host access.
- Baseline: Prevents known privilege escalations. Blocks hostNetwork, hostPID, privileged containers, but allows running as root.
- Restricted: Heavily restricted. Requires non-root, read-only root filesystem, drops all capabilities.
Apply them per-namespace using labels:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Enforce restricted standard — reject non-compliant pods
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also surface restricted violations as warnings and audit events
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
A pod that meets the restricted standard:
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myregistry.com/myapp:v1.2.3@sha256:abc123...
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsUser: 1000
      runAsGroup: 1000
      capabilities:
        drop:
        - ALL
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
  serviceAccountName: my-app
  automountServiceAccountToken: false
Every field in this spec is intentional:
- runAsNonRoot: true — prevents containers from running as UID 0
- allowPrivilegeEscalation: false — prevents setuid binaries from escalating privileges
- readOnlyRootFilesystem: true — stops attackers from writing malware to the filesystem
- capabilities.drop: ALL — removes all Linux capabilities (the default set includes NET_RAW, MKNOD, etc.)
- seccompProfile: RuntimeDefault — applies the default seccomp profile that blocks dangerous syscalls
- Image with digest — immutable reference, can't be swapped by a registry compromise
- automountServiceAccountToken: false — no Kubernetes API access from within the pod
Security Contexts In Depth
Security contexts configure privilege and access control at the pod and container level. Understanding the specific fields matters for both security and interviews.
Linux Capabilities
Capabilities break root privilege into discrete units. Instead of a binary root/non-root distinction, you can grant specific powers:
securityContext:
  capabilities:
    drop:
    - ALL
    add:
    - NET_BIND_SERVICE  # Bind to ports below 1024
Common capabilities and when you might need them:
| Capability | Purpose | Risk |
|---|---|---|
| NET_BIND_SERVICE | Bind to ports < 1024 | Low — limited scope |
| NET_RAW | Create raw sockets (ping) | Medium — enables packet spoofing |
| SYS_PTRACE | Trace processes | High — can read process memory |
| SYS_ADMIN | Broad sysadmin operations | Critical — essentially root |
The default container capability set includes about 14 capabilities. Dropping ALL and adding back only what you need is the safest approach.
User and Group IDs
spec:
  securityContext:
    runAsUser: 1000   # UID for all containers
    runAsGroup: 1000  # Primary GID
    fsGroup: 2000     # GID for volume mounts — files created in volumes get this group
  containers:
  - name: app
    securityContext:
      runAsUser: 1001  # Container-level override
The fsGroup field is particularly important for persistent volumes. Without it, your non-root container might not be able to write to mounted volumes because the filesystem ownership doesn't match.
Read-Only Root Filesystem
Forcing a read-only root filesystem prevents many attack techniques (writing web shells, modifying binaries, dumping data to disk). But applications often need to write to certain paths:
containers:
- name: app
  securityContext:
    readOnlyRootFilesystem: true
  volumeMounts:
  - name: tmp
    mountPath: /tmp
  - name: cache
    mountPath: /app/cache
volumes:
- name: tmp
  emptyDir:
    sizeLimit: 100Mi
- name: cache
  emptyDir:
    sizeLimit: 500Mi
Use emptyDir volumes for writable directories. They're pod-scoped and disappear when the pod is deleted.
Secrets Management
Kubernetes Secrets are base64-encoded, not encrypted. By default, they're stored unencrypted in etcd. This is the baseline you need to improve on.
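The distinction is easy to demonstrate: base64 is an encoding, not encryption, so anyone who can read the Secret object can recover the plaintext with no key at all. A minimal sketch (the encoded value is illustrative):

```python
import base64

# A value from a Secret's "data" field, as shown by `kubectl get secret -o yaml`
encoded = "czNjcmV0"

# Decoding requires no key — this is not a cryptographic protection
plaintext = base64.b64decode(encoded).decode()
print(plaintext)  # s3cret
```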
Encryption at Rest
Configure the API server to encrypt Secrets in etcd:
# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>
  - identity: {}  # fallback for reading unencrypted secrets
Pass this to the API server: --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
After enabling, re-encrypt existing secrets:
# Force re-encryption of all secrets
kubectl get secrets --all-namespaces -o json | kubectl replace -f -
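To confirm encryption actually took effect, read a secret's raw bytes straight out of etcd; an encrypted entry starts with the provider prefix instead of readable JSON. A sketch, assuming kubeadm-style certificate paths and a secret named db-pass:

```shell
ETCDCTL_API=3 etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/secrets/default/db-pass | hexdump -C

# Encrypted entries begin with: k8s:enc:aescbc:v1:key1
# If you see plain JSON instead, the secret has not been re-encrypted yet
```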
External Secret Stores
For production workloads, store secrets outside the cluster and sync them in:
External Secrets Operator (ESO) syncs secrets from AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager, Azure Key Vault, and others:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-credentials  # Kubernetes Secret name
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: production/database
      property: username
  - secretKey: password
    remoteRef:
      key: production/database
      property: password
This creates a regular Kubernetes Secret that's automatically synced from your external store. If the external secret rotates, the Kubernetes Secret updates within the refresh interval.
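The secretStoreRef above points at a separate store object holding provider configuration and auth. A minimal AWS-flavored sketch, in which the region, auth method, and service account names are all assumptions:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          # Authenticate via IRSA-style service account federation
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets
```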
Secrets Hygiene
# Never put secret values on the command line — they end up in shell history
# BAD:
kubectl create secret generic db-pass --from-literal=password=s3cret
# BETTER: Read into a file without echoing the value into your history
read -s PASS && printf '%s' "$PASS" > /tmp/password.txt
kubectl create secret generic db-pass --from-file=password=/tmp/password.txt
rm /tmp/password.txt
# BEST: Use an external secrets operator or sealed-secrets for GitOps
Avoid mounting secrets as environment variables when possible. Environment variables are visible in process listings, logged by crash reporters, and inherited by child processes. Volume mounts are safer:
containers:
- name: app
  volumeMounts:
  - name: db-credentials
    mountPath: /etc/secrets/db
    readOnly: true
volumes:
- name: db-credentials
  secret:
    secretName: db-credentials
    defaultMode: 0400  # Read-only, owner only
Network Policies for Security
Network Policies are your in-cluster firewall. Without them, any compromised pod can reach any other pod, including your database.
Zero-Trust Networking Pattern
Implement a zero-trust model: deny everything, then allow specific flows.
# 1. Deny all traffic in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# 2. Allow DNS (required for service discovery)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
---
# 3. Allow frontend to reach backend API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# 4. Allow backend to reach database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-allow-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 5432
This creates a strict traffic flow: frontend → backend → database. The frontend can't bypass the backend to reach the database directly.
Testing Network Policies
Always verify your policies work:
# Deploy a test pod
kubectl run test-client --rm -it --image=busybox -n production -- /bin/sh
# Test allowed path (should work)
wget -qO- --timeout=3 http://backend:8080/health
# Test blocked path (should timeout)
wget -qO- --timeout=3 http://database:5432
# wget: download timed out
Admission Controllers
Admission controllers intercept API requests after authentication and authorization, but before persistence. They can mutate requests (change fields) or validate them (reject non-compliant ones).
Built-in Admission Controllers
Key built-in controllers to be aware of:
- LimitRanger: Enforces default and maximum resource limits per namespace
- ResourceQuota: Prevents namespaces from consuming more than their share
- PodSecurity: Enforces Pod Security Standards (replaced PodSecurityPolicy)
- NodeRestriction: Limits what kubelets can modify (prevents a compromised node from affecting others)
- AlwaysPullImages: Forces image pulls on every pod start (prevents using cached images that might have been tampered with)
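Which built-in controllers are active is an API server flag. Checking and extending the set looks roughly like this (the exact flag value varies by distribution):

```shell
# See which admission plugins the API server currently enables
kubectl get pods -n kube-system -l component=kube-apiserver -o yaml \
  | grep enable-admission-plugins

# Enable additional plugins via the kube-apiserver flag, e.g.:
# --enable-admission-plugins=NodeRestriction,AlwaysPullImages
```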
Custom Admission Webhooks
For policies specific to your organization, use validating and mutating webhooks. Tools like Kyverno and OPA Gatekeeper make this approachable:
Kyverno example — require all pods to have resource limits:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-limits
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "All containers must have CPU and memory limits set."
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"
                cpu: "?*"
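A quick way to exercise the policy once it's installed: a pod with no limits should now be rejected at admission (image name illustrative):

```shell
kubectl run no-limits --image=nginx -n production
# Expect an admission error citing the policy message:
# "All containers must have CPU and memory limits set."
```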
Kyverno example — automatically add a security label to every new namespace:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-security-labels
spec:
  rules:
  - name: add-pod-security
    match:
      any:
      - resources:
          kinds:
          - Namespace
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            pod-security.kubernetes.io/enforce: baseline
            pod-security.kubernetes.io/warn: restricted
API Server Hardening
The API server is the most critical attack surface. Here's how to lock it down.
Authentication
- Disable anonymous access in production: --anonymous-auth=false
- Use OIDC for human users instead of client certificates. Certificates can't be easily revoked.
- Short-lived tokens: Use bound service account tokens (the default since Kubernetes 1.24), which are audience-bound and time-limited.
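When a pod genuinely needs API access, a bound token can be requested explicitly as a projected volume with its own audience and lifetime (names and values here are illustrative):

```yaml
volumes:
- name: api-token
  projected:
    sources:
    - serviceAccountToken:
        audience: https://kubernetes.default.svc  # token rejected elsewhere
        expirationSeconds: 3600                   # kubelet rotates before expiry
        path: token
```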
# Check whether anonymous auth is enabled
kubectl get pods -n kube-system -l component=kube-apiserver -o yaml | grep anonymous
Authorization
- Always use RBAC mode: --authorization-mode=Node,RBAC
- Never use ABAC in production — it requires API server restarts for policy changes.
- Remove any ClusterRoleBindings that grant permissions to system:anonymous or system:unauthenticated.
Secure Endpoints
# The API server should only be accessible on the secure port (6443)
# Ensure the insecure port is disabled (it is by default since 1.20)
# --insecure-port=0
# Restrict which IPs can reach the API server
# Use firewall rules or --advertise-address and network-level controls
Rate Limiting
Protect the API server from abuse:
# API Priority and Fairness (APF) configuration
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: restrict-list-all
spec:
  priorityLevelConfiguration:
    name: low-priority
  matchingPrecedence: 500
  rules:
  - subjects:
    - kind: Group
      group:
        name: developers
    resourceRules:
    - verbs: ["list"]
      apiGroups: ["*"]
      resources: ["*"]
      namespaces: ["*"]
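The FlowSchema above references a priority level named low-priority, which must exist as its own object. A minimal sketch, where the concurrency share value is an assumption to tune per cluster:

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: low-priority
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 5  # small slice of API server concurrency
    limitResponse:
      type: Reject               # shed excess load instead of queueing it
```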
Audit Logging
Kubernetes audit logs record every API request. They're essential for incident response, compliance, and understanding who did what.
Audit Policy
Define what to log and at what detail level:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log all requests to secrets at the Metadata level (who accessed, not contents)
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# Log all changes to RBAC at RequestResponse level (full detail)
- level: RequestResponse
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
# Log pod exec/attach at Request level
- level: Request
  resources:
  - group: ""
    resources: ["pods/exec", "pods/attach"]
# Don't log read-only requests to endpoints or services
- level: None
  resources:
  - group: ""
    resources: ["endpoints", "services"]
  verbs: ["get", "list", "watch"]
# Don't log kubelet or system health checks
- level: None
  users: ["system:kube-proxy", "kubelet"]
  verbs: ["get"]
  resources:
  - group: ""
    resources: ["nodes", "nodes/status"]
# Default: log everything else at Metadata level
- level: Metadata
The four audit levels:
| Level | What's captured |
|---|---|
| None | Nothing logged |
| Metadata | Who requested what, when, and the response code |
| Request | Metadata + the request body |
| RequestResponse | Metadata + request body + response body |
Configuring the Audit Backend
# File-based audit log
# Add to kube-apiserver flags:
# --audit-policy-file=/etc/kubernetes/audit-policy.yaml
# --audit-log-path=/var/log/kubernetes/audit.log
# --audit-log-maxage=30
# --audit-log-maxbackup=10
# --audit-log-maxsize=100
# Search audit logs for suspicious activity
# Who deleted pods in production?
cat /var/log/kubernetes/audit.log | \
jq 'select(.verb=="delete" and .objectRef.resource=="pods" and .objectRef.namespace=="production")'
# Who accessed secrets?
cat /var/log/kubernetes/audit.log | \
jq 'select(.objectRef.resource=="secrets") | {user: .user.username, verb: .verb, secret: .objectRef.name, time: .requestReceivedTimestamp}'
For production, send audit logs to a SIEM (Splunk, Elasticsearch, etc.) using a webhook backend or a log collector like Fluentd.
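The webhook backend takes a kubeconfig-format file pointing at your collector. A sketch in which the collector URL and file path are assumptions:

```yaml
# Passed via --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
apiVersion: v1
kind: Config
clusters:
- name: audit-sink
  cluster:
    # Endpoint that accepts batched audit events as JSON
    server: https://audit-collector.example.com/k8s-audit
contexts:
- name: default
  context:
    cluster: audit-sink
    user: ""
current-context: default
```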
Supply Chain Security
Securing the software supply chain prevents compromised or tampered images from running in your cluster.
Image Signing and Verification
Use cosign (from the Sigstore project) to sign and verify container images:
# Sign an image after building
cosign sign --key cosign.key myregistry.com/myapp:v1.2.3
# Verify before deploying
cosign verify --key cosign.pub myregistry.com/myapp:v1.2.3
Enforce verification in-cluster with a policy engine:
# Kyverno policy: only allow signed images
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
    verifyImages:
    - imageReferences:
      - "myregistry.com/*"
      attestors:
      - entries:
        - keys:
            publicKeys: |-
              -----BEGIN PUBLIC KEY-----
              ...
              -----END PUBLIC KEY-----
Image Provenance
Use image digests instead of tags. Tags are mutable — someone can push a different image to the same tag. Digests are immutable:
# BAD: Tag can be overwritten
image: myapp:latest
# BETTER: Specific version tag
image: myapp:v1.2.3
# BEST: Digest — cryptographically pinned
image: myapp:v1.2.3@sha256:a1b2c3d4e5f6...
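Resolving a tag to its digest can be scripted at deploy time. One option is crane from the go-containerregistry project (the tool choice and registry name are assumptions):

```shell
# Print the digest the tag currently points at
crane digest myregistry.com/myapp:v1.2.3

# Then pin it in the manifest:
# image: myregistry.com/myapp:v1.2.3@sha256:<digest>
```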
Vulnerability Scanning
Integrate scanning into your CI/CD pipeline and cluster:
# Scan an image with trivy before deployment
trivy image myregistry.com/myapp:v1.2.3
# Scan for critical vulnerabilities only
trivy image --severity CRITICAL myregistry.com/myapp:v1.2.3
# Scan running workloads
trivy k8s --report=summary cluster
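In CI, use the scanner's exit code to fail the build on findings; the severity threshold is a policy choice, shown here as an assumption:

```shell
# Non-zero exit if CRITICAL or HIGH vulnerabilities are found — fails the pipeline
trivy image --exit-code 1 --severity CRITICAL,HIGH myregistry.com/myapp:v1.2.3
```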
Private Registries
Don't pull images from public registries in production. Mirror approved images into a private registry and restrict image sources:
# Kyverno policy: only allow images from approved registries
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
  - name: validate-registries
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Images must come from myregistry.com or gcr.io/my-project"
      pattern:
        spec:
          containers:
          - image: "myregistry.com/* | gcr.io/my-project/*"
Security Checklist
Use this as a baseline assessment for any Kubernetes cluster:
Cluster Configuration
- [ ] RBAC is enabled and system:anonymous has no dangerous bindings
- [ ] API server audit logging is configured and logs are shipped to a SIEM
- [ ] etcd is encrypted at rest and access is restricted to the API server
- [ ] The API server's insecure port is disabled
- [ ] Node-to-control-plane communication uses TLS
- [ ] kubelet authentication is enabled (--anonymous-auth=false on the kubelet)
Workload Security
- [ ] Pod Security Standards are enforced (at least baseline, preferably restricted)
- [ ] All containers run as non-root
- [ ] All containers drop ALL capabilities and only add back what's needed
- [ ] Resource requests and limits are set on all containers
- [ ] Service accounts use automountServiceAccountToken: false unless needed
- [ ] Images use digests or are verified with signatures
Network Security
- [ ] Default-deny Network Policies exist in all production namespaces
- [ ] DNS egress is explicitly allowed (not open egress to all)
- [ ] External access is controlled through Ingress with TLS
- [ ] The Kubernetes API is not publicly accessible
Secrets
- [ ] etcd encryption is enabled for Secrets
- [ ] Production secrets are managed through an external secret store
- [ ] Secrets are mounted as volumes, not environment variables
- [ ] Secret access is audited
Supply Chain
- [ ] Images come from private/approved registries only
- [ ] Vulnerability scanning runs in CI/CD and on running clusters
- [ ] Image signing and verification is enforced
- [ ] Base images are regularly updated
This checklist isn't exhaustive, but covering these items puts you ahead of the vast majority of clusters in the wild. Security is layered — no single control is sufficient, but together they create a defense in depth that makes compromise significantly harder.