How Do You Troubleshoot RBAC Permission Denials in Kubernetes?
RBAC troubleshooting follows a systematic approach: verify the denial with kubectl auth can-i, check existing bindings and roles, inspect audit logs for the exact request, and fix the gap by creating or updating the appropriate Role and Binding.
Detailed Answer
When a user or ServiceAccount gets a 403 Forbidden response from the Kubernetes API server, the cause is almost always a missing or misconfigured RBAC binding. Here is a systematic approach to diagnose and fix the issue.
Step 1: Reproduce and Confirm the Denial
# Confirm the specific denial
kubectl auth can-i create deployments --as=jane -n production
# no
# Get more detail — list all permissions the subject has
kubectl auth can-i --list --as=jane -n production
If --list shows the permission you expect, the issue may not be RBAC. Check admission controllers or webhook configurations.
Step 2: Identify the Subject's Identity
A common cause of RBAC failures is an identity mismatch: the user or ServiceAccount name in the binding does not match what the API server actually sees.
# Check your current identity
kubectl auth whoami # Kubernetes 1.27+
# For older versions, check the kubeconfig context
kubectl config view --minify -o jsonpath='{.contexts[0].context.user}'
# For ServiceAccounts, check the Pod spec
kubectl get pod my-pod -n production \
-o jsonpath='{.spec.serviceAccountName}'
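ServiceAccounts authenticate to the API server under the full name system:serviceaccount:&lt;namespace&gt;:&lt;name&gt;, so a binding or an --as flag that uses only the short name will not match. A quick sketch of building that identity string (the namespace and account name here are illustrative):

```shell
# ServiceAccounts appear to the API server as
# system:serviceaccount:<namespace>:<name>.
# Build that identity for use with `kubectl auth can-i --as=...`
NS="production"
SA="deployer"
SA_USER="system:serviceaccount:${NS}:${SA}"
echo "$SA_USER"
# system:serviceaccount:production:deployer
```

You would then test the account's permissions with `kubectl auth can-i create deployments --as="$SA_USER" -n "$NS"`.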
For certificate-based auth, the username is the certificate's Common Name (CN) and groups come from the Organization (O) field:
# Inspect a client certificate
openssl x509 -in client.crt -noout -subject
# subject=O = dev-team, CN = jane
If the certificate says CN = jane but the RoleBinding references jane.doe, the binding will not match.
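To see this mapping concretely, you can generate a throwaway self-signed certificate with the same subject fields and inspect it. This is only a local demonstration; the file paths are illustrative and the cert is not trusted by any cluster:

```shell
# Generate a throwaway self-signed client cert whose subject mimics the
# fields Kubernetes reads: O= becomes the group, CN= the username.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/demo.key -out /tmp/demo.crt \
  -subj "/O=dev-team/CN=jane" 2>/dev/null

# The subject line is what the API server matches bindings against
openssl x509 -in /tmp/demo.crt -noout -subject
```

The subject printed here must line up, character for character, with the subject names in your bindings.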
Step 3: Check Existing Bindings
# Check namespace-scoped bindings
kubectl get rolebindings -n production -o wide
# Check cluster-scoped bindings
kubectl get clusterrolebindings -o wide | grep jane
# Detailed view of a specific binding
kubectl describe rolebinding dev-access -n production
Look for:
- Subject mismatch: Does the binding reference the correct user/group/SA name?
- Namespace mismatch: Is the RoleBinding in the correct namespace?
- Role reference: Does the roleRef point to a Role that actually exists?
Step 4: Inspect the Referenced Role
# Check the Role's rules
kubectl describe role deployment-manager -n production
# Or for ClusterRoles
kubectl describe clusterrole deployment-manager
Verify that the Role includes:
- The correct apiGroup (empty string "" for core resources, "apps" for Deployments, etc.)
- The correct resource name (including subresources such as pods/log)
- The correct verbs (get, list, create, update, patch, delete, watch)
Step 5: Check Audit Logs
API audit logs provide the definitive record. Look for 403 responses:
# On a kubeadm cluster with audit logging enabled, logs are typically at:
# /var/log/kubernetes/audit/audit.log (the path depends on the audit policy configuration)
# Search for denials for a specific user
grep '"user":{"username":"jane"' /var/log/kubernetes/audit/audit.log | \
grep '"code":403'
A typical audit log entry for a denial:
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/production/secrets",
"verb": "list",
"user": {
"username": "jane",
"groups": ["dev-team", "system:authenticated"]
},
"responseStatus": {
"code": 403,
"reason": "Forbidden"
},
"objectRef": {
"resource": "secrets",
"namespace": "production",
"apiGroup": "",
"apiVersion": "v1"
}
}
This tells you exactly: user jane tried to list secrets in production and was denied.
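For large logs, jq can condense each denial into a single line. This sketch pipes in a sample event matching the one above; in practice you would feed the audit log file, and jq must be installed:

```shell
# Condense each 403 audit event to "user verb resource in namespace".
# A sample event is piped in here; in practice, read the audit log file.
echo '{"verb":"list","user":{"username":"jane"},
       "responseStatus":{"code":403},
       "objectRef":{"resource":"secrets","namespace":"production"}}' |
jq -r 'select(.responseStatus.code == 403)
       | "\(.user.username) \(.verb) \(.objectRef.resource) in \(.objectRef.namespace)"'
# jane list secrets in production
```

Running the same filter over the real log (`jq -r '...' /var/log/kubernetes/audit/audit.log`) gives a quick denial summary per user.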
Common Root Causes and Fixes
1. Wrong apiGroup in the Role
# WRONG — Deployments are in the "apps" group, not core
rules:
- apiGroups: [""]
resources: ["deployments"]
verbs: ["get", "list"]
# CORRECT
rules:
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["get", "list"]
2. Missing subresource
# User can get pods but cannot view logs
# MISSING: pods/log subresource
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
# FIX: add the subresource
rules:
- apiGroups: [""]
resources: ["pods", "pods/log"]
verbs: ["get", "list"]
3. RoleBinding in the wrong namespace
# The user needs access to 'production' but the binding is in 'staging'
kubectl get rolebinding dev-access -n staging
# Found! But it's in the wrong namespace.
# Fix: create the binding in the correct namespace
kubectl create rolebinding dev-access \
--role=developer-role \
--user=jane \
-n production
4. Role does not exist
# The binding references a Role that was deleted or never created
kubectl describe rolebinding dev-access -n production
# Role: developer-role
kubectl get role developer-role -n production
# Error from server (NotFound)
5. ServiceAccount namespace mismatch in binding subjects
# WRONG — namespace field missing or incorrect
subjects:
- kind: ServiceAccount
name: deployer
# CORRECT — namespace is required for ServiceAccount subjects
subjects:
- kind: ServiceAccount
name: deployer
namespace: production
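Putting the pieces together, a complete Role and RoleBinding pair that avoids all five pitfalls above might look like this. The names reuse the examples from earlier in this answer and are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-role
  namespace: production        # namespace-scoped, in the right namespace
rules:
- apiGroups: ["apps"]          # Deployments live in "apps", not core
  resources: ["deployments"]
  verbs: ["get", "list", "create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-access
  namespace: production        # binding in the same namespace as the access
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: developer-role         # must reference a Role that exists
subjects:
- kind: User
  name: jane                   # must match the identity the API server sees
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: deployer
  namespace: production        # required for ServiceAccount subjects
```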
Troubleshooting Decision Tree
403 Forbidden
├── kubectl auth can-i confirms denial?
│ ├── YES → RBAC issue
│ │ ├── Binding exists?
│ │ │ ├── NO → Create the binding
│ │ │ └── YES → Check:
│ │ │ ├── Subject name matches identity?
│ │ │ ├── Binding in correct namespace?
│ │ │ ├── Referenced Role exists?
│ │ │ ├── Role has correct apiGroup?
│ │ │ ├── Role has correct resource (incl. subresource)?
│ │ │ └── Role has correct verb?
│ │ └── Fix the identified gap
│ └── NO (can-i says "yes" but request fails)
│ ├── Check admission webhooks
│ ├── Check PodSecurityAdmission
│ └── Check OPA/Gatekeeper policies
Useful Diagnostic Commands Reference
# Full diagnostic for a subject in a namespace
SUBJECT="jane"
NS="production"
echo "=== Permissions ==="
kubectl auth can-i --list --as="$SUBJECT" -n "$NS"
echo "=== RoleBindings ==="
kubectl get rolebindings -n "$NS" -o json | \
jq -r ".items[] | select(.subjects[]?.name==\"$SUBJECT\") | .metadata.name"
echo "=== ClusterRoleBindings ==="
kubectl get clusterrolebindings -o json | \
jq -r ".items[] | select(.subjects[]?.name==\"$SUBJECT\") | .metadata.name"
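Note that the filter above matches only subjects whose name equals the username; access granted through a Group subject (such as dev-team in the audit example) would be missed. A group-aware version of the matching logic, demonstrated here on sample binding JSON so it runs without a cluster:

```shell
# A binding grants access if any subject names the user directly or one
# of her groups. Sample RoleBinding list JSON is piped in for the demo.
USERNAME="jane"
USER_GROUP="dev-team"

echo '{"items":[
  {"metadata":{"name":"dev-access"},
   "subjects":[{"kind":"Group","name":"dev-team"}]},
  {"metadata":{"name":"ops-access"},
   "subjects":[{"kind":"User","name":"bob"}]}]}' |
jq -r --arg u "$USERNAME" --arg g "$USER_GROUP" '
  .items[]
  | select(any(.subjects[]?;
      (.kind == "User"  and .name == $u) or
      (.kind == "Group" and .name == $g)))
  | .metadata.name'
# dev-access
```

Against a live cluster, you would replace the echo with `kubectl get rolebindings -n "$NS" -o json` and pass each of the user's groups in turn.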
Why Interviewers Ask This
RBAC permission errors are among the most common issues in Kubernetes. Interviewers want to see a structured debugging methodology, not guesswork. This question separates operators who have real cluster experience from those who only know theory.
Key Takeaways
- Always start with kubectl auth can-i to confirm the denial.
- Check both RoleBindings and ClusterRoleBindings — permissions can come from either.
- API audit logs provide the definitive record of what was denied and why.
- Common causes include wrong namespace, wrong apiGroup, missing subresource, and identity mismatch.