How Does Kubernetes Audit Logging Work?

intermediate | security, devops, sre, platform engineer, CKA
TL;DR

Kubernetes audit logging records requests to the API server, providing a chronological record of who did what, when, and to which resources. An audit policy defines which events are logged and at what detail level, and logs can be written to a file or sent to a webhook backend.

Detailed Answer

Kubernetes audit logging provides a security-relevant, chronological record of every action taken against the Kubernetes API. It answers the questions: Who did what? When? On which resource? And what was the outcome?

Audit Event Structure

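A typical audit event, here logged at the RequestResponse level, looks like this. The requestObject field is recorded only at the Request level and above, and responseObject only at RequestResponse: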

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "auditID": "unique-event-id",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/production/pods",
  "verb": "create",
  "user": {
    "username": "jane@example.com",
    "groups": ["dev-team", "system:authenticated"],
    "uid": "jane-uid"
  },
  "sourceIPs": ["10.0.1.50"],
  "userAgent": "kubectl/v1.30.0",
  "objectRef": {
    "resource": "pods",
    "namespace": "production",
    "name": "my-pod",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 201
  },
  "requestObject": { "..." },
  "responseObject": { "..." },
  "requestReceivedTimestamp": "2026-03-19T10:30:00.000000Z",
  "stageTimestamp": "2026-03-19T10:30:00.050000Z"
}
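To make the "who did what, when, and with what outcome" mapping concrete, here is a minimal Python sketch that reduces an event to those fields. The `summarize` helper and the trimmed `SAMPLE_EVENT` are illustrative names, not part of any Kubernetes client library, and the sketch assumes that fields such as `objectRef` and `responseStatus` may be missing on some events.

```python
import json

# Trimmed copy of the sample event above (request/response bodies omitted).
SAMPLE_EVENT = """
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "stage": "ResponseComplete",
  "verb": "create",
  "user": {"username": "jane@example.com", "groups": ["dev-team", "system:authenticated"]},
  "objectRef": {"resource": "pods", "namespace": "production", "name": "my-pod", "apiVersion": "v1"},
  "responseStatus": {"metadata": {}, "code": 201},
  "requestReceivedTimestamp": "2026-03-19T10:30:00.000000Z"
}
"""


def summarize(event: dict) -> dict:
    """Reduce one audit event to who / what / when / outcome."""
    ref = event.get("objectRef") or {}          # may be absent, e.g. non-resource URLs
    status = event.get("responseStatus") or {}  # may be absent on early stages
    return {
        "who": (event.get("user") or {}).get("username"),
        "verb": event.get("verb"),
        "resource": ref.get("resource"),
        "namespace": ref.get("namespace"),
        "name": ref.get("name"),
        "when": event.get("requestReceivedTimestamp"),
        "code": status.get("code"),
    }


summary = summarize(json.loads(SAMPLE_EVENT))
print(summary)
```

Using `.get` throughout keeps the helper tolerant of events logged at lower levels or earlier stages, where some fields are not populated.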

Audit Policy Design

A well-designed audit policy balances security needs with log volume:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Don't log requests to non-resource URLs (healthz, livez)
  - level: None
    nonResourceURLs:
      - /healthz*
      - /livez*
      - /readyz*
      - /metrics

  # Don't log watch requests (very high volume). Rules are first-match, so this
  # also skips watches on Secrets and RBAC objects.
  - level: None
    verbs: ["watch"]

  # Don't log kube-proxy or system component noise
  - level: None
    users:
      - "system:kube-proxy"
      - "system:apiserver"
    verbs: ["get"]

  # Log RBAC changes with full request and response
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources:
          - roles
          - rolebindings
          - clusterroles
          - clusterrolebindings

  # Log secret access (metadata only — never log secret values)
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # Log Pod exec and attach (privilege escalation vectors)
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach", "pods/portforward"]

  # Log all create/update/delete with request body
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    omitStages:
      - RequestReceived

  # Default: log metadata for everything else
  - level: Metadata
    omitStages:
      - RequestReceived
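
Rule order matters because the first matching rule decides the audit level. The Python sketch below is a simplified model of that first-match behaviour, not the API server's actual implementation: the `RULES` list mirrors only part of the policy above, requests are represented as plain dicts with hypothetical keys (`verb`, `path`, `group`, `resource`), and falling back to `None` when nothing matches is an assumption of this sketch.

```python
from fnmatch import fnmatchcase

# Simplified, partial mirror of the policy above (illustrative only).
RULES = [
    {"level": "None", "nonResourceURLs": ["/healthz*", "/livez*", "/readyz*", "/metrics"]},
    {"level": "None", "verbs": ["watch"]},
    {"level": "RequestResponse", "resources": [
        ("rbac.authorization.k8s.io", "roles"),
        ("rbac.authorization.k8s.io", "rolebindings"),
        ("rbac.authorization.k8s.io", "clusterroles"),
        ("rbac.authorization.k8s.io", "clusterrolebindings"),
    ]},
    {"level": "Metadata", "resources": [("", "secrets")]},
    {"level": "Request", "verbs": ["create", "update", "patch", "delete"]},
    {"level": "Metadata"},  # catch-all default
]


def rule_matches(rule: dict, req: dict) -> bool:
    """A rule matches only if every field it specifies matches the request."""
    if "verbs" in rule and req.get("verb") not in rule["verbs"]:
        return False
    if "nonResourceURLs" in rule:
        path = req.get("path")
        if path is None or not any(fnmatchcase(path, p) for p in rule["nonResourceURLs"]):
            return False
    if "resources" in rule:
        if (req.get("group"), req.get("resource")) not in rule["resources"]:
            return False
    return True


def audit_level(req: dict, rules=RULES) -> str:
    """Return the level of the first matching rule."""
    for rule in rules:
        if rule_matches(rule, req):
            return rule["level"]
    return "None"  # assumption: unmatched requests are not logged


print(audit_level({"verb": "watch", "group": "", "resource": "secrets"}))
print(audit_level({"verb": "get", "group": "", "resource": "secrets"}))
```

In this model a `watch` on Secrets is dropped by the earlier `watch` rule before the Secrets rule is ever consulted, which is the practical reason to place high-priority rules near the top of a policy.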

Enabling Audit Logging

File Backend

# kube-apiserver flags
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit.log
--audit-log-maxage=30        # Days to retain
--audit-log-maxbackup=10     # Number of backup files
--audit-log-maxsize=100      # MB per file

Webhook Backend

# kube-apiserver flags
--audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
--audit-webhook-batch-max-size=100
--audit-webhook-batch-max-wait=5s

# audit-webhook.yaml
apiVersion: v1
kind: Config
clusters:
  - name: audit-webhook
    cluster:
      server: https://audit-collector.monitoring:8443/audit
      certificate-authority: /etc/kubernetes/pki/audit-ca.crt
contexts:
  - name: audit-webhook
    context:
      cluster: audit-webhook
current-context: audit-webhook
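
On the receiving side, the webhook backend POSTs batches of events as a JSON `EventList`. The following is a minimal receiver sketch using only the Python standard library. The names `extract_events`, `AuditHandler`, and `serve` are hypothetical, and the server speaks plain HTTP, so it assumes TLS is terminated in front of it to match the `https://` URL in the config above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_events(body: bytes) -> list:
    """Return the audit events carried in one webhook POST body."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or payload.get("kind") != "EventList":
        raise ValueError("expected an audit.k8s.io EventList")
    return payload.get("items", [])


class AuditHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            events = extract_events(self.rfile.read(length))
        except ValueError:  # also covers json.JSONDecodeError
            self.send_response(400)
            self.end_headers()
            return
        for event in events:
            # Placeholder: replace with real storage or forwarding.
            print(event.get("auditID"), event.get("verb"), event.get("requestURI"))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8443) -> None:
    """Hypothetical entry point; blocks until interrupted."""
    HTTPServer(("", port), AuditHandler).serve_forever()
```

Calling `serve()` starts the listener. A production collector would also need authentication and durable storage, which this sketch leaves out.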

Log Shipping Architecture

API Server → Audit Log File → Fluentd/Vector → Elasticsearch/Splunk
                                                      ↓
                                               Kibana/Grafana
                                               Alerting Rules

Essential Audit Queries

# Who accessed secrets since 09:00 UTC? (adjust the timestamp to cover the last hour)
jq 'select(
  .objectRef.resource == "secrets" and
  .requestReceivedTimestamp > "2026-03-19T09:00:00Z"
) | {user: .user.username, verb: .verb, secret: .objectRef.name, ns: .objectRef.namespace}' audit.log

# What 403 Forbidden events occurred?
jq 'select(.responseStatus.code == 403) |
  {user: .user.username, verb: .verb, resource: .objectRef.resource}' audit.log

# Who exec'd into Pods?
jq 'select(.objectRef.resource == "pods" and .objectRef.subresource == "exec") |
  {user: .user.username, pod: .objectRef.name, ns: .objectRef.namespace}' audit.log

# Track RBAC modifications
jq 'select(
  .objectRef.apiGroup == "rbac.authorization.k8s.io" and
  .verb != "get" and .verb != "list" and .verb != "watch"
)' audit.log
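
When the same filters need to run inside a script or an alerting job, they can be expressed in Python. This is a rough equivalent of three of the jq queries, assuming the file backend's default JSON-lines output. The helper names and the `SAMPLE_LOG` lines are made up for illustration, and the time filter uses the same string comparison on timestamps as the jq version.

```python
import json

# Made-up sample lines in JSON-lines format; the last one is deliberately truncated.
SAMPLE_LOG = [
    '{"verb": "get", "user": {"username": "jane@example.com"}, '
    '"objectRef": {"resource": "secrets", "namespace": "production", "name": "db-creds"}, '
    '"responseStatus": {"code": 200}, "requestReceivedTimestamp": "2026-03-19T09:15:00.000000Z"}',
    '{"verb": "create", "user": {"username": "bob@example.com"}, '
    '"objectRef": {"resource": "pods", "subresource": "exec", "namespace": "production", "name": "my-pod"}, '
    '"responseStatus": {"code": 403}, "requestReceivedTimestamp": "2026-03-19T09:20:00.000000Z"}',
    '{"verb": "get", "truncated',
]


def load_events(lines):
    """Parse audit log lines, skipping blank or partially written ones."""
    events = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            events.append(event)
    return events


def _ref(event):
    return event.get("objectRef") or {}


def secret_access(events, since):
    """Secret reads and writes received after the ISO timestamp `since`."""
    return [
        {"user": (e.get("user") or {}).get("username"), "verb": e.get("verb"),
         "secret": _ref(e).get("name"), "ns": _ref(e).get("namespace")}
        for e in events
        if _ref(e).get("resource") == "secrets"
        and e.get("requestReceivedTimestamp", "") > since
    ]


def forbidden(events):
    """Events whose response code was 403."""
    return [e for e in events if (e.get("responseStatus") or {}).get("code") == 403]


def pod_execs(events):
    """Events targeting the pods/exec subresource."""
    return [e for e in events
            if _ref(e).get("resource") == "pods" and _ref(e).get("subresource") == "exec"]


events = load_events(SAMPLE_LOG)
print(secret_access(events, "2026-03-19T09:00:00Z"))
```

Against a real log, `load_events(open("audit.log"))` would replace the sample list.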

Compliance Mappings

| Compliance Framework | Audit Requirement | Kubernetes Mapping |
|----------------------|-------------------|--------------------|
| SOC 2 | Access logging | All API requests |
| PCI DSS | Track access to cardholder data | Secret access, namespace-scoped logs |
| HIPAA | Audit trail for PHI access | Pod exec, secret access, RBAC changes |
| CIS Benchmark | API server audit enabled | File or webhook backend configured |

Performance Considerations

Audit logging adds overhead to the API server. Mitigate with:

  1. Selective logging: Use None level for high-volume, low-value events
  2. Batch webhook: Configure batching to reduce HTTP calls
  3. Separate disk: Write audit logs to a dedicated disk to avoid competing with etcd I/O
  4. Log rotation: Set maxage, maxbackup, and maxsize to prevent disk exhaustion
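
For the log rotation point, a rough disk budget can be worked out from the flags. The sketch below assumes uncompressed files and that the active log plus every retained backup can each reach the size limit; `max_audit_disk_mb` is a hypothetical helper, and the result is an approximate upper bound rather than an exact figure.

```python
def max_audit_disk_mb(maxsize_mb: int, maxbackup: int) -> int:
    """Approximate worst case: one active file plus `maxbackup` rotated files."""
    return maxsize_mb * (maxbackup + 1)


# Values from the file backend flags shown earlier: maxsize=100, maxbackup=10.
budget = max_audit_disk_mb(100, 10)
print(budget)  # 1100
```

With the example flags this comes to roughly 1.1 GB per API server, which is a useful lower limit when sizing a dedicated audit disk.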

Managed Kubernetes Audit Logging

| Provider | How to Enable | Log Destination |
|----------|---------------|-----------------|
| EKS | Control plane logging (audit log type) | CloudWatch Logs |
| GKE | Enabled by default | Cloud Logging |
| AKS | Diagnostic settings | Azure Monitor / Log Analytics |

# EKS: Enable audit logs
aws eks update-cluster-config \
  --name my-cluster \
  --logging '{"clusterLogging":[{"types":["audit"],"enabled":true}]}'

Why Interviewers Ask This

Audit logging is a compliance requirement for most regulated environments and is essential for security incident investigation. This question tests your ability to implement and analyze audit trails.

Common Follow-Up Questions

What are the stages of an audit event?
RequestReceived (request arrives), ResponseStarted (response headers sent, long-running only), ResponseComplete (response body sent), and Panic (panic occurred).
How do you avoid audit log volume explosion?
Use the audit policy to skip high-volume, low-risk events (watch requests, health checks) and log only metadata for most resources. Use RequestResponse only for sensitive operations.
How do you ship audit logs to an external system?
Use the webhook backend to send events to an HTTP endpoint, or use a log shipper (Fluentd, Vector) to forward file-based audit logs to Elasticsearch, Splunk, or CloudWatch.

Key Takeaways

  • Audit policies control what is logged — too broad means noise, too narrow means missed events.
  • Always log RBAC changes and Pod exec at Request or RequestResponse level, and keep secret access at Metadata so secret values never reach the log.
  • Ship audit logs to an external system for retention, searchability, and alerting.
