What Is the Kubernetes Logging Architecture?

TL;DR

Kubernetes does not provide a built-in log aggregation solution. Container logs are written to stdout/stderr, captured by the container runtime, and stored on the node. You need a logging agent (Fluentd, Fluent Bit, Vector) to ship logs to a centralized backend like Elasticsearch or Loki.

Detailed Answer

Kubernetes does not include a cluster-wide logging solution. It provides a basic framework where container runtimes capture stdout/stderr, but the responsibility for aggregating, storing, and querying logs falls on the operator.

How Container Logs Work

Application → stdout/stderr → Container Runtime → Node Filesystem
                                                    /var/log/containers/
                                                    /var/log/pods/

The container runtime (containerd, CRI-O) captures each container's standard output and writes it to log files on the node:

/var/log/containers/<pod>_<namespace>_<container>-<hash>.log
/var/log/pods/<namespace>_<pod>_<uid>/<container>/0.log

kubectl logs

# View current logs
kubectl logs my-pod

# Follow logs in real-time
kubectl logs my-pod -f

# View logs from a specific container
kubectl logs my-pod -c sidecar

# View previous container's logs (after crash)
kubectl logs my-pod --previous

# View logs with timestamps
kubectl logs my-pod --timestamps=true

# Tail last 100 lines
kubectl logs my-pod --tail=100

# Logs from the last hour
kubectl logs my-pod --since=1h

Log Rotation

The kubelet manages log rotation to prevent disk exhaustion:

# kubelet configuration
containerLogMaxSize: "10Mi"     # Rotate when file exceeds 10MB
containerLogMaxFiles: 5         # Keep 5 rotated files
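These two fields belong to the kubelet's KubeletConfiguration file. A fuller sketch for context (the file path varies by installation; values here are illustrative):

```yaml
# /var/lib/kubelet/config.yaml (path varies by distribution)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "10Mi"   # rotate when a log file exceeds 10 MiB
containerLogMaxFiles: 5       # keep at most 5 rotated files per container
```

Note that this rotation applies per container; a chatty container can still only consume containerLogMaxSize × containerLogMaxFiles of node disk.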

Logging Architectures

1. Node-Level Logging Agent (Recommended)

Deploy a DaemonSet that reads log files from every node:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:3.0
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              # Only needed with the Docker runtime; containerd and CRI-O
              # write logs under /var/log/pods (covered by /var/log above)
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: config
              mountPath: /fluent-bit/etc/
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers
        - name: config
          configMap:
            name: fluent-bit-config

2. Sidecar Container

For applications that write logs to files instead of stdout:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    - name: app
      image: legacy-app:1.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
    - name: log-forwarder
      image: fluent/fluent-bit:3.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
      resources:
        requests:
          cpu: "50m"
          memory: "64Mi"
  volumes:
    - name: logs
      emptyDir: {}

3. Application-Level Logging

Applications send logs directly to the backend (not recommended for most cases):

Application → HTTP/gRPC → Elasticsearch/Loki

This avoids node-level agents but couples the application to the logging infrastructure.
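As a sketch of this pattern, an application could push lines straight to Loki's push API (`/loki/api/v1/push`, which expects nanosecond-precision timestamps). The service URL and labels here are assumptions for illustration:

```python
import json
import time
import urllib.request

# Assumed in-cluster service address for illustration
LOKI_URL = "http://loki.logging.svc:3100/loki/api/v1/push"

def build_loki_payload(labels: dict, line: str) -> dict:
    """Build a Loki push payload for a single log line."""
    ts_ns = str(time.time_ns())  # Loki wants the timestamp as a string of nanoseconds
    return {"streams": [{"stream": labels, "values": [[ts_ns, line]]}]}

def push_log(labels: dict, line: str) -> None:
    """POST one log line to Loki; raises on non-2xx responses."""
    payload = json.dumps(build_loki_payload(labels, line)).encode()
    req = urllib.request.Request(
        LOKI_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=5)
```

This illustrates the coupling problem: every service now depends on the backend's URL, schema, and availability, which is why the node-level agent is usually preferred.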

Popular Logging Stacks

| Stack | Components | Best For |
|-------|-----------|----------|
| EFK | Elasticsearch + Fluent Bit + Kibana | Full-text search, rich querying |
| ELK | Elasticsearch + Logstash + Kibana | Complex log transformation |
| PLG | Promtail + Loki + Grafana | Cost-effective, label-based queries |
| Vector + ClickHouse | Vector + ClickHouse + Grafana | High-volume, analytical queries |

Fluent Bit Configuration

# fluent-bit-config ConfigMap
[SERVICE]
    Flush         5
    Log_Level     info
    Parsers_File  parsers.conf

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Parser            cri
    Tag               kube.*
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Parser  On

[OUTPUT]
    Name            es
    Match           *
    Host            elasticsearch.logging
    Port            9200
    Index           kubernetes-logs
    Type            _doc
    # For Elasticsearch 8+, which removed mapping types, set
    # Suppress_Type_Name On instead of Type

Loki-Based Stack (Lightweight Alternative)

helm install loki grafana/loki-stack \
  --namespace logging --create-namespace \
  --set promtail.enabled=true \
  --set grafana.enabled=true

Loki indexes only labels (namespace, Pod name, container), making it much cheaper than Elasticsearch for log storage. The trade-off is less powerful full-text search.
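Queries then combine label selectors with line filters and parsers in LogQL; for example (label and field names here are illustrative):

```logql
{namespace="production", app="checkout"} |= "error" | json | duration_ms > 1000
```

The label selector is resolved against the index; the line filter and JSON parsing happen at query time over the matching log streams, which is what keeps storage cheap.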

Structured Logging

For effective log querying, use structured (JSON) logging:

{"level":"info","ts":"2026-03-19T10:30:00Z","msg":"Request processed","method":"POST","path":"/api/orders","status":201,"duration_ms":45,"user_id":"12345"}

Configure Fluent Bit to parse JSON logs:

[FILTER]
    Name         parser
    Match        kube.*
    Key_Name     log
    Parser       json
    Reserve_Data On
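On the producing side, emitting such JSON lines needs no extra infrastructure. A minimal Python sketch using only the standard library (field names mirror the example above and are assumptions, not a fixed schema):

```python
import json
import logging
import sys
import time

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    EXTRA_FIELDS = ("method", "path", "status", "duration_ms", "user_id")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname.lower(),
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "msg": record.getMessage(),
        }
        # Merge optional fields passed via logger.info(..., extra={...})
        for key in self.EXTRA_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

# Log to stdout so the container runtime captures it
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("Request processed",
         extra={"method": "POST", "path": "/api/orders",
                "status": 201, "duration_ms": 45, "user_id": "12345"})
```

Writing one JSON object per line (no multi-line pretty-printing) is what lets the Fluent Bit `json` parser above pick the fields apart.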

Best Practices

  1. Log to stdout/stderr — not to files inside the container
  2. Use structured logging (JSON) — enables field-level querying
  3. Include request IDs — for distributed tracing correlation
  4. Set log levels appropriately — debug in dev, info/warn in production
  5. Implement log retention policies — 7-30 days for most workloads
  6. Monitor the logging pipeline — a broken log shipper means silent failures
  7. Use node-level agents (DaemonSet) — more efficient than sidecars for most cases

Why Interviewers Ask This

Logging is the foundation of observability. This question tests whether you understand how logs flow through the system and how to implement centralized logging for production clusters.

Common Follow-Up Questions

Where does kubectl logs get its data from?

kubectl logs reads from the container runtime's log files on the node (typically /var/log/containers/). The API server proxies the request to the kubelet on the Pod's node, which reads the file and streams it back.

What happens to logs when a Pod is deleted?

Logs are deleted with the Pod. This is why centralized logging is essential — without it, logs from crashed or scaled-down Pods are lost permanently.

What is the sidecar logging pattern?

A sidecar container reads application logs from a shared volume and forwards them to a logging backend. This is useful when the application writes logs to files instead of stdout.

Key Takeaways

  • Kubernetes captures stdout/stderr from containers — applications should log to stdout, not files.
  • Logs are ephemeral — they are lost when Pods are deleted unless shipped to a centralized backend.
  • Use a DaemonSet-based logging agent (Fluent Bit, Vector) for efficient, node-level log collection.
