What Is the Kubernetes Logging Architecture?

TL;DR

Kubernetes does not provide a built-in log aggregation solution. Container logs are written to stdout/stderr, captured by the container runtime, and stored on the node. You need a logging agent (Fluentd, Fluent Bit, Vector) to ship logs to a centralized backend like Elasticsearch or Loki.

Detailed Answer

Kubernetes does not include a cluster-wide logging solution. It provides a basic framework where container runtimes capture stdout/stderr, but the responsibility for aggregating, storing, and querying logs falls on the operator.

How Container Logs Work

Application → stdout/stderr → Container Runtime → Node Filesystem
                                                    /var/log/containers/
                                                    /var/log/pods/

The container runtime (containerd, CRI-O) captures each container's standard output and writes it to log files on the node:

/var/log/containers/<pod>_<namespace>_<container>-<hash>.log
/var/log/pods/<namespace>_<pod>_<uid>/<container>/0.log

kubectl logs

# View current logs
kubectl logs my-pod

# Follow logs in real-time
kubectl logs my-pod -f

# View logs from a specific container
kubectl logs my-pod -c sidecar

# View previous container's logs (after crash)
kubectl logs my-pod --previous

# View logs with timestamps
kubectl logs my-pod --timestamps=true

# Tail last 100 lines
kubectl logs my-pod --tail=100

# Logs from the last hour
kubectl logs my-pod --since=1h

Log Rotation

The kubelet manages log rotation to prevent disk exhaustion:

# kubelet configuration
containerLogMaxSize: "10Mi"     # Rotate when file exceeds 10MB
containerLogMaxFiles: 5         # Keep 5 rotated files
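These two fields belong to the kubelet's KubeletConfiguration file. A fuller sketch for context (the file path varies by installation; values here are illustrative):

```yaml
# /var/lib/kubelet/config.yaml (path varies by distribution)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "10Mi"   # rotate when a log file exceeds 10 MiB
containerLogMaxFiles: 5       # keep at most 5 rotated files per container
```

Note that this rotation applies per container; a chatty container can still only consume containerLogMaxSize × containerLogMaxFiles of node disk.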

Logging Architectures

1. Node-Level Logging Agent (Recommended)

Deploy a DaemonSet that reads log files from every node:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:3.0
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              # Only needed with the Docker runtime; containerd and CRI-O
              # write logs under /var/log/pods (covered by /var/log above)
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: config
              mountPath: /fluent-bit/etc/
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers
        - name: config
          configMap:
            name: fluent-bit-config

2. Sidecar Container

For applications that write logs to files instead of stdout:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    - name: app
      image: legacy-app:1.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
    - name: log-forwarder
      image: fluent/fluent-bit:3.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
      resources:
        requests:
          cpu: "50m"
          memory: "64Mi"
  volumes:
    - name: logs
      emptyDir: {}

3. Application-Level Logging

Applications send logs directly to the backend (not recommended for most cases):

Application → HTTP/gRPC → Elasticsearch/Loki

This avoids node-level agents but couples the application to the logging infrastructure.
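As a sketch of this pattern, an application could push lines straight to Loki's push API (`/loki/api/v1/push`, which expects nanosecond-precision timestamps). The service URL and labels here are assumptions for illustration:

```python
import json
import time
import urllib.request

# Assumed in-cluster service address for illustration
LOKI_URL = "http://loki.logging.svc:3100/loki/api/v1/push"

def build_loki_payload(labels: dict, line: str) -> dict:
    """Build a Loki push payload for a single log line."""
    ts_ns = str(time.time_ns())  # Loki wants the timestamp as a string of nanoseconds
    return {"streams": [{"stream": labels, "values": [[ts_ns, line]]}]}

def push_log(labels: dict, line: str) -> None:
    """POST one log line to Loki; raises on non-2xx responses."""
    payload = json.dumps(build_loki_payload(labels, line)).encode()
    req = urllib.request.Request(
        LOKI_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=5)
```

This illustrates the coupling problem: every service now depends on the backend's URL, schema, and availability, which is why the node-level agent is usually preferred.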

Popular Logging Stacks

| Stack | Components | Best For |
|-------|-----------|----------|
| EFK | Elasticsearch + Fluent Bit + Kibana | Full-text search, rich querying |
| ELK | Elasticsearch + Logstash + Kibana | Complex log transformation |
| PLG | Promtail + Loki + Grafana | Cost-effective, label-based queries |
| Vector + ClickHouse | Vector + ClickHouse + Grafana | High-volume, analytical queries |

Fluent Bit Configuration

# fluent-bit-config ConfigMap
[SERVICE]
    Flush         5
    Log_Level     info
    Parsers_File  parsers.conf

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Parser            cri
    Tag               kube.*
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Parser  On

[OUTPUT]
    Name            es
    Match           *
    Host            elasticsearch.logging
    Port            9200
    Index           kubernetes-logs
    Type            _doc
    # For Elasticsearch 8+, which removed mapping types, set
    # Suppress_Type_Name On instead of Type

Loki-Based Stack (Lightweight Alternative)

helm install loki grafana/loki-stack \
  --namespace logging --create-namespace \
  --set promtail.enabled=true \
  --set grafana.enabled=true

Loki indexes only labels (namespace, Pod name, container), making it much cheaper than Elasticsearch for log storage. The trade-off is less powerful full-text search.
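Queries then combine label selectors with line filters and parsers in LogQL; for example (label and field names here are illustrative):

```logql
{namespace="production", app="checkout"} |= "error" | json | duration_ms > 1000
```

The label selector is resolved against the index; the line filter and JSON parsing happen at query time over the matching log streams, which is what keeps storage cheap.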

Structured Logging

For effective log querying, use structured (JSON) logging:

{"level":"info","ts":"2026-03-19T10:30:00Z","msg":"Request processed","method":"POST","path":"/api/orders","status":201,"duration_ms":45,"user_id":"12345"}

Configure Fluent Bit to parse JSON logs:

[FILTER]
    Name         parser
    Match        kube.*
    Key_Name     log
    Parser       json
    Reserve_Data On
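On the producing side, emitting such JSON lines needs no extra infrastructure. A minimal Python sketch using only the standard library (field names mirror the example above and are assumptions, not a fixed schema):

```python
import json
import logging
import sys
import time

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    EXTRA_FIELDS = ("method", "path", "status", "duration_ms", "user_id")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname.lower(),
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "msg": record.getMessage(),
        }
        # Merge optional fields passed via logger.info(..., extra={...})
        for key in self.EXTRA_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

# Log to stdout so the container runtime captures it
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("Request processed",
         extra={"method": "POST", "path": "/api/orders",
                "status": 201, "duration_ms": 45, "user_id": "12345"})
```

Writing one JSON object per line (no multi-line pretty-printing) is what lets the Fluent Bit `json` parser above pick the fields apart.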

Best Practices

  1. Log to stdout/stderr — not to files inside the container
  2. Use structured logging (JSON) — enables field-level querying
  3. Include request IDs — for distributed tracing correlation
  4. Set log levels appropriately — debug in dev, info/warn in production
  5. Implement log retention policies — 7-30 days for most workloads
  6. Monitor the logging pipeline — a broken log shipper means silent failures
  7. Use node-level agents (DaemonSet) — more efficient than sidecars for most cases

Why Interviewers Ask This

Logging is the foundation of observability. This question tests whether you understand how logs flow through the system and how to implement centralized logging for production clusters.

Common Follow-Up Questions

Where does kubectl logs get its data from?

kubectl logs reads from the container runtime's log files on the node (typically /var/log/containers/). The API server proxies the request to the kubelet on the Pod's node, which reads the file and streams it back.

What happens to logs when a Pod is deleted?

Logs are deleted with the Pod. This is why centralized logging is essential — without it, logs from crashed or scaled-down Pods are lost permanently.

What is the sidecar logging pattern?

A sidecar container reads application logs from a shared volume and forwards them to a logging backend. This is useful when the application writes logs to files instead of stdout.

Key Takeaways

  • Kubernetes captures stdout/stderr from containers — applications should log to stdout, not files.
  • Logs are ephemeral — they are lost when Pods are deleted unless shipped to a centralized backend.
  • Use a DaemonSet-based logging agent (Fluent Bit, Vector) for efficient, node-level log collection.
