What Are emptyDir and hostPath Volume Types in Kubernetes?

beginner | storage | devops | sre | CKA
TL;DR

emptyDir is a temporary volume created when a Pod is assigned to a node and deleted when the Pod is removed. hostPath mounts a file or directory from the host node's filesystem into a Pod. emptyDir is safe for scratch space; hostPath is risky and generally discouraged in production.

Detailed Answer

emptyDir

An emptyDir volume is created when a Pod is scheduled onto a node and exists for the lifetime of that Pod. It starts empty and all containers in the Pod can read and write to it. When the Pod is removed from the node (for any reason), the data is permanently deleted.

Basic Usage

apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /data/greeting.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /data
    - name: reader
      image: busybox
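      # Wait for the file first: the reader may start before the writer has created it.
      command: ["sh", "-c", "until [ -f /data/greeting.txt ]; do sleep 1; done; cat /data/greeting.txt && sleep 3600"]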
      volumeMounts:
        - name: scratch
          mountPath: /data
  volumes:
    - name: scratch
      emptyDir: {}

In this example, both containers mount the same emptyDir volume at /data, so a file created by the writer container is visible to the reader container. This is the standard pattern for sharing files between an application container and a sidecar.

Memory-Backed emptyDir

For workloads needing very fast I/O, you can back emptyDir with RAM:

volumes:
  - name: cache
    emptyDir:
      medium: Memory
      sizeLimit: 256Mi

Key considerations:

  • Uses tmpfs (RAM-backed filesystem)
  • Data counts against the container's memory limit
  • Data is lost on Pod eviction or node restart
  • Very fast for read/write operations
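To make the memory accounting concrete, here is a minimal sketch that places the cache volume in a complete Pod. The Pod name, image, and sizes are illustrative assumptions. The point is that the container's memory limit has to leave room for both the process itself and whatever it writes to the tmpfs mount.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tmpfs-cache-pod          # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      # Writes 64 MiB into the RAM-backed volume; those bytes count
      # toward this container's memory usage.
      command: ["sh", "-c", "dd if=/dev/zero of=/cache/blob bs=1M count=64 && sleep 3600"]
      resources:
        limits:
          memory: 512Mi          # must cover process memory plus tmpfs contents
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory
        sizeLimit: 256Mi
```

Keeping sizeLimit well below the container's memory limit, as assumed here, reduces the chance that cached files alone push the container into an out-of-memory kill.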

Size Limits

You can set a size limit to prevent a runaway process from filling the node's disk:

volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi

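If usage of the volume exceeds the limit, the kubelet evicts the Pod. The kubelet checks usage periodically rather than on every write, so a fast writer can briefly overshoot the limit before the eviction happens.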
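One way to observe this behavior is a Pod that deliberately writes more than its limit allows. The sketch below is an assumption-laden demo (hypothetical Pod name, arbitrary sizes) and is not meant for a production cluster. It writes roughly 200 MiB into a volume limited to 100 MiB.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sizelimit-demo           # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: filler
      image: busybox
      # Writes about twice the allowed amount, then idles so the
      # kubelet has time to notice the overage.
      command: ["sh", "-c", "dd if=/dev/zero of=/scratch/fill bs=1M count=200; sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 100Mi
```

After applying it, kubectl describe pod sizelimit-demo should eventually report the Pod as evicted. How long that takes depends on the kubelet's check interval.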

Common Use Cases for emptyDir

  • Scratch space: Disk-based sorting, temporary computation
  • Caching: Warm-up data, pre-processed assets
  • Sidecar communication: Sharing files between containers in a Pod (e.g., log files, config generated by an init container)
  • Build artifacts: CI/CD Pods that compile code and pass artifacts between stages
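The init-container case from the list above can be sketched as follows. The Pod name, file name, and config contents are hypothetical. Because init containers run to completion before the application containers start, the file is guaranteed to exist by the time the app reads it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-config-pod          # hypothetical name
spec:
  initContainers:
    - name: render-config
      image: busybox
      # Generates a config file into the shared emptyDir volume.
      command: ["sh", "-c", "echo 'greeting=hello' > /config/app.conf"]
      volumeMounts:
        - name: config
          mountPath: /config
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "cat /etc/app/app.conf && sleep 3600"]
      volumeMounts:
        - name: config
          mountPath: /etc/app
          readOnly: true         # the app only needs to read the generated file
  volumes:
    - name: config
      emptyDir: {}
```

The same volume can be mounted at a different path in each container, as it is here. Only the volume name has to match.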

hostPath

A hostPath volume mounts a file or directory from the host node's filesystem directly into the Pod.

Basic Usage

apiVersion: v1
kind: Pod
metadata:
  name: host-reader
spec:
  containers:
    - name: reader
      image: busybox
      command: ["sh", "-c", "ls /host-logs && sleep 3600"]
      volumeMounts:
        - name: logs
          mountPath: /host-logs
          readOnly: true
  volumes:
    - name: logs
      hostPath:
        path: /var/log
        type: Directory

hostPath Types

| Type | Behavior |
|---|---|
| "" (empty) | No checks; mount whatever is at the path |
| DirectoryOrCreate | Creates the directory if it does not exist |
| Directory | Path must exist and be a directory |
| FileOrCreate | Creates the file if it does not exist |
| File | Path must exist and be a file |
| Socket | Path must be a Unix socket |
| CharDevice | Path must be a character device |
| BlockDevice | Path must be a block device |

volumes:
  - name: docker-socket
    hostPath:
      path: /var/run/docker.sock
      type: Socket

Why hostPath Is Dangerous

hostPath volumes create significant security and reliability problems:

  1. Security risk: A Pod with hostPath access can read sensitive files like /etc/shadow, kubelet credentials, or other Pods' data.
  2. Node coupling: The Pod is tightly bound to a specific node's filesystem, breaking portability.
  3. Data inconsistency: Different nodes have different filesystems, so the same hostPath may contain different data.
  4. No cleanup: Data written by Pods persists on the node after the Pod is deleted.

The Pod Security Standards forbid hostPath volumes at both the Baseline and Restricted levels; only the Privileged level allows them.
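These standards only take effect where something enforces them. A minimal sketch, assuming the cluster uses the built-in Pod Security Admission controller and a hypothetical namespace named team-apps, is to label the namespace:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-apps                # hypothetical namespace
  labels:
    # Reject Pods that violate the Baseline standard (including hostPath volumes).
    pod-security.kubernetes.io/enforce: baseline
    # Additionally warn on anything that would fail the Restricted standard.
    pod-security.kubernetes.io/warn: restricted
```

With this label in place, a Pod in that namespace that declares a hostPath volume should be rejected at admission time. Node-level agents that genuinely need hostPath typically live in a separate namespace with a more permissive level.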

Legitimate Use Cases for hostPath

Despite its risks, hostPath is appropriate in a few scenarios:

  • DaemonSets that need access to node-level data (e.g., log collectors reading /var/log, monitoring agents reading /sys)
  • Container runtime access (e.g., mounting the Docker or containerd socket for build tools)
  • Node-level agents that need to write to the host filesystem

# Fluentd DaemonSet reading node logs
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluentd:v1.16
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
            type: Directory
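        # This path exists only on nodes that use the Docker runtime. With type: Directory,
        # the Pod fails to start on nodes where the path is missing.
        - name: containers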
          hostPath:
            path: /var/lib/docker/containers
            type: Directory

Comparison Summary

| Feature | emptyDir | hostPath |
|---|---|---|
| Lifecycle | Pod lifecycle | Persists on node |
| Scope | Pod-local | Node filesystem |
| Security | Safe | Dangerous |
| Portability | Fully portable | Node-specific |
| Use case | Scratch/cache/sidecar | DaemonSet node access |
| Pod Security | Allowed | Blocked at Baseline and Restricted |

Best Practice

Use emptyDir for any temporary or scratch data a Pod needs. Avoid hostPath unless you are writing a DaemonSet that genuinely needs node-level access, and mount it with readOnly: true whenever the workload does not need to write. For persistent data, use PersistentVolumeClaims backed by a CSI driver.
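For comparison, the persistent alternative looks like this. It is a minimal sketch: the claim name, Pod name, and storageClassName are hypothetical, and the storage class must correspond to a CSI driver actually installed in your cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                 # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard     # assumption: replace with a class from your cluster
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-pod                  # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "date >> /data/history.log && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data      # binds the Pod to the claim above
```

Unlike emptyDir, data on the claim outlives the Pod. Unlike hostPath, the Pod is not tied to whatever happens to be on one node's disk.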

Why Interviewers Ask This

Interviewers ask this to see whether you know when ephemeral storage is appropriate, when persistent storage is needed, and what the security implications of hostPath volumes are.

Common Follow-Up Questions

What happens to emptyDir data when a container crashes?
The data survives container crashes because emptyDir is tied to the Pod lifecycle, not the container lifecycle. It is only deleted when the Pod is removed from the node.

Why is hostPath considered a security risk?
It gives Pods access to the host filesystem, which can be exploited to read secrets, modify system files, or escape the container sandbox.

Can you use emptyDir backed by memory?
Yes. Setting medium: Memory creates a tmpfs mount that uses RAM. It is fast but counts against the container's memory limit.
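The answer to the first question, that emptyDir data survives container crashes, can be checked with a small experiment. In this sketch (hypothetical Pod name, deliberately failing container), each start appends a timestamp to a file on the volume and then exits with an error so the kubelet restarts the container.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: crash-demo               # hypothetical name
spec:
  restartPolicy: Always
  containers:
    - name: flaky
      image: busybox
      # Appends one line per start, prints the whole file, then fails on purpose.
      command: ["sh", "-c", "date >> /data/starts.log; cat /data/starts.log; sleep 10; exit 1"]
      volumeMounts:
        - name: scratch
          mountPath: /data
  volumes:
    - name: scratch
      emptyDir: {}
```

Running kubectl logs crash-demo after a few restarts should show one line per container start, since the file persists across restarts. Deleting the Pod and recreating it starts the file from scratch.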

Key Takeaways

  • emptyDir is ephemeral and deleted with the Pod
  • hostPath exposes the host filesystem and should be avoided in production outside of node-level agents such as DaemonSets
  • Use emptyDir for scratch space, caching, and inter-container data sharing