What Are emptyDir and hostPath Volume Types in Kubernetes?
emptyDir is a temporary volume created when a Pod is assigned to a node and deleted when the Pod is removed. hostPath mounts a file or directory from the host node's filesystem into a Pod. emptyDir is safe for scratch space; hostPath is risky and generally discouraged in production.
Detailed Answer
emptyDir
An emptyDir volume is created when a Pod is scheduled onto a node and exists for the lifetime of that Pod. It starts empty and all containers in the Pod can read and write to it. When the Pod is removed from the node (for any reason), the data is permanently deleted.
Basic Usage
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /data/greeting.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /data
    - name: reader
      image: busybox
      # Containers in a Pod do not wait for each other, so poll until the file exists.
      command: ["sh", "-c", "until [ -f /data/greeting.txt ]; do sleep 1; done; cat /data/greeting.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /data
  volumes:
    - name: scratch
      emptyDir: {}
In this example, both containers mount the same emptyDir volume. The writer container creates a file, and the reader container reads it from the shared path. This is a common pattern for sharing files between a main container and its sidecars.
Memory-Backed emptyDir
For workloads needing very fast I/O, you can back emptyDir with RAM:
volumes:
  - name: cache
    emptyDir:
      medium: Memory
      sizeLimit: 256Mi
Key considerations:
- Uses tmpfs (RAM-backed filesystem)
- Data counts against the container's memory limit
- Data is lost on Pod eviction or node restart
- Very fast for read/write operations
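Because files in a memory-backed volume are charged to the container that writes them, it helps to set the container's memory limit above the volume's sizeLimit. The following is a minimal sketch under that assumption; the Pod name, mount path, and sizes are illustrative and not taken from any particular workload:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod              # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      resources:
        limits:
          # Illustrative value: has to cover the process's own memory
          # plus whatever it writes under /cache.
          memory: 512Mi
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory
        sizeLimit: 256Mi
```

If the limit were set at or below 256Mi here, filling the cache could push the container over its memory limit before the volume's own limit is reached.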
Size Limits
You can set a size limit to prevent a runaway process from filling the node's disk:
volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi
If the volume exceeds the limit, the kubelet evicts the Pod.
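Disk-backed emptyDir usage also counts toward the Pod's local ephemeral storage, so a sizeLimit is often paired with ephemeral-storage requests and limits on the containers. A rough sketch, assuming a hypothetical Pod named bounded-scratch and illustrative sizes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bounded-scratch        # hypothetical name
spec:
  containers:
    - name: worker
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      resources:
        requests:
          ephemeral-storage: 1Gi   # illustrative: helps the scheduler pick a node with room
        limits:
          ephemeral-storage: 2Gi   # illustrative: caps the container's local storage use
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi             # caps this volume specifically
```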
Common Use Cases for emptyDir
- Scratch space: Disk-based sorting, temporary computation
- Caching: Warm-up data, pre-processed assets
- Sidecar communication: Sharing files between containers in a Pod (e.g., log files, config generated by an init container)
- Build artifacts: CI/CD Pods that compile code and pass artifacts between stages
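The init-container case from the list above can be sketched as follows. The Pod name, file name, and file contents are hypothetical. Init containers run to completion before the application containers start, so the file is guaranteed to exist when the app reads it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-config-pod        # hypothetical name
spec:
  initContainers:
    - name: render-config
      image: busybox
      # Writes a config file into the shared volume, then exits.
      command: ["sh", "-c", "echo 'greeting=hello' > /config/app.conf"]
      volumeMounts:
        - name: config
          mountPath: /config
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "cat /config/app.conf && sleep 3600"]
      volumeMounts:
        - name: config
          mountPath: /config
          readOnly: true       # the app only needs to read the generated file
  volumes:
    - name: config
      emptyDir: {}
```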
hostPath
A hostPath volume mounts a file or directory from the host node's filesystem directly into the Pod.
Basic Usage
apiVersion: v1
kind: Pod
metadata:
  name: host-reader
spec:
  containers:
    - name: reader
      image: busybox
      command: ["sh", "-c", "ls /host-logs && sleep 3600"]
      volumeMounts:
        - name: logs
          mountPath: /host-logs
          readOnly: true
  volumes:
    - name: logs
      hostPath:
        path: /var/log
        type: Directory
hostPath Types
| Type | Behavior |
|---|---|
| "" (empty) | No checks; mount whatever is at the path |
| DirectoryOrCreate | Creates the directory if it does not exist |
| Directory | Path must exist and be a directory |
| FileOrCreate | Creates the file if it does not exist |
| File | Path must exist and be a file |
| Socket | Path must be a Unix socket |
| CharDevice | Path must be a character device |
| BlockDevice | Path must be a block device |
volumes:
  - name: docker-socket
    hostPath:
      path: /var/run/docker.sock
      type: Socket
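The DirectoryOrCreate type from the table is useful when an agent needs a directory on the node that may not exist yet. A minimal sketch, assuming a hypothetical path /var/lib/example-agent:

```yaml
volumes:
  - name: agent-state
    hostPath:
      path: /var/lib/example-agent   # hypothetical path, not a standard location
      type: DirectoryOrCreate        # kubelet creates an empty directory (mode 0755) if it is missing
```

The created directory has the same ownership as the kubelet, so a container running as a non-root user may not be able to write to it without further setup.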
Why hostPath Is Dangerous
hostPath volumes create significant security and reliability problems:
- Security risk: A Pod with hostPath access can read sensitive files like /etc/shadow, kubelet credentials, or other Pods' data.
- Node coupling: The Pod is tightly bound to a specific node's filesystem, breaking portability.
- Data inconsistency: Different nodes have different filesystems, so the same hostPath may contain different data.
- No cleanup: Data written by Pods persists on the node after the Pod is deleted.
The Baseline and Restricted Pod Security Standards both forbid hostPath volumes, so a namespace that enforces either level rejects Pods that use them.
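With the built-in Pod Security Admission controller, a level is applied to a namespace through labels. A minimal sketch, assuming a hypothetical namespace named team-apps:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-apps                                   # hypothetical name
  labels:
    # Reject Pods that violate the Baseline standard (which includes hostPath volumes).
    pod-security.kubernetes.io/enforce: baseline
    # Only warn about violations of the stricter Restricted standard.
    pod-security.kubernetes.io/warn: restricted
```

A Pod with a hostPath volume submitted to this namespace would be rejected at admission time rather than scheduled.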
Legitimate Use Cases for hostPath
Despite its risks, hostPath is appropriate in a few scenarios:
- DaemonSets that need access to node-level data (e.g., log collectors reading /var/log, monitoring agents reading /sys)
- Container runtime access (e.g., mounting the Docker or containerd socket for build tools)
- Node-level agents that need to write to the host filesystem
# Fluentd DaemonSet reading node logs
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluentd:v1.16
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
            type: Directory
        # Docker-specific path: with type Directory the Pod fails to start on
        # nodes where it does not exist (e.g., containerd-based nodes).
        - name: containers
          hostPath:
            path: /var/lib/docker/containers
            type: Directory
Comparison Summary
| Feature | emptyDir | hostPath |
|---|---|---|
| Lifecycle | Pod lifecycle | Persists on node |
| Scope | Pod-local | Node filesystem |
| Security | Safe | Dangerous |
| Portability | Fully portable | Node-specific |
| Use case | Scratch/cache/sidecar | DaemonSet node access |
| Pod Security | Allowed | Blocked at Baseline and Restricted |
Best Practice
Use emptyDir for any temporary or scratch data needs within a Pod. Avoid hostPath unless you are writing a DaemonSet that genuinely needs node-level access, and always mount it as readOnly: true when possible. For persistent data, use PersistentVolumeClaims backed by a CSI driver.
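For the persistent-data case, a claim and a Pod that mounts it might look like the sketch below. The names, the requested size, and the storage class "standard" are assumptions; the storage class in particular depends on what the cluster's CSI driver provides:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data               # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard   # assumption: replace with a class that exists in your cluster
  resources:
    requests:
      storage: 5Gi             # illustrative size
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-data          # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data    # must match the claim's metadata.name
```

Unlike an emptyDir, the data behind the claim outlives the Pod, and unlike a hostPath it is not tied to whatever happens to be on one node's filesystem.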
Why Interviewers Ask This
Interviewers ask this to see if you know when ephemeral storage is appropriate versus persistent storage and understand the security implications of hostPath volumes.
Key Takeaways
- emptyDir is ephemeral and deleted with the Pod
- hostPath exposes the host filesystem and should be avoided in production
- Use emptyDir for scratch space, caching, and inter-container data sharing