Storage Volumes in Kubernetes
1. Motivation & Problem Setup
Warning
Containers are ephemeral: when a pod dies, its local storage is lost.
There is a need for persistence
- Database workloads (MongoDB, MySQL, PostgreSQL).
- Logs that must outlive a pod.
- Shared storage between pods.
The Kubernetes approach
- Abstract away underlying storage to achieve portability across clusters (on-prem, cloud, hybrid).
Design Emphasis
- Separation of compute resources (pods) and storage resources (volumes).
- Lifecycle mismatch: pod vs. volume.
- Pods are more ephemeral and can have a shorter lifecycle (node failure, maintenance, etc.).
- Volumes must persist to provide long-term, reliable storage of data.
- Tension between flexibility (dynamic provisioning) and control (security, quotas).
2. Kubernetes Volumes: The Basics
Ephemeral volumes: emptyDir
- Ephemerality is tied to the Pod, not to individual containers.
- Persistence across container restarts within the same Pod.
- First created when a Pod is assigned to a node.
- Exists as long as the Pod remains on the same node.
- Useful for scratch space.
- Temporary storage
- Shared workspace among containers
- Temporary log aggregation location
Hands-on
- Create the following Pod manifest called `emptyDir.yaml`.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydirpod
spec:
  containers:
    - name: my-app-container
      image: nginx
      volumeMounts:
        - name: shared-data
          mountPath: /var/data
    - name: my-sidecar-container
      image: busybox
      command: ["sh", "-c", "echo 'hello from sidecar' > /shared/file.txt && sleep 60"]
      volumeMounts:
        - name: shared-data
          mountPath: /shared
  volumes:
    - name: shared-data
      emptyDir: {}
```
- Deploy the pod using `kubectl apply -f emptyDir.yaml`.
- Use `kubectl get pods` to identify the pod name.
- Use `kubectl describe pod` to identify the container names.
- Use `kubectl exec` with the additional `-c` flag to get into the `my-app-container` container.
- Confirm the content inside `/var/data/file.txt`.
Persistent volumes: hostPath
- Direct mounting from the host node filesystem to the pod
- Not recommended for production.
- Node-specific
- Breaks the abstraction between the physical host and the container
- Security implications
- Persistent: data remains on the node after the pod is gone
Hands-on
- Create the following Pod manifest called `hostpath.yaml`.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpathpod
spec:
  containers:
    - name: my-container
      image: busybox
      command: ["sh", "-c", "echo 'hello from the other side' > /mnt/hostpath/file.txt && sleep 3600"]
      volumeMounts:
        - name: host-volume
          mountPath: /mnt/hostpath
  volumes:
    - name: host-volume
      hostPath:
        path: /home/YOUR_USER_NAME_HERE/hostpath
        type: DirectoryOrCreate
```
- Deploy the pod using `kubectl apply -f hostpath.yaml`.
- Once the pod is ready, check the content inside the `~/hostpath/` directory.
- Create another file inside the `~/hostpath/` directory.
- Use `kubectl exec` to get into the pod and check the content of the `/mnt/hostpath/` directory.
3. Persistent Volumes and Claims
Persistent Volume (PV)
- Abstract representation of a piece of storage in a Kubernetes cluster.
- Provisioned by an administrator or dynamically through a `StorageClass`.
- Independent of a Pod's lifecycle.
- Represents different types of underlying storage (a minimal example follows this list):
- Local storage
- NFS shares
- Other types of storage
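As a sketch, a statically provisioned, NFS-backed PV might look like the following; the name, size, server address, and export path are placeholders for your environment.

```yaml
# Illustrative PV only: the NFS server and path below do not exist.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 5Gi               # how much storage this PV offers
  accessModes:
    - ReadWriteMany            # NFS can typically be mounted by many nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal   # placeholder NFS server
    path: /exports/data            # placeholder export path
```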
Persistent Volume Claim (PVC)
- User request for storage.
- Namespace-scoped resource that defines the desired characteristics of the storage
- Size
- Access modes
- ReadWriteOnce (RWO): one node mounted read/write.
- ReadOnlyMany (ROX): multiple nodes, read-only.
- ReadWriteMany (RWX): multiple nodes, read/write.
- Storage Class (optional)
- Kubernetes attempts to bind a PVC to an available PV that satisfies its requirements (an example claim follows this list).
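A minimal claim might look like this sketch; the name, size, and the optional storage class are placeholders.

```yaml
# Illustrative PVC: requests 1Gi of RWO storage from the default storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName: standard   # optional; omit to use the cluster default
```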
Workflow
- Administrators provision PVs or define Storage Classes.
- Users/applications create PVCs.
- Kubernetes binds the PVC to a matching PV.
- Pods mount the PVC as specified in their manifest (an example pod follows this list).
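For illustration, a pod consuming a claim like the one sketched earlier could look as follows; the pod name and the `example-pvc` claim name are the placeholders used above.

```yaml
# Illustrative pod that mounts an existing PVC at /var/data.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /var/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-pvc   # binds this pod to the claim, not to a specific PV
```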
Reclaim Policies
- Retain
- PV is not deleted; data remains intact.
- Storage asset is orphaned until an admin manually reclaims it (by wiping or reassigning).
- Best for sensitive data (databases, regulated workloads).
- Example Use Case: Production database volumes (Postgres, MySQL, MongoDB) where accidental data loss must be avoided.
- Delete
- Both the PV object and the underlying storage asset (e.g., AWS EBS volume, GCP PD, Ceph RBD) are deleted.
- Fully automated cleanup.
- Data is lost once PVC is deleted.
- Example Use Case: Scratch workloads, CI/CD environments, development namespaces, or any scenario where ephemeral but persistent-like storage is needed (a StorageClass sketch follows this list).
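As a sketch of the Delete policy in a dynamic-provisioning setup, the reclaim policy can be set on a StorageClass so that every PV it provisions is cleaned up along with its PVC. This assumes the AWS EBS CSI driver (`ebs.csi.aws.com`) is installed; substitute the provisioner used in your environment.

```yaml
# Illustrative StorageClass: dynamically provisioned volumes are deleted with their PVCs.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: scratch-ebs
provisioner: ebs.csi.aws.com        # assumed CSI driver; replace for your cluster
reclaimPolicy: Delete               # underlying volume is removed when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
```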
Diagram
```mermaid
flowchart TD
    subgraph NodeLocal["Node-local Storage"]
        A["emptyDir (ephemeral)"]
        B["hostPath (node filesystem)"]
    end
    subgraph ClusterStorage["Cluster / External Storage"]
        C["Physical backend (NFS, EBS, Ceph, Longhorn, etc.)"]
        D["PersistentVolume (PV)"]
    end
    E["PersistentVolumeClaim (PVC)"]
    F["Pod"]
    %% relationships
    A --> F
    B --> F
    C --> D
    D --> E
    E --> F
```
Key Point
- Volumes are not tied to containers, but to pods
- Data is preserved across container restarts but not across pod rescheduling, unless backed by PV/PVC.
4. PVs on a Multi-node Cluster
Warning
- A Persistent Volume (PV) is a cluster-wide resource.
- However, the actual storage backend dictates whether a pod scheduled on another node can still access that PV.
hostPath-backed PV
- hostPath ties the PV to a directory on a specific node.
- If the pod is scheduled elsewhere, the mount fails.
- The pod may get stuck in ContainerCreating with volume mount errors.
- hostPath is not recommended for production or multi-node clusters.
Local Path provisioner (per-node local disks)
- Same limitation: bound to the node where it was created.
- The provisioner usually pins the pod to the node with the volume (via node affinity).
- Good for dev/test clusters, but limited for failover (an example claim follows this list).
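For example, on clusters that ship the Rancher local-path provisioner (such as k3s), a claim simply references its storage class; the `local-path` name below is that provisioner's default and is an assumption that may differ on your cluster.

```yaml
# Illustrative PVC against a node-local dynamic provisioner (RWO only).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-scratch
spec:
  accessModes:
    - ReadWriteOnce            # local disks cannot be shared across nodes
  storageClassName: local-path # assumed local-path provisioner class name
  resources:
    requests:
      storage: 2Gi
```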
Networked / Remote Storage (NFS, Ceph, EBS, Longhorn, etc.)
- PV points to shared storage that all nodes can reach over the network.
- If a pod is rescheduled to Node 2 or Node 4, the volume is reattached automatically.
- This is the production pattern:
- PVs abstract away location, Kubernetes handles re-mounting on whichever node the pod lands.
Access Modes matter
- ReadWriteOnce (RWO): only one node can mount read/write at a time (e.g., AWS EBS).
- Pod rescheduling detaches from Node 1, attaches to Node 2.
- ReadWriteMany (RWX): multiple pods/nodes can read/write concurrently (e.g., NFS, CephFS, Longhorn).
- Needed for shared workloads (like WordPress + PHP pods writing to the same files); a Deployment sharing one RWX claim is sketched after this list.
- ReadOnlyMany (ROX): multiple pods/nodes can read concurrently, but no writes.
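As a sketch, a Deployment whose replicas all write to one RWX claim might look like this; the names and the `shared-data` claim are placeholders, and the claim must be bound to an RWX-capable backend such as NFS, CephFS, or Longhorn.

```yaml
# Illustrative Deployment: two replicas mount the same RWX-backed claim.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-writer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: shared-writer
  template:
    metadata:
      labels:
        app: shared-writer
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: shared
              mountPath: /usr/share/nginx/html
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-data   # placeholder RWX claim shared by both replicas
```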
Scheduling awareness
- When a pod requests a PVC:
- Kubernetes binds the PVC to a PV.
- If the PV backend is node-local, the scheduler adds node affinity so the pod only runs on that node (see the local PV sketch after this list).
- If the PV backend is networked, the pod can run on any node and the volume follows.
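For node-local storage, that pinning is expressed as node affinity on the PV itself. The sketch below uses the `local` volume type; the hostname, path, and storage class name are placeholders, and a matching StorageClass (typically a no-provisioner class with `volumeBindingMode: WaitForFirstConsumer`) is assumed to exist.

```yaml
# Illustrative node-local PV: the nodeAffinity block restricts consumers to node-1.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage   # assumed no-provisioner StorageClass
  local:
    path: /mnt/disks/ssd1           # placeholder path on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1            # placeholder node name
```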
Diagram
```mermaid
flowchart TD
    subgraph Cluster["Kubernetes Cluster"]
        N1["Node 1"]
        N2["Node 2"]
        N3["Node 3"]
        N4["Node 4"]
    end
    subgraph Storage["Storage Backend"]
        H["hostPath (Node-local)"]
        E["EBS / NFS / Ceph / Longhorn (Networked)"]
    end
    PV1["PV (hostPath)"] --> H
    PV2["PV (Networked)"] --> E
    N1 -. can use .-> PV1
    N2 -. cannot use .-> PV1
    N3 -. cannot use .-> PV1
    N4 -. cannot use .-> PV1
    N1 -- can mount --> PV2
    N2 -- can mount --> PV2
    N3 -- can mount --> PV2
    N4 -- can mount --> PV2
```