
'No Space Left on Device' with 40% Disk Free: The Inode and OverlayFS Death Spiral

We ran out of inodes long before disk, and it hurt. “Postgres is crashing with ‘No space left on device’ but we have 40% disk free.” The on-call engineer had checked df -h—plenty of space. They’d checked PersistentVolumeClaims—well under capacity. They’d even checked du -sh /* on the node—nothing unusual. But the PostgreSQL container kept crash-looping, logging write failures to WAL files.

I SSH’d to the node and ran df -i. Inode usage: 98%. That’s when everything clicked. The node’s filesystem had plenty of bytes free, but almost no inodes left. Every file, directory, and symlink consumes one inode, regardless of its size. We’d exhausted our inode allocation while barely touching our byte allocation—and none of our monitoring dashboards tracked inode usage.

The culprit was a logging sidecar that had been writing individual JSON files for each log entry instead of appending to a single file. Over weeks, it had created millions of tiny files in overlayfs layers. The files themselves were small—total maybe 500MB—but they’d consumed 12 million inodes. When Postgres tried to create a new WAL segment, the kernel returned ENOSPC even though there was 40GB free. From Postgres’s perspective, it looked like disk corruption or storage system failure.

What made this particularly insidious was the overlayfs amplification effect. Container images are composed of layers, and on a typical node every image layer and every container's writable upper layer lives on the same backing filesystem, so they all draw from one inode pool. A container that creates lots of small files can drain that pool quickly, because overlayfs allocates a fresh inode in the upper layer even for operations that look like in-place modifications or deletions of image files.

Environment: Kubernetes 1.28+, containerd with overlayfs, nodes running ephemeral workloads, logging and debugging tools

Understanding Inode Exhaustion

Bytes vs Inodes

Filesystem resources are two-dimensional:

Bytes (what df -h shows):
┌─────────────────────────────────────┐
│████████████████████░░░░░░░░░░░░░░░░│ 60% used
└─────────────────────────────────────┘
  Tracks: How much data is stored

Inodes (what df -i shows):
┌─────────────────────────────────────┐
│███████████████████████████████████░│ 98% used ← PROBLEM!
└─────────────────────────────────────┘
  Tracks: How many files/directories exist

One inode per:
  - Regular file (any size, 1 byte or 1TB)
  - Directory
  - Symbolic link
  - Named pipe
  - Socket file

A 1KB log file and a 1GB video both use exactly 1 inode.
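
To see both dimensions side by side without shelling out to df, here is a minimal Python sketch built on os.statvfs. The function name usage_percent is made up for illustration, and the byte percentage follows df's convention of used / (used + available to non-root), which is an assumption about how you want reserved blocks treated.

```python
import os

def usage_percent(path="/"):
    """Return (byte_pct, inode_pct) used for the filesystem holding `path`."""
    st = os.statvfs(path)
    # Bytes: used blocks relative to used + blocks available to unprivileged users.
    used_blocks = st.f_blocks - st.f_bfree
    byte_den = used_blocks + st.f_bavail
    byte_pct = 100.0 * used_blocks / byte_den if byte_den else 0.0
    # Inodes: f_files is the total, f_ffree the free count.
    # Some filesystems report 0 total inodes; treat that as "not applicable".
    inode_pct = 100.0 * (st.f_files - st.f_ffree) / st.f_files if st.f_files else 0.0
    return byte_pct, inode_pct

if __name__ == "__main__":
    byte_pct, inode_pct = usage_percent("/")
    print(f"bytes used: {byte_pct:.1f}%  inodes used: {inode_pct:.1f}%")
```

If the second number is far above the first, the filesystem is dominated by small files.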

How Inode Exhaustion Happens

Common patterns that exhaust inodes:

1. Many small files (per-request logs, temp files)
   10 million 100-byte files = 10M inodes, but only ~1 GB of data

2. Container image churn (many layers stacked)
   Each image layer has its own files
   100 images × 10,000 files/image = 1M inodes

3. Debug/profiling artifacts
   Core dumps, heap dumps, trace files
   pprof profiles creating thousands of temp files

4. Package managers gone wrong
   node_modules, pip downloads, maven caches
   npm install creating 50,000+ files

5. Log rotation without cleanup
   logrotate creating numbered files forever
   Each rotated file: 1 inode
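
A rough way to reason about which resource runs out first is to compare how many files fit by bytes with how many fit by inodes. The sketch below is a simplification: it assumes 4 KiB blocks, ignores filesystem metadata, reserved blocks, and inline-data optimizations, and the 200 GiB / 12.5 million inode volume is a hypothetical example, not the node from the incident.

```python
import math

def files_until_full(total_bytes, total_inodes, file_size, block_size=4096):
    """Return (max_files_by_bytes, max_files_by_inodes, limiting_resource)."""
    # Every non-empty file occupies at least one whole block on disk.
    blocks_per_file = max(1, math.ceil(file_size / block_size))
    by_bytes = total_bytes // (blocks_per_file * block_size)
    by_inodes = total_inodes  # one inode per file, regardless of size
    limiting = "inodes" if by_inodes < by_bytes else "bytes"
    return by_bytes, by_inodes, limiting

if __name__ == "__main__":
    volume = 200 * 2**30   # hypothetical 200 GiB volume
    inodes = 12_500_000    # hypothetical inode count
    print(files_until_full(volume, inodes, 100))     # tiny log files
    print(files_until_full(volume, inodes, 2**30))   # 1 GiB files
```

With 100-byte files the inode limit is reached after 12.5 million files, long before the roughly 52 million that would fit by bytes.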

OverlayFS Amplification

Container overlayfs structure:

Lower layers (read-only, from image):
├── layer1/ (base image: 5,000 files)
├── layer2/ (dependencies: 20,000 files)
└── layer3/ (application: 2,000 files)

Upper layer (writable, container runtime):
└── upper/ (container writes go here)

Merged view (what container sees):
└── merged/ (union of all layers)

Problem scenarios:

1. Modifying a file from lower layer:
   - Copy-up to upper layer = new inode in upper
   - Original inode in lower still exists
   - 1 logical file = 2 inodes consumed

2. Creating lots of temp files:
   - All go to upper layer
   - Each temp file = 1 inode
   - Deleting a file that came from a lower layer leaves a "whiteout" entry in upper (another inode!)

3. Many running containers:
   - Each container has its own upper layer
   - Upper layers are on node's root filesystem
   - All containers share the same inode pool
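
The scenarios above can be sketched with a toy model of the upper layer. This is not how overlayfs is implemented; it only tracks which paths cost an inode in upper, and it ignores directories, hard links, and opaque directories. The class and path names are invented for illustration.

```python
class UpperLayerModel:
    """Toy model of inode use in an overlayfs upper layer (not real overlayfs)."""

    def __init__(self, lower_files):
        self.lower = set(lower_files)  # paths provided by read-only image layers
        self.upper = {}                # path -> "file" or "whiteout"

    def write(self, path):
        # Creating a file, or modifying a lower file (copy-up), puts a file in upper.
        self.upper[path] = "file"

    def delete(self, path):
        if path in self.lower:
            # Hiding a lower file needs a whiteout entry, which costs an inode.
            self.upper[path] = "whiteout"
        else:
            # A file that exists only in upper is removed and its inode is freed.
            self.upper.pop(path, None)

    def upper_inodes(self):
        return len(self.upper)


if __name__ == "__main__":
    m = UpperLayerModel({"/etc/app.conf", "/usr/bin/app"})
    m.write("/etc/app.conf")   # copy-up of an image file
    m.write("/tmp/a")
    m.write("/tmp/b")
    m.delete("/tmp/a")         # upper-only file: inode released
    m.delete("/usr/bin/app")   # image file: whiteout created
    print(m.upper_inodes())
```

The point of the model is the asymmetry: deleting your own temp files gives inodes back, while deleting or rewriting files that shipped in the image costs inodes.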

Diagnosing Inode Exhaustion

Check Inode Usage

# Basic inode check
df -i

# Output:
# Filesystem      Inodes    IUsed   IFree IUse% Mounted on
# /dev/sda1     12582912 12345678  237234   98% /
#                                          ↑ DANGER!

# If IUse% > 90%, you're in the danger zone
# If IUse% = 100%, containers will crash with ENOSPC

# Check specific mount points
df -i /var/lib/containerd
df -i /var/lib/docker
df -i /var/lib/kubelet

Find Inode-Heavy Directories

# Count files recursively (slow but accurate)
find /var/lib/containerd -type f | wc -l

# Faster: Use filesystem debugging tools
# For ext4:
debugfs -R "stats" /dev/sda1 | grep -i inode

# Find snapshots (image layers and container upper layers) with the most files
# Note: -type f counts regular files only; directories and symlinks also use inodes
for dir in /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/*/fs; do
  if [ -d "$dir" ]; then
    count=$(find "$dir" -type f 2>/dev/null | wc -l)
    echo "$count $dir"
  fi
done | sort -rn | head -20
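
Counting with find -type f misses directories and symlinks and double-counts hard links. As an alternative, here is a small Python sketch that counts distinct inodes under a directory. The function name count_inodes is hypothetical, and the set of seen inodes lives in memory, so on a tree with tens of millions of entries you may prefer to run it per subdirectory.

```python
import os
import stat

def count_inodes(root):
    """Count distinct inodes under `root`, without following symlinks."""
    seen = set()
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        seen.add((st.st_dev, st.st_ino))  # hard links share one inode
        if stat.S_ISDIR(st.st_mode):
            try:
                with os.scandir(path) as entries:
                    stack.extend(entry.path for entry in entries)
            except OSError:
                continue  # unreadable directory
    return len(seen)

if __name__ == "__main__":
    import sys
    for target in sys.argv[1:]:
        print(count_inodes(target), target)
```

Run it as root against individual snapshot directories to rank them by real inode cost.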

Identify the Offender

# Find which container/pod is creating files
# Check overlay mount info
mount | grep overlay

# Get container ID from mount path
# /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/12345/fs
# 12345 is the snapshot ID

# Map snapshot to container
crictl ps -o json | jq -r '.containers[] | "\(.id) \(.metadata.name)"'

# Check container's current file count
CONTAINER_ID="abc123"
ROOTFS=$(crictl inspect "$CONTAINER_ID" | jq -r '.info.runtimeSpec.root.path')
find "$ROOTFS" -type f | wc -l

# Watch file creation in real-time
inotifywait -m -r /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/ \
  -e create -e delete 2>/dev/null | head -100

Application-Level Investigation

# Inside the container: find file-heavy directories
kubectl exec -it $POD -- sh -c 'find / -type f 2>/dev/null | cut -d/ -f2 | sort | uniq -c | sort -rn | head'

# Common culprits:
# 500000 tmp       ← temp files not cleaned up
# 200000 var       ← logs, caches
# 100000 app       ← node_modules, vendor

# Find recent file creators
kubectl exec -it $POD -- sh -c 'find /tmp -type f -mmin -60 | head -20'

The Fix

Immediate: Clean Up Inodes

# Delete old container images (on node)
crictl rmi --prune

# Remove stopped containers
crictl rm $(crictl ps -a -q --state exited)

# Clean containerd snapshots
# WARNING: Only do this during maintenance window
ctr -n k8s.io snapshots rm <snapshot-id>

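# Garbage collection: crictl has no command to force it.
# Kubelet removes unused images and dead containers on its own
# (see the imageGC thresholds under Image Hygiene).

# Find and delete obvious temp file spam
# WARNING: preview with -print first; deleting files under
# /var/lib/containerd can corrupt image layers
find /var/lib/containerd -name "*.tmp" -mtime +1 -print
find /var/lib/containerd -name "*.tmp" -mtime +1 -delete
find /var/lib/containerd -name "*.log" -size 0 -delete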

Fix the Application

# Bad: One file per log entry
import json

for event in events:
    with open(f'/logs/event_{event["id"]}.json', 'w') as f:
        json.dump(event, f)
# Creates millions of files

# Good: Append to single file
with open('/logs/events.jsonl', 'a') as f:
    for event in events:
        f.write(json.dumps(event) + '\n')
# Creates 1 file

# Better: Use structured logging to stdout
import logging
import sys

# Without basicConfig the root logger drops INFO records and writes to stderr
logging.basicConfig(stream=sys.stdout, level=logging.INFO, format='%(message)s')
logger = logging.getLogger()
for event in events:
    logger.info(json.dumps(event))
# Creates 0 files (logs go to container runtime)

// Bad: Temp file per request
func handleRequest(r *Request) {
    f, _ := os.CreateTemp("", "request-*")
    // Process with temp file
    // Forget to close and delete: leaks one inode per request
}

// Good: Use memory or single temp file with cleanup
func handleRequest(r *Request) error {
    f, err := os.CreateTemp("", "request-*")
    if err != nil {
        return err
    }
    defer os.Remove(f.Name()) // Always clean up (runs after Close)
    defer f.Close()
    // Process
    return nil
}

// Better: Use in-memory buffer
func handleRequest(r *Request) {
    var buf bytes.Buffer
    // Process in memory
}
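
If a service genuinely has to log to files, the file count can be bounded rather than left to grow. The following is a minimal sketch using the standard library's RotatingFileHandler; the logger name "events", the helper name make_event_logger, and the size and backup values are illustrative assumptions, not settings from the incident.

```python
import json
import logging
from logging.handlers import RotatingFileHandler

def make_event_logger(path, max_bytes=10 * 1024 * 1024, backups=5):
    """Logger that appends JSON lines and keeps at most backups + 1 files."""
    logger = logging.getLogger("events")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # do not duplicate records on the root logger
    # Rolls over to path.1, path.2, ... and deletes anything beyond `backups`.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)  # call once per process to avoid duplicate handlers
    return logger

if __name__ == "__main__":
    log = make_event_logger("events.jsonl", max_bytes=1024, backups=2)
    for i in range(1000):
        log.info(json.dumps({"id": i}))
```

However many events are written, this configuration never holds more than three files (and three inodes) at once.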

Kubernetes Resource Limits

# Limit ephemeral storage (a byte limit; it constrains inode use only loosely)
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    resources:
      limits:
        ephemeral-storage: "2Gi"  # Total ephemeral storage
      requests:
        ephemeral-storage: "1Gi"
    # Kubelet will evict pod if it exceeds limit

# Use emptyDir with size limit for temp files
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: app
    volumeMounts:
    - name: temp
      mountPath: /tmp
  volumes:
  - name: temp  # Must match the volumeMounts name above
    emptyDir:
      sizeLimit: 500Mi  # Caps total size
      # Note: Does NOT cap inode count directly

Node-Level Prevention

# Increase inode count when formatting filesystem
# (Only possible at filesystem creation time; mkfs destroys existing data)
mkfs.ext4 -N 50000000 /dev/sda1  # 50 million inodes

# For new nodes or volumes: Use XFS instead of ext4
# XFS allocates inodes dynamically, harder to exhaust
mkfs.xfs /dev/sda1

# Configure containerd to use separate filesystem
# /etc/containerd/config.toml
# Top-level setting: put containerd's data, including snapshots,
# on a dedicated volume with more inodes
root = "/data/containerd"  # Separate volume (default: /var/lib/containerd)

[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "overlayfs"
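
To pick a value for -N (or the bytes-per-inode ratio, -i), it helps to see how ext4 arrives at its default. The sketch below assumes the stock mke2fs.conf ratio of 16384 bytes per inode, which commonly applies to mid-sized volumes; check your distribution's mke2fs.conf, since mke2fs also rounds the result per block group. The 192 GiB volume size is an assumption chosen because it reproduces the 12,582,912 total from the df -i sample earlier.

```python
def ext4_inode_count(fs_bytes, bytes_per_inode=16384):
    """Approximate inode count for a given bytes-per-inode ratio."""
    return fs_bytes // bytes_per_inode

def ratio_for_target(fs_bytes, target_inodes):
    """Bytes-per-inode ratio needed to reach roughly target_inodes."""
    return fs_bytes // target_inodes

if __name__ == "__main__":
    volume = 192 * 2**30  # assumed 192 GiB volume
    print(ext4_inode_count(volume))              # default ratio
    print(ratio_for_target(volume, 50_000_000))  # ratio for ~50M inodes
```

On that volume, reaching 50 million inodes means roughly one inode per 4 KiB instead of one per 16 KiB, at the cost of more space reserved for inode tables.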

Image Hygiene

# Bad: Many layers with many files
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
RUN npm prune --production
# Result: node_modules in multiple layers = inode explosion

# Good: Multi-stage build, minimal final image
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
RUN npm prune --production

FROM node:18-slim
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
# Result: Fewer layers, fewer duplicate files

# Clean up unused images regularly
# Kubernetes garbage collection
# /var/lib/kubelet/config.yaml
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMinimumGCAge: 2m

# Manual cleanup
crictl rmi --prune          # containerd / CRI nodes
docker system prune -a -f   # Docker-based nodes only

Monitoring

Prometheus Alerts

groups:
- name: inode-exhaustion
  rules:
  - alert: InodeUsageHigh
    expr: |
      (1 - node_filesystem_files_free{mountpoint="/"}
         / node_filesystem_files{mountpoint="/"}) > 0.85
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Inode usage >85% on {{ $labels.instance }}"
      description: "Check for temp file spam or container image churn"

  - alert: InodeUsageCritical
    expr: |
      (1 - node_filesystem_files_free{mountpoint="/"}
         / node_filesystem_files{mountpoint="/"}) > 0.95
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Inode usage >95% on {{ $labels.instance }}"
      description: "Imminent ENOSPC for new file creation"

  - alert: InodeVsBytesMismatch
    expr: |
      # High inode usage but low byte usage = many small files
      (1 - node_filesystem_files_free{mountpoint="/"}
         / node_filesystem_files{mountpoint="/"}) > 0.8
      AND
      (1 - node_filesystem_avail_bytes{mountpoint="/"}
         / node_filesystem_size_bytes{mountpoint="/"}) < 0.5
    for: 30m
    labels:
      severity: warning
    annotations:
      summary: "Inode/bytes mismatch on {{ $labels.instance }}"
      description: "Many small files detected - investigate temp file spam"

Grafana Dashboard Queries

# Inode usage percentage
(1 - node_filesystem_files_free{mountpoint="/"} / node_filesystem_files{mountpoint="/"}) * 100

# Free inodes (raw count)
node_filesystem_files_free{mountpoint="/"}

# Inode usage rate (are we trending toward exhaustion?)
deriv(node_filesystem_files_free{mountpoint="/"}[1h])

# Bytes vs inodes comparison
# Panel 1: Byte usage %
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100

# Panel 2: Inode usage %
(1 - node_filesystem_files_free{mountpoint="/"} / node_filesystem_files{mountpoint="/"}) * 100

# If inode % >> byte %, you have many small files
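
The deriv() query gives a slope; turning that slope into a time-to-exhaustion makes it easier to act on. PromQL's predict_linear does a similar regression server-side. For ad-hoc use on a node, here is a minimal sketch of the same idea, assuming you have collected (unix_time, free_inodes) samples yourself; the function name is hypothetical.

```python
def seconds_until_exhaustion(samples):
    """Least-squares ETA from (unix_time, free_inodes) samples.

    Returns None when there are too few samples or free inodes are not shrinking.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp
    # Slope of the fitted line, in inodes per second (negative when shrinking).
    slope = sum((t - mean_t) * (f - mean_f) for t, f in samples) / var_t
    if slope >= 0:
        return None
    _, last_free = samples[-1]
    return last_free / -slope

if __name__ == "__main__":
    history = [(0, 1000), (10, 900), (20, 800)]
    print(seconds_until_exhaustion(history))
```

A linear fit is only a first approximation; bursty workloads can exhaust inodes much faster than the trend suggests.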

Node Health Check Script

#!/bin/bash
# inode-check.sh - Run via DaemonSet or cron

THRESHOLD=85

INODE_PCT=$(df -i / | awk 'NR==2 {gsub(/%/,""); print $5}')
BYTE_PCT=$(df -h / | awk 'NR==2 {gsub(/%/,""); print $5}')

if [ "$INODE_PCT" -gt "$THRESHOLD" ]; then
    echo "CRITICAL: Inode usage at ${INODE_PCT}% (bytes: ${BYTE_PCT}%)"

    # Find top inode consumers
    echo "Top inode consumers:"
    for dir in /var/lib/containerd /var/lib/docker /var/log /tmp; do
        if [ -d "$dir" ]; then
            count=$(find "$dir" -type f 2>/dev/null | wc -l)
            echo "  $dir: $count files"
        fi
    done

    exit 1
fi

echo "OK: Inode usage at ${INODE_PCT}%"
exit 0

Checklist

## Inode Exhaustion Prevention and Response

### Detection
- [ ] Check df -i (not just df -h)
- [ ] Compare inode% vs byte% (mismatch = many small files)
- [ ] Find directories with most files
- [ ] Identify container/pod creating files

### Immediate Fix
- [ ] Clean up temp files and logs
- [ ] Prune unused container images
- [ ] Delete old snapshots
- [ ] Restart offending workloads

### Prevention
- [ ] Add inode monitoring to dashboards
- [ ] Alert on inode usage > 85%
- [ ] Set ephemeral storage limits on pods
- [ ] Use multi-stage Docker builds

### Application Fixes
- [ ] Log to stdout, not individual files
- [ ] Clean up temp files after use
- [ ] Use append-mode for logs, not file-per-entry
- [ ] Limit cache/temp directories

Conclusion

Inode exhaustion is one of those failure modes that feels like it shouldn’t happen in 2024. It’s an old-school Unix problem that modern engineers rarely encounter—until containers make it relevant again. The overlayfs storage driver, combined with workloads that generate many small files, can exhaust inodes surprisingly quickly. And because most monitoring dashboards only track byte usage, the problem is invisible until containers start failing with cryptic ENOSPC errors.

The core issue is that df -h lies by omission. It shows you byte usage, which is almost always fine. It doesn’t show you inode usage, which might be at 98%. When Postgres tries to create a new WAL file and gets ENOSPC, no amount of staring at “40% disk free” will help you diagnose it. You have to know to run df -i.

The fix is usually straightforward once you identify the problem: clean up temp files, prune container images, fix the application that’s spamming small files. The harder part is prevention—making sure your monitoring includes inode usage, and that your applications are designed to avoid creating millions of tiny files.

Key principles:

  1. df -h shows bytes, df -i shows inodes—you need both to understand disk health
  2. One file = one inode regardless of size, so a million 1-byte files use a million inodes
  3. OverlayFS amplifies inode consumption—copy-up and whiteouts create extra inodes
  4. Container image churn exhausts inodes—layers accumulate files across images
  5. Log to stdout, not per-request files—let the container runtime handle log aggregation

Check your nodes’ inode usage now. The next mysterious ENOSPC might already be approaching.


Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "'No Space Left on Device' with 40% Disk Free: The Inode and OverlayFS Death Spiral". https://www.michal-drozd.com/en/blog/kubernetes-inode-exhaustion-overlayfs/ (Published December 7, 2025).