
DEBUG GUIDE · PERFORMANCE · SRE PLAYBOOK

Debugging Disk Full & I/O Bottlenecks.

"No space left" comes in three flavours: out of blocks, out of inodes, or space held by a deleted-but-open file. Separately, a slow disk usually shows up as high iowait. Work out which case you have before fixing anything.

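The triage order above can be sketched as a small shell helper. This is only an illustration: the function names use_pct, classify_full and triage_path, and the 95% default threshold, are assumptions and not part of the original guide. triage_path also assumes GNU df, where -P and -i can be combined.

```shell
# Sketch only: helper names and the 95% default are illustrative assumptions.

# Read `df -P` or `df -Pi` output on stdin and print the use% of the first
# filesystem row (5th column) as a bare integer.
use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Map the two percentages to the flavour to investigate first.
# Args: blocks% inodes% [threshold, default 95]
classify_full() {
    thr=${3:-95}
    if [ "$2" -ge "$thr" ]; then
        echo "inodes"
    elif [ "$1" -ge "$thr" ]; then
        echo "blocks"
    else
        echo "not-full"
    fi
}

# Glue for a live system (assumes GNU df, where -P and -i combine).
triage_path() {
    classify_full "$(df -P "$1" | use_pct)" "$(df -Pi "$1" | use_pct)" "${2:-95}"
}
```

Something like triage_path /var would then print one of the three labels. A "blocks" result also covers the deleted-but-open case: if du cannot account for the used space, move on to the lsof check.
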
Disk full (blocks)

df -h
du -xh / 2>/dev/null | sort -rh | head -20   # biggest dirs
# usual: /var/log, /var/lib/docker, core dumps

Fix. Rotate/truncate logs; docker system prune; grow the volume; add log rotation.
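
For the "truncate logs" option, a minimal sketch of emptying a log in place rather than deleting it. Removing a log that a process still has open frees nothing until the handle closes, so truncation is usually the safer quick fix. The helper name truncate_in_place is hypothetical.

```shell
# Hypothetical helper: empty a log in place instead of deleting it.
# The file keeps its inode, so a process that has it open keeps writing
# to the same file and the blocks are released immediately.
truncate_in_place() {
    [ -f "$1" ] || { echo "not a regular file: $1" >&2; return 1; }
    : > "$1"
}
```

Example: truncate_in_place /var/log/myapp.log (the path is illustrative). If the writer did not open the file in append mode, it keeps writing at its old offset and the file becomes sparse, so proper log rotation remains the long-term fix.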

Out of inodes

"No space left" but df -h shows free space: the filesystem is out of inodes, usually because of millions of tiny files.

df -i        # IUse% 100%
for d in /var/*; do echo "$(find "$d" -xdev 2>/dev/null | wc -l) $d"; done | sort -rn | head   # entries per dir

Fix. Delete the file swarm; fix the producer.
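
To find the file swarm below any directory, not just /var, the one-liner above can be wrapped in a function. This is a sketch under assumptions: the name inode_hogs and its arguments are hypothetical, and file names containing newlines would be over-counted.

```shell
# Hypothetical helper: rank the immediate subdirectories of a path by the
# number of entries beneath them (each entry costs one inode).
# Args: path [how many rows to show, default 10]
inode_hogs() {
    for d in "${1%/}"/*/; do
        [ -d "$d" ] || continue
        # -xdev keeps find on one filesystem; tr strips padding that some
        # wc implementations add.
        n=$(find "$d" -xdev 2>/dev/null | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$d"
    done | sort -rn | head -n "${2:-10}"
}
```

Run it top-down, for example inode_hogs /var and then again on the largest result, until the directory holding the swarm is obvious.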

df -h lies, df -i tells. Free space but "disk full" = inode exhaustion. Always check df -i.

Deleted-but-open files

df says full but du can't account for the space: a process deleted a big file but still holds the handle, so the blocks are not freed until that handle closes.

lsof +L1        # link count 0, still open
lsof -nP | grep '(deleted)'

Fix. Restart/signal the holding process (classic: logging to a rotated-away file).
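
When lsof is not installed, the same information is available from /proc on Linux. The following is a sketch, not a replacement for lsof: the helper name deleted_open_files is hypothetical, it inspects one PID at a time, and it assumes a stat that supports -L and -c (GNU coreutils or BusyBox).

```shell
# Hypothetical helper: list deleted-but-open files held by one PID.
# Output: size in bytes, fd number, link target (tab-separated).
deleted_open_files() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                # stat -L follows the /proc link to the still-open inode.
                size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "$size" "${fd##*/}" "$target"
                ;;
        esac
    done
}
```

If the process cannot be restarted right away, the space can usually be reclaimed by truncating through the handle, for example : > /proc/1234/fd/7, where 1234 and 7 stand for the PID and fd number reported above. Reading other users' processes requires root.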

I/O bottleneck

iostat -xz 1     # %util ~100%, await high (r_await/w_await on current sysstat)
iotop -o         # which process

Fix. Reduce I/O (batch writes, cache, fewer fsyncs); faster storage tier; add DB indexes to cut scans.
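
To confirm that high iowait is the symptom before and after a fix, the share can be computed from two samples of the aggregate cpu line in /proc/stat. This is a minimal sketch for Linux: the names iowait_pct and sample_iowait are hypothetical, and the total is taken as the sum of the user through steal counters.

```shell
# Hypothetical helper: iowait share (percent) between two "cpu ..." lines
# taken from /proc/stat. Fields: cpu user nice system idle iowait irq
# softirq steal ..., so field 6 is iowait and fields 2-9 form the total.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          for (i = 2; i <= 9; i++) total += $i
          t[NR] = total; w[NR] = $6 }
        END { dt = t[2] - t[1]
              if (dt <= 0) { print "0.0"; exit }
              printf "%.1f\n", 100 * (w[2] - w[1]) / dt }'
}

# Sample the live system twice, N seconds apart (default 1).
sample_iowait() {
    a=$(head -n 1 /proc/stat)
    sleep "${1:-1}"
    b=$(head -n 1 /proc/stat)
    iowait_pct "$a" "$b"
}
```

For example, sample_iowait 5 prints the iowait percentage over a five-second window. A sustained high value alongside %util near 100% in iostat points at the disk rather than the CPU.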

Quick reference

df -h ; df -i ; du -xh / 2>/dev/null | sort -rh | head
lsof +L1 ; iostat -xz 1 ; iotop -o
docker system df ; docker system prune
© cvam — written in plaintext, served warm