System debugging, diagnostic tools, and methodologies: books, the right tools, and the playbooks for when things break.
Books
| Resource | What | Link |
| The Art of Debugging (GDB/DDD/Eclipse) | Find + fix code defects. | book |
| Systems Performance — Brendan Gregg | Enterprise + cloud tuning. | book |
| Debugging: The 9 Indispensable Rules | Systematic methodology. | book |
| The Practice of System & Network Admin | Enterprise admin. | book |
Research Papers
| Resource | What | Link |
| Effective Troubleshooting — Google SRE | Systematic prod debugging. | site |
| Why Do Internet Services Fail? | Oppenheimer et al., USITS '03. | pdf |
| The Tail at Scale | Dean & Barroso on tail latency. | pdf |
GitHub Repositories
| Resource | What | Link |
| linux-insides | Linux internals deep dive. | repo |
| Awesome Incident Response | Tools collection. | repo |
| Linux Hardening Checklist | Debug + harden. | repo |
| Kubernetes | K8s source code, for debugging internals. | repo |
Videos & Courses
| Resource | What | Link |
| Brendan Gregg — Performance Talks | Systems perf expert. | video |
| Kubernetes Debugging Docs | Official debug tasks. | site |
| Docker Troubleshooting | Official daemon guide. | site |
Tools
| Resource | What | Link |
| strace | System call tracer. | site |
| tcpdump | Packet capture. | site |
| Wireshark | Protocol analyzer. | site |
| htop | Process monitor. | site |
| perf | Kernel performance analysis. | site |
| gdb | GNU debugger. | site |
Articles & Blogs
| Resource | What | Link |
| Brendan Gregg's Blog | Systems performance. | site |
| Red Hat — RHEL Troubleshooting | Systematic approach. | site |
| Kubernetes Debug Docs | K8s troubleshooting. | site |
Recommended Reading
| Resource | What | Link |
| USE Method — Brendan Gregg | Util/Saturation/Errors checklist. | site |
| Performance Methodology | Brendan Gregg's methods. | site |
| Network Troubleshooting | Red Hat solution guide. | site |
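The USE Method entry above describes a checklist: for every resource, look at utilization, saturation, and errors. A minimal Python sketch of that loop follows. The `ResourceSample` type, the `use_checklist` function, the sample numbers, and the 0.7 utilization threshold are all hypothetical choices made for illustration, not values taken from the linked material. Checking errors first is simply the ordering chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One observation of a resource; all field names are hypothetical."""
    name: str
    utilization: float  # fraction of time busy, 0.0 to 1.0
    saturation: float   # amount of queued or waiting work (0 means none)
    errors: int         # count of error events


def use_checklist(samples, util_threshold=0.7):
    """Return (resource, metric) pairs that deserve a closer look.

    The threshold is an assumed starting point, not a universal rule.
    """
    findings = []
    for s in samples:
        if s.errors > 0:
            findings.append((s.name, "errors"))
        if s.saturation > 0:
            findings.append((s.name, "saturation"))
        if s.utilization >= util_threshold:
            findings.append((s.name, "utilization"))
    return findings


# Made-up sample values, standing in for numbers you would collect
# with your own monitoring tools.
samples = [
    ResourceSample("cpu", utilization=0.95, saturation=3.0, errors=0),
    ResourceSample("disk", utilization=0.20, saturation=0.0, errors=2),
    ResourceSample("network", utilization=0.10, saturation=0.0, errors=0),
]

for resource, metric in use_checklist(samples):
    print(f"{resource}: check {metric}")
```

In practice the samples would come from whatever your platform exposes; the point of the sketch is only the shape of the checklist: iterate over resources, ask the same three questions of each, and follow up on whatever is flagged.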
Where to Start
Learn the methodology first (Brendan Gregg's USE Method and Performance Methodology pages, plus the Red Hat guides), then master strace, tcpdump, perf, and gdb. Systems Performance is the deep reference.