Incident Response
Curated Articles
- Incident Management Handbook
A comprehensive resource from the GitLab team outlining many aspects of incident response, including key roles and responsibilities, runbooks, and tracking and communication.
- If Dr House did DevOps
Differential diagnosis (DDx) is a useful framework for software engineers responding to incidents. It is modeled on the process used by Dr. House and his team in the TV series, where they huddle around a whiteboard, list the symptoms and possible causes, and prioritize the causes. Applied to an incident, this means ruling out simple, common explanations first, gathering data, listing possible causes, and prioritizing that list. DDx can help with decisions in a stressful situation and with training less-experienced engineers in incident response. It is also more fun to think of incident response as solving a mystery than as just responding to a snafu. Treating symptoms along the way can help uncover the root cause of the incident.
- GitLab's Incident Classification System
The GitLab team's system for classifying severity and urgency of incidents.
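
The DDx steps summarized in the Dr House entry above (list symptoms, list possible causes, rule some out, prioritize the rest) can be sketched in a few lines of Python. This is only an illustrative sketch, not anything taken from the article: the `Hypothesis` class, the `prioritize` function, the scoring rule (symptom coverage first, then a rough prior), and every symptom and cause name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A candidate cause on the whiteboard (all values here are made up)."""
    cause: str
    explains: set[str]      # symptoms this cause would account for
    prior: float            # rough guess at how common the cause is, 0..1
    ruled_out: bool = False


def prioritize(observed: set[str], hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    """Rank causes still in play: best symptom coverage first, then most common."""
    def score(h: Hypothesis) -> tuple[float, float]:
        coverage = len(h.explains & observed) / len(observed) if observed else 0.0
        return (coverage, h.prior)

    candidates = [h for h in hypotheses if not h.ruled_out]
    return sorted(candidates, key=score, reverse=True)


# Hypothetical incident: symptoms observed so far.
symptoms = {"5xx errors", "high latency", "db connection timeouts"}

board = [
    Hypothesis("bad deploy", {"5xx errors", "high latency"}, prior=0.5),
    Hypothesis("database pool exhausted",
               {"5xx errors", "high latency", "db connection timeouts"}, prior=0.2),
    Hypothesis("expired TLS certificate", {"5xx errors"}, prior=0.1),
]

# Rule out the simple, common explanation once data contradicts it,
# e.g. no deploy happened in the incident window.
board[0].ruled_out = True

ranked = prioritize(symptoms, board)
for h in ranked:
    print(h.cause)
```

Ranking by coverage before prior is one possible design choice; a team could just as reasonably investigate the cheapest-to-check cause first.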