Incident Response
Responding to and cataloging high-severity incidents is a critical skill to develop for any software organization.
Learn Incident Response with the Practica AI Coach
The Practica AI Coach helps you improve in Incident Response by using your current work challenges as learning opportunities. The AI Coach will ask you questions, instruct you on concepts and tactics, and give you feedback as you make progress.
Curated Learning Resources
- Incident Management Handbook: A comprehensive resource from the GitLab team outlining many aspects of incident response, including key roles and responsibilities, runbooks, and tracking and communication.
- If Dr House did DevOps: Differential Diagnosis (DDx) is a useful framework for software engineers responding to incidents. It is based on the process used by Dr. House and his team in the TV series, where they huddle around a whiteboard to list symptoms and possible causes, then prioritize the list of causes. In practice, DDx means ruling out simple, common explanations first, gathering data, listing possible causes, and prioritizing them. Treating symptoms along the way can help uncover the root cause of the incident. DDx supports decision-making in a stressful situation and helps train less-experienced engineers in incident response. It is also more fun to think of incident response as solving a mystery rather than just responding to a snafu.
- Gitlab's Incident Classification System: The GitLab team's system for classifying the severity and urgency of incidents.
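
As a rough illustration of the DDx steps summarized above (list symptoms, list possible causes, rule out what you can, prioritize what remains), here is a minimal Python sketch of a whiteboard as data. It is not taken from the linked article. The `Hypothesis` class, its `likelihood` and `cost_to_check` fields, the example incident, and the "likely and cheap to check first" scoring rule are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One possible cause on the whiteboard (hypothetical structure)."""
    cause: str
    likelihood: float      # rough guess between 0 and 1
    cost_to_check: float   # rough minutes needed to confirm or rule out (> 0)
    ruled_out: bool = False


def prioritize(hypotheses):
    """Return open hypotheses, most likely and cheapest to check first.

    The likelihood / cost ratio is an assumed heuristic, not a rule from DDx.
    """
    open_items = [h for h in hypotheses if not h.ruled_out]
    return sorted(
        open_items,
        key=lambda h: h.likelihood / h.cost_to_check,
        reverse=True,
    )


# Made-up incident: symptoms first, then candidate causes.
symptoms = ["elevated 5xx rate", "p99 latency up"]
board = [
    Hypothesis("recent deploy introduced a regression", 0.5, 5),
    Hypothesis("database connection pool exhausted", 0.3, 10),
    Hypothesis("upstream provider outage", 0.1, 2),
]

# Rule out the simple, common explanation once data contradicts it.
board[2].ruled_out = True

ranked = prioritize(board)
for h in ranked:
    print(h.cause)
```

Re-running `prioritize` after each new piece of data keeps the list of open causes current, which mirrors updating the whiteboard as the investigation proceeds.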