On-Call Rotation
Curated Articles
- How we designed our Engineering On-Call Process
Sarah covers: • Artsy's motivation for formalizing support • The research they undertook and the goals they set • The initial on-call plan they established
- GitLab's On-Call Rotation Process
GitLab's internal process document for on-call rotations covers: • Expectations for On-Call • On-Call Rotation processes for Customer Emergencies, Reliability Engineering, Security, Development Team, and Quality Team
- Don't follow the sun
Will argues that on-call should not be split into multiple shifts within a 24-hour window, for reasons that include: • Shifts start to turn into service windows, as opposed to exception management • Shift transitions magnify error rate • Humans are too slow, anyway; the focus should be on automation that prevents exceptions
- Is your team’s new engineer ready to take on-call? Use wargames for training
The Qualtrics team shares their experience using wargames to train new employees and stress-test their on-call processes and runbooks.
- Crafting sustainable on-call rotations
Ryn explains: • When to trigger off-hours alerts • How to track how many off-hours alerts occur • How to create and sustain work/life balance