Measuring Success
How do we know if the Tier 2 on-call program is working? We measure it through specific metrics that reflect both operational excellence and engineer well-being. This page explains what we track and why it matters.
Core Success Metrics
- Reduction in time to resolve: The primary purpose of expanding to Tier 2 is to give on-call engineers access to subject matter expertise so that incidents are resolved faster. Time to resolve is therefore the primary metric for Tier 2's contribution to our overall incident response.
- Escalation accuracy: 90%+ of escalations go to the correct team on the first try, thanks to the usability of our error messages, stack traces, observability categorization, etc.
- Zero pages to Tier 2: The resiliency of our system and/or the effectiveness of our runbooks should make paging Tier 2 unnecessary.
- No escalations past Tier 2: Tier 2 always responds in under 15 minutes, so incidents never need to escalate further.
- Sustainable on-call schedules: Engineers are on call no more than 1 week per month.
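As a rough illustration, the metrics above could be computed from incident and schedule records along the following lines. This is only a sketch: the `Incident` fields, the `summarize` and `overloaded_engineers` helpers, and the shape of the shift data are hypothetical and do not describe an existing tool or schema. The 15-minute and 90% thresholds come from the targets listed above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

ACK_TARGET_MINUTES = 15   # Tier 2 response target from the list above
ACCURACY_TARGET = 0.90    # escalation accuracy target from the list above


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    opened_at: datetime
    resolved_at: datetime
    paged_tier2: bool = False
    first_escalation_correct: Optional[bool] = None  # None if never escalated
    tier2_ack_minutes: Optional[float] = None        # None if Tier 2 was not paged
    escalated_past_tier2: bool = False


def summarize(incidents):
    """Return the incident-based metrics for a reporting period."""
    minutes_to_resolve = [
        (i.resolved_at - i.opened_at).total_seconds() / 60 for i in incidents
    ]
    escalated = [i for i in incidents if i.first_escalation_correct is not None]
    acks = [i.tier2_ack_minutes for i in incidents if i.tier2_ack_minutes is not None]
    accuracy = (
        sum(i.first_escalation_correct for i in escalated) / len(escalated)
        if escalated else None
    )
    return {
        "median_minutes_to_resolve": median(minutes_to_resolve) if minutes_to_resolve else None,
        "escalation_accuracy": accuracy,
        "accuracy_target_met": accuracy is None or accuracy >= ACCURACY_TARGET,
        "tier2_pages": sum(i.paged_tier2 for i in incidents),
        # Responses at or beyond the target count as late.
        "late_tier2_acks": sum(a >= ACK_TARGET_MINUTES for a in acks),
        "escalations_past_tier2": sum(i.escalated_past_tier2 for i in incidents),
    }


def overloaded_engineers(shifts, max_weeks_per_month=1):
    """shifts: iterable of (engineer, 'YYYY-MM') pairs, one pair per on-call week."""
    counts = Counter(shifts)
    return sorted({eng for (eng, _month), n in counts.items() if n > max_weeks_per_month})
```

Using the median rather than the mean for time to resolve is a choice made for this sketch, so that a single long incident does not dominate the trend; the program itself does not specify which statistic to use.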
Related Pages
- DevOps Rotation Leader — Rotation leaders track these metrics
- Communication and Culture — Blameless culture supports these goals
- Joining and Leaving the Rotation — Understand fairness metrics in your rotation
Last modified November 17, 2025: Condense Tier 2 measurements to Top 5 (188cd333)
