Disaster Recovery Working Group
The Disaster Recovery Working Group will determine what is needed to introduce a disaster recovery mechanism for GitLab.com.
Attributes
Property | Value |
---|---|
Date Restarted | August 1, 2022 |
Date Created | November 11, 2020 |
End Date | TBD |
Slack | #wg_disaster-recovery (only accessible from within the company) |
Google Doc | Working Group Agenda (only accessible from within the company) |
Issue Board | Working Group Issue Board |
Epic | Link |
Overview & Status | Main Epic, Internal Handbook (more specific) |
Charter
The Disaster Recovery Working Group will determine the work needed to improve the disaster recovery mechanism for GitLab SaaS Products, and the effort necessary to build reliable and predictable disaster recovery at the largest scale, leveraging existing tools.
Scope and Definitions
In the context of this working group:
- Recovery Point Objective (RPO): the maximum span of time for which data might be lost due to an incident.
- Recovery Time Objective (RTO): the maximum span of time for which a service may be unavailable due to an incident.
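To make the two definitions concrete, the following is a minimal sketch of how an incident could be checked against both objectives. The function names, timestamps, and the 1-hour RPO are hypothetical and used only for illustration. The 4-hour RTO assumes that the "4 hour recovery goal" mentioned under the planned deliverables is a recovery time target. The actual targets are defined in the internal handbook.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recoverable_point: datetime, incident_start: datetime) -> timedelta:
    """Time between the last recoverable copy of the data and the incident.

    Data written in this window is lost; compare the result against the RPO.
    """
    return incident_start - last_recoverable_point


def downtime(incident_start: datetime, service_restored: datetime) -> timedelta:
    """Time the service was unavailable; compare the result against the RTO."""
    return service_restored - incident_start


def meets_objective(observed: timedelta, objective: timedelta) -> bool:
    """An objective is met when the observed duration does not exceed it."""
    return observed <= objective


# Hypothetical incident timeline (illustrative values only).
last_snapshot = datetime(2023, 1, 1, 11, 30)
incident = datetime(2023, 1, 1, 12, 0)
restored = datetime(2023, 1, 1, 15, 0)

# Assumed objectives: the 1-hour RPO is invented for this example.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

loss = data_loss_window(last_snapshot, incident)
outage = downtime(incident, restored)

print(f"Data loss window: {loss}, within RPO: {meets_objective(loss, rpo)}")
print(f"Downtime: {outage}, within RTO: {meets_objective(outage, rto)}")
```

In this sketch, RPO is measured backwards from the incident (how stale the last recoverable copy is), while RTO is measured forwards from it (how long recovery takes).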
Exit criteria
The exit criteria and target goals for the working group are defined in the internal handbook.
Sequence Order Of Deliverables and Exit Criteria
Planned:
- Complete an assessment of a zonal outage and identify next-step iterations towards the 4-hour recovery goal (Epic: gitlab.com&1900). DRI: John Jarvis
- Improve node snapshot capabilities. DRI: John Jarvis
- Define a medium- to long-term strategy for DR capabilities for GitLab Dedicated and Cells via Geo. DRI: Sampath Ranasinghe
Completed:
- Create and update a single handbook page, and deprecate resources in other locations. DRI: Fabian Zimmer
- Define and clarify the FY24 recovery goals. DRI: Steve Loyd
Roles and Responsibilities
Working Group Role | Person | Title |
---|---|---|
Executive Stakeholder | Jörg Heilig | CTO |
Facilitator/DRI | Andras Horvath | Engineering Manager, Gitaly |
Product Management DRI | Mark Wood | Senior Product Manager, Gitaly |
Member | Gerardo Lopez-Fernandez | Engineering Fellow, Infrastructure |
Member | Chun Du | Director of Engineering, Enablement |
Member | Juan Silva | Fullstack Engineering Manager, Geo |
Member | Sampath Ranasinghe | Senior Product Manager, Geo |
Member | John Jarvis | Staff SRE, Infrastructure |
Member | Michele Bursi | Engineering Manager, Delivery |
Member | Sami Hiltunen | Senior Backend Engineer, Gitaly |
Member | Joshua Lambert | Director of Product Management, Enablement |
Member | Steve Azzopardi | Staff SRE, Infrastructure |
Member | Fabian Zimmer | Director of Product Management, SaaS Platforms |
Member | Nick Westbury | Senior Software Engineer in Test, Geo |
Member | Sean Carroll | Engineering Manager, Source Code |
Related Links
Last modified September 1, 2023: Mark all active working groups (e749da39)