Disaster Recovery Working Group

The Disaster Recovery Working Group will determine what is needed to introduce a disaster recovery mechanism for GitLab.com.


Property Value
Date Restarted August 1, 2022
Date Created November 11, 2020
End Date TBD
Slack #wg_disaster-recovery (only accessible from within the company)
Google Doc Working Group Agenda (only accessible from within the company)
Issue Board Working Group Issue Board
Epic Link
Overview & Status Main Epic, Internal Handbook (more specific)


The Disaster Recovery Working Group will determine the work needed to improve the disaster recovery mechanism for GitLab SaaS products. This effort is necessary to build reliable and predictable disaster recovery at the largest scale while leveraging existing tools.

Scope and Definitions

In the context of this working group:

  1. Recovery Point Objective (RPO): the maximum length of time over which data might be lost due to an incident.
  2. Recovery Time Objective (RTO): the maximum length of time that a service may be unavailable due to an incident.
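
To make the two definitions concrete, the following is a minimal sketch of how an incident could be checked against an RPO and an RTO. It is an illustration only, not part of the working group's tooling: the `Incident` type, its field names, the function names, the timestamps, and the objective values passed in are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one incident (all field names are assumptions)."""
    last_recoverable_point: datetime  # newest data that survived, e.g. last good snapshot
    outage_start: datetime            # when the service became unavailable
    service_restored: datetime        # when the service was available again


def data_loss_window(incident: Incident) -> timedelta:
    # Data written after the last recoverable point and before the outage is lost.
    return incident.outage_start - incident.last_recoverable_point


def downtime(incident: Incident) -> timedelta:
    # Time during which the service was unavailable.
    return incident.service_restored - incident.outage_start


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> dict:
    # RPO bounds the data loss window; RTO bounds the downtime.
    return {
        "rpo_met": data_loss_window(incident) <= rpo,
        "rto_met": downtime(incident) <= rto,
    }


# Example values chosen purely for illustration.
incident = Incident(
    last_recoverable_point=datetime(2023, 1, 1, 11, 30),
    outage_start=datetime(2023, 1, 1, 12, 0),
    service_restored=datetime(2023, 1, 1, 15, 0),
)
result = meets_objectives(incident, rpo=timedelta(hours=1), rto=timedelta(hours=4))
print(result)
```

In this example the data loss window is 30 minutes and the downtime is 3 hours, so both of the assumed objectives are met.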

Exit criteria

The exit criteria and target goals for the working group are defined in the internal handbook.

Sequence Order Of Deliverables and Exit Criteria


  1. Complete an assessment of a zonal outage and identify the next iterations towards the 4-hour recovery goal (Epic: gitlab.com&1900). DRI: John Jarvis
  2. Improve node snapshot capabilities. DRI: John Jarvis
  3. Define a medium- to long-term strategy for DR capabilities for GitLab Dedicated and Cells via Geo. DRI: Sampath Ranasinghe


Roles and Responsibilities

Working Group Role Person Title
Executive Stakeholder Jörg Heilig CTO
Facilitator/DRI Andras Horvath Engineering Manager, Gitaly
Product Management DRI Mark Wood Senior Product Manager, Gitaly
Member Gerardo Lopez-Fernandez Engineering Fellow, Infrastructure
Member Chun Du Director of Engineering, Enablement
Member Juan Silva Fullstack Engineering Manager, Geo
Member Sampath Ranasinghe Senior Product Manager, Geo
Member John Jarvis Staff SRE, Infrastructure
Member Michele Bursi Engineering Manager, Delivery
Member Sami Hiltunen Senior Backend Engineer, Gitaly
Member Joshua Lambert Director of Product Management, Enablement
Member Steve Azzopardi Staff SRE, Infrastructure
Member Fabian Zimmer Director of Product Management, SaaS Platforms
Member Nick Westbury Senior Software Engineer in Test, Geo
Member Sean Carroll Engineering Manager, Source Code
Last modified September 1, 2023: Mark all active working groups (e749da39)