Data Access Sub-department

Vision

The Data Access sub-department is responsible for the sustainability and availability of access to GitLab’s user data, in alignment with customer needs and GitLab’s business objectives.

The scope of user data includes Git, PostgreSQL, ClickHouse, Redis, Object Storage and the development of a scalable backup system for all GitLab deployments.

For all GitLab deployments:

We design, operate and evolve GitLab’s data storage architecture and interfaces, or provide assistance to those responsible.
We guide feature owners in reaching business goals safely, throughout the feature life cycle.
We aid customers directly in incidents or escalations, and indirectly by innovating to meet their needs.

It is the job of each Data Access team to hold feature owners accountable for responsible access patterns and to thereby ensure the stability of our shared data storage systems. This is an active process and requires building relationships for collaboration, guiding through paved paths, and providing tools and knowledge Team Members can use and build on.

About sustainability and availability

Here, sustainability means long-term maintainability, efficiency, and scalability of our data storage systems and the software architecture that uses them. Good sustainability requires good up-front planning as well as continued adaptation as features, business goals and infrastructure evolve. Effects of new additions and changes must be considered in the context of the entire GitLab application and the storage infrastructure.

Availability means that critical user journeys continuously provide a great user experience, during normal operations as well as during state transitions such as migrations or upgrades. We must design our architecture, processes and tools such that they minimize interruptions and quality degradations.

Achieving the vision

What we do

Own and drive GitLab’s relevant sustainability goals end to end, holding each other and feature owners accountable.
Measure key metrics of data scalability and access patterns in our services, and track how they change and how close they are to breaking points.
Publish tools to attribute usage of shared resources to their sources (for example, tie growth of a metric or a database query to a given product feature).
Define the “paved paths” (good patterns to follow when storing and accessing data) through documentation, consultation, processes, and frameworks.
Actively detect and prevent non-scalable patterns from entering GitLab as early in the development cycle as possible, through processes and automated tooling.
Drive the work to make existing patterns sustainable, as they are discovered.
Innovate, test, deploy, and migrate to infrastructure changes that contribute to long-term sustainability.
Build defense-in-depth technologies to keep storage services available (such as load shedding and request isolation).
Collaborate with other Data Access teams closely. Share knowledge, ideas, concepts and best practices to foster innovation, and deliver consistent solutions to customers.
Measure the impact of our actions, set targets, and report on progress.
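
As an illustration of the load shedding mentioned in the list above, here is a minimal sketch, not a description of any existing GitLab component. The names `LoadShedder`, `Overloaded`, and `max_in_flight` are hypothetical and chosen for this example only. The idea it shows: when a storage service is already handling as many requests as it can sustain, new requests are rejected immediately instead of being queued, so the service stays responsive for the work it has already accepted.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class Overloaded(RuntimeError):
    """Raised when a request is shed instead of being queued."""


class LoadShedder:
    """Caps concurrent requests; excess requests fail fast (hypothetical helper)."""

    def __init__(self, max_in_flight: int) -> None:
        self.max_in_flight = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    @contextmanager
    def admit(self) -> Iterator[None]:
        # Decide admission under the lock so the counter stays consistent.
        with self._lock:
            if self._in_flight >= self.max_in_flight:
                raise Overloaded("too many requests in flight")
            self._in_flight += 1
        try:
            yield
        finally:
            # Always release the slot, even if the request itself failed.
            with self._lock:
                self._in_flight -= 1
```

A caller would wrap each request in `with shedder.admit(): ...` and translate `Overloaded` into a retryable error for the client. Keeping one `LoadShedder` per feature, rather than one shared instance, is one simple way to approximate request isolation.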

FY26 goals

(in alignment with GitLab’s product principles and the [INTERNAL] Three year (FY26-FY28) - Platforms strategy)

Identify historical storage architecture issues and create a mitigation plan/roadmap (epic).
Establish a framework (automation, processes, information) to ensure scalability of new launches (epic).

Principles for launches

Below is a non-exhaustive list of considerations from the perspective of new features. The responsibility to exercise good judgement remains with the domain experts AND the feature owners. (This description uses terminology from RFC 2119.)

Each of these points MUST be considered for all GitLab installation types (Cells, Dedicated, SaaS, and Self-Managed):

Growth of a feature over time MUST NOT endanger the service as a whole.
Failure of a feature MUST NOT endanger the service as a whole.
Safeguards SHOULD be architectural failsafes (isolation, the circuit breaker pattern, etc.), not reactive mechanisms.
The critical path of operating a feature MUST be fully automated. (For example, humans watching graphs and reacting is not allowed.)
Specific observability (monitoring and alerting) MUST be in place, to pinpoint and attribute sources of load and growth.
Data ownership plans MUST include the entire lifecycle of the data, including:
1. Backup and restoration plans
2. Data retention policies that are tied to business goals and consider all potential legal implications (for example, those concerning personally identifiable information, PII)
3. Replication plans (Geo)
4. Cost analysis
5. Compatibility with existing data management features, like export and import
Data for any user-facing feature MUST reside in, and be accessed through, a data store corresponding to the feature maturity.
1. For example, a General Availability launch REQUIRES data to be stored in, and accessed through, a production-quality Infrastructure-owned data store.
Changes MUST have documented rollout plans.
The points above MUST be reconsidered each time a feature experiences a lifecycle change (launch, significant growth, change in maturity or scope, sunsetting) before the lifecycle change can take place. The responsibility to revisit belongs to feature owners, aided by Data Access experts.
Exceptions to any of the above MUST be thoroughly and permanently documented with risk assessment and business considerations, and approved by the Senior Manager, Data Access, or above, and the appropriate Product counterpart(s).
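
To make the circuit breaker pattern named in the principles above concrete, the following is a minimal sketch under stated assumptions, not an implementation used by GitLab. The class `CircuitBreaker`, the exception `CircuitOpenError`, and the parameters `max_failures` and `reset_after` are hypothetical names introduced for illustration. The sketch shows why this counts as an architectural failsafe: after repeated failures, calls to the failing dependency are rejected automatically for a cool-down period, without a human watching graphs.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying the backend."""


class CircuitBreaker:
    """Hypothetical breaker: opens after repeated failures, retries after a cool-down."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast and keep load off the struggling dependency.
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A feature would route every call to a shared data store through `breaker.call(...)` and serve a degraded response when `CircuitOpenError` is raised, so that a failure in one feature does not cascade into the service as a whole.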

All Team Members

The following people are permanent members of teams that belong to the Data Access Sub-department:

Database Framework

The Database Framework team develops solutions for scalability, application performance, data growth and developer enablement especially where it concerns interactions with the database.

Name	Role
Alex Ives	Backend Engineering Manager, Database
Backend Engineer	Backend Engineer, Database
Jon Jenkins	Senior Backend Engineer, Database
Krasimir Angelov	Staff Backend Engineer, Database
Leonardo da Rosa	Backend Engineer, Database
Matt Kasa	Staff Backend Engineer, Database
Maxime Orefice	Senior Backend Engineer, Database
Prabakaran Murugesan	Senior Backend Engineer, Database
Simon Tomlinson	Staff Backend Engineer, Database

Database Operations

The Database Operations team builds, runs, and owns the entire lifecycle of the PostgreSQL database engine for GitLab.com.

Name	Role
Rick Mar	Engineering Manager, Database Reliability
Alexander Sosna	Senior Database Reliability Engineer
Ben Prescott	Site Reliability Engineer (Database Operations)
Biren Shah	Senior Database Reliability Engineer
Jon Sisson	Senior Site Reliability Engineer
Rafael Henchen	Senior Database Reliability Engineer

Durability

The Durability team is dedicated to safeguarding and securing customer data that is stored by the GitLab application, and to setting guidelines for data access. We strive to build and maintain resilient infrastructure and improve the management of Redis, Sidekiq, and Gitaly.

Name	Role
John 'Jarv' Jarvis	Staff Site Reliability Engineer
Ahmad Sherif	Senior Site Reliability Engineer
Furhan Shabir	Senior Site Reliability Engineer
Gabriel Mazetto	Senior Backend Engineer
Ian Baum	Senior Backend Engineer
Kyle Yetter	Senior Backend Engineer
Gregorius Marco	Backend Engineer, Scalability
Matt Smiley	Staff Site Reliability Engineer, Scalability
Pravar Gauba	Site Reliability Engineer
Raynard Omongbale	Site Reliability Engineer

Gitaly

The Gitaly team builds and maintains systems to ensure Git data of GitLab instances, and GitLab.com in particular, is reliable, secure and fast.

Name	Role
John Cai	Engineering Manager, Gitaly
Divya Rani	Backend Engineer, Gitaly
Emily Chui	Senior Backend Engineer, Gitaly
Eric Ju	Senior Backend Engineer, Gitaly
James Liu	Senior Backend Engineer, Gitaly
Mustafa Bayar	Backend Engineer, Gitaly
Olivier Campeau	Backend Engineer, Gitaly
Quang-Minh Nguyen	Staff Backend Engineer, Gitaly and Tenant Scale
Sami Hiltunen	Staff Backend Engineer, Gitaly

Git

The Git team develops Git in accordance with the goals of the community and GitLab, and integrates it into our products.

Name	Role
Christian Couder	Staff Backend Engineer, Git
Justin Tobler	Senior Backend Engineer, Git
Karthik Nayak	Senior Backend Engineer, Git
Patrick Steinhardt	Acting Engineering Manager, Git
Toon Claes	Senior Backend Engineer, Git