Database Framework Group
Vision
Developing solutions for scalability, application performance, data growth, and
developer enablement, especially where it concerns interactions with the
database.
Mission
Focusing on the database, our mission is to provide solutions that allow us to
scale to our customers’ demands, to provide tooling that proactively identifies
performance bottlenecks early in the development lifecycle, and to increase the
number of database maintainers while sharing database best practices with
community contributors and development teams within GitLab.
Team Members
The following people are permanent members of the Database Team:
Stable Counterparts
The following members of other functional teams are our stable counterparts:
Stable Counterparts to other teams
The Database Group is often called upon to provide consulting to other groups.
To more efficiently support these requests, we have created this stable counterparts table.
Meetings
Whenever possible, we prefer to communicate asynchronously using issues, merge
requests, and Slack. However, face-to-face meetings are useful to establish a
personal connection and to address items that are more efficiently discussed
synchronously, such as blockers.
- Database Group Sync every Tuesday and Thursday at 1:00 PM UTC
  - Tuesdays - we start with any ~infradev issues requiring reviews, then we focus on weekly priorities.
  - Thursdays - are optional, with an open agenda. Anyone can bring topics to the team to discuss. Typically we reserve the first Thursday after the milestone closes to hold a synchronous retrospective.
- Database Office Hours (internal link); YouTube recordings
  - Wednesdays, 3:30pm UTC (bi-weekly)
  - (APAC) Thursdays, 3:30am UTC (bi-weekly, alternating)
Work
We follow the GitLab engineering workflow
guidelines. To bring an issue to our attention, please create an issue in the
relevant project. Add the ~"group::database"
label along with any other
relevant labels. If it is an urgent issue, please reach out to the Product
Manager or Engineering Manager listed in the Stable Counterparts
section above.
What we do
The team is responsible for PostgreSQL application interactions, enabling
high-performance queries while offering features that support scalability and
strengthen availability. PostgreSQL is the heart of the Rails application, and
there is no shortage of work to make GitLab more performant, scalable, and
highly available from a database perspective. Some of the current priorities
include implementing partitioning to improve query performance and creating
tooling to enable development teams to implement their own partitioning
strategies more easily. We are also working on tools that will help developers
“shift left” in their migration testing prior to deployment. We are always
looking for ways to continuously care for the performance of our database and
improve our developer documentation. For more in-depth details of what we are
working on, please review our Roadmap section below.
In order to follow what the database group is currently working on, we recommend
watching our group’s kickoff presentations for new milestones
and the respective milestone planning issues.
Activity Log
Since the end of 2021, we have maintained an activity log to keep
track of past projects and outcomes.
Planning
We use a planning issue
to discuss priorities and commitments for the milestone. This happens largely
asynchronously, but when we do need to discuss synchronously we discuss during
the Tuesday team meeting timeslot.
Issue Weights
The database group is experimenting with using expected merge request count as
an issue weight. Before each milestone starts, we’ll ping each assigned issue
without a weight and ask folks to add weights to them.
We decided to use merge request count as an issue weight for a few reasons:
- The process encourages folks to consider ahead of time how an issue could be
broken down and to enumerate the pieces in advance
- It’s easy to describe and learn, making it easier for the team to come to a
shared understanding
- Merge request rate is one of the main ways our team is measured
Process for weighting Issues
- With an emphasis on smaller, more iterative changes rather than large changes that may take longer to review and merge, consider how many merge requests this could be broken into.
- Add a comment enumerating the expected merge requests. For example:
  - Just one merge request to documentation
  - One to gitlab for database changes, one for new functionality, one for documentation changes, and one to omnibus
- Add the count as a weight. For example, if you think there could be one to gitlab for database changes, one for new functionality, one for documentation changes, and one to omnibus, you would assign /weight 4.
Timeline for implementation
- 15.4 - 15.7: We’ll ping each issue in the milestone without a weight and ask folks to add one to collect data
- 15.8+: TBD
Triage rotation
We have a fairly simple triage rotation. Each week one team member is dedicated
to triaging incoming issues for the database group. This allows for the rest of
the team to focus on their current priorities with fewer interruptions. Each
week, a bot will file an issue that gets automatically assigned to the next team
member in the rotation. We order the triage rotation alphabetically by first
name to keep it simple. If a team member is on PTO the week they are
assigned, the issue will be re-assigned to the next person.
Issues needing triage can come in through many different paths. Some common
areas to monitor while on triage:
- Newer issues (< 7 days old) with the ~database label that are not assigned to a group. Example search
- Newer issues that were assigned ~group::database but do not have a throughput label or the ~database::triage label. Example search
- Newer issues that were assigned ~database::triage and have not previously been reviewed
- When we get pinged in the #g_database Slack channel for assistance
When the triage team member discovers an issue requiring team attention, some of
the possible outcomes are:
- Directly address the issue if it is a simple fix
- Direct it to our customer support counterparts as appropriate
- Add the ~database::triage label and review during the team sync meeting
- Add a milestone and ping the manager, or label the issue ~workflow::scheduling
- Close as a duplicate and link to the duplicate issue
The goal is to keep the number of issues for triage low and manageable.
Tip: In order to remove closed issues from the triage board, use this search
and edit multiple issues at once to remove the ~database::triage
label.
Boards
Database by Milestone
The Milestone board gives us a “big picture” view of issues planned in each
milestone.
Database: Build
The build board gives you an overview of the current state of work for
group::database. These issues have already gone through validation and are on
the Product Development Build Track. Issues are added to this board by adding
the current active milestone and group::database labels. Issues in the
workflow::ready for development column are ordered by priority (top down). Team
members use this column to select the next item to work on.
Database: Validation
The validation board is a queue for incoming issues for the Product Manager to
review. A common scenario for the Database Team validation board is when an
issue is created that requires further definition before it can be prioritized.
The issue typically states a big picture idea but is not yet detailed enough to
take action. The Database Team will then go through a refinement process to
break down the issue into actionable steps, create exit criteria and prioritize
against ongoing efforts. If an issue becomes too large, it will be promoted to
an epic and small sub-issues will be created.
Database: Triage
The triage board is for incoming issues that require further investigation
regarding team assignment, prioritization, duplication of existing issues, and so on. Within the
Database Group we have implemented a weekly triage rotation where one team
member is responsible for monitoring this board for timely responses.
Say/Do Ratio
We use the ~Deliverable label to track our Say/Do ratio. At the beginning of
each milestone, during a Database Group Weekly meeting, we review the issues and
determine those we are confident we can deliver within the milestone. These
issues are marked with the ~Deliverable label. At the end of the milestone, the
successfully completed issues with the ~Deliverable label are tracked in two
places. We have a dashboard in Tableau that calculates how many were delivered
within the milestone and accounts for issues that were moved. Additionally, our
milestone retro issue lists all of the ~Deliverable issues shipped along with
those that missed the milestone.
Roadmap
The Database Group
Roadmap
gives a view of what is currently in flight as well as projects that have been
prioritized for the next 3+ months.
Weekly Team Updates
The enablement section uses status issues to provide regular status updates.
Each week, the team’s engineering manager posts general announcements, and
members of the team post updates on their in-progress projects.
These issues can be found here (internal).
Documentation
We document our insights, road maps and other relevant material in this section.
- Database Lexicon - terms and definitions relating to our Database
- Database Strategy: Guidance for proposed database changes
- On table partitioning (February 2020)
- Postgres: Sharding with foreign data wrappers and partitioning
- Sharding GitLab by top-level namespace
- Sharding with CitusDB (April 2020)
- Table partitioning: Issue group search as an example (March 2020)
- Working with the GitLab.com database for developers
- Database schema proposals for Container Registry (September 2020)
- Workload analysis for GitLab.com (October 2020)
- Multi-database Background migrations
(October 2021)
- Enablement::Database - Performance Indicators Dashboard
- Average Query Apdex for GitLab.com
Common Links
Dashboards
This page is meant to track the discussion of different database design approaches for the Container Registry.
Background and reading material
Deduplication ratios
A Docker manifest describes how a Docker image is composed of multiple layers. A manifest can be identified by the layers it references, and as such it can be thought of as unique throughout the registry. Multiple repositories can reference the same manifest.
This is a placeholder document for our upcoming training article on query plans.
Overview
This page captures the database group’s activity and documents the outcomes, key results and some takeaways (most recent first). We’ve started doing this towards the end of 2021.
2022
March 2022
Starting in November 2021, we performed a migration file cleanup with these goals:
- Improve the performance of GitLab CI jobs.
- Remove maintenance cost for old migrations.
- Improve initialization speed for new GitLab instances.
We decided to remove all of the migration files from before version 14. We implemented a script that squashed multiple migration files into one file (init_schema).
Overview
A central tenet of the Core Platform department is to enable other teams to be more self-sufficient in areas such as, in this case, the database. The Database Group is here to provide a helping hand in areas such as database design, query efficiency, database tooling and more. In order to be more efficient and to avoid diffusion of responsibility when providing database expertise to other groups, we are providing a list of database stable counterparts and which area of expertise is owned by each team member.
This is not a comprehensive list of all of the commonly used terms, but rather it is a list of terms that are commonly confused or conflated with other terms. In each section we will identify common phrases, define our specific usage and list external references for the term in question.
Introducing PostgreSQL table partitioning to GitLab’s database
This is a working document to discuss how we are going to introduce table partitioning to GitLab.
Motivation
The PostgreSQL database we run for GitLab.com has grown to over 5 TB in total size as of early 2020. However, the total database size is not the driver for introducing partitioning; the size of individual tables is:
We can see that there are individual tables larger than 100 GB, with some even reaching into the terabyte range.
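For readers less familiar with the mechanism, here is a minimal sketch of PostgreSQL declarative range partitioning. The table and column names are illustrative only and do not reflect the actual GitLab schema.

```sql
-- Minimal sketch of declarative range partitioning (PostgreSQL 11+).
-- Table and column names are illustrative, not the real GitLab schema.
CREATE TABLE events (
    id         bigserial,
    created_at timestamptz NOT NULL,
    payload    jsonb,
    PRIMARY KEY (id, created_at)   -- the primary key must include the partition key
) PARTITION BY RANGE (created_at);

-- Each partition holds one month of data and behaves like a regular table.
CREATE TABLE events_2020_01 PARTITION OF events
    FOR VALUES FROM ('2020-01-01') TO ('2020-02-01');
CREATE TABLE events_2020_02 PARTITION OF events
    FOR VALUES FROM ('2020-02-01') TO ('2020-03-01');

-- Queries that filter on the partitioning key only touch the relevant
-- partitions (partition pruning), keeping index and scan sizes small.
SELECT count(*)
FROM events
WHERE created_at >= '2020-01-01' AND created_at < '2020-02-01';
```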
Database Strategy: Guidance for proposed database changes
GitLab is offered as a Single Application with a Single data-store. This handbook entry is meant as guidance for when you encounter a situation where you are considering changes or additions to our data-store architecture. For information on tooling, migrations, debugging and best practices please read the Database guides section in GitLab Docs.
Requirement
When you propose any database additions, updates, or deletions, you are required to have participated in a Database Review prior to deployment (ideally early in development).
This is a placeholder document for our upcoming training article on batched background migrations.
This is a placeholder document for our upcoming training article on database review.
Background migration design for multiple databases
This is a working document to specify the design of background migration support for multiple databases.
Motivation
At GitLab, we rely heavily on background processing when migrating large volumes of data. This is not only important for typical data fixes related to application logic, but also as an underpinning for future database objectives, such as partitioning and schema redesign.
At the same time, we are decomposing the GitLab application database into multiple databases to scale GitLab. This effort has an impact on the entire application, requiring substantial changes to implement the desired design.
Database Partitioning: Issue group search
We have motivated database partitioning by looking at a specific example: Group-based issue search.
This type of search allows users to find specific issues within a GitLab group (example for the gitlab-org group). We can apply all kinds of filters to the search, for example filtering by milestone or author, or applying a free-text search.
This document summarizes findings from exploring database partitioning for issue group search on a production-size dataset with PostgreSQL 11.
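To make the example concrete, the hypothetical query below sketches the general shape of a group-scoped issue search at the SQL level; the real GitLab queries, schema, and filters are considerably more involved.

```sql
-- Simplified, hypothetical shape of a group-scoped issue search.
-- Real GitLab queries resolve the group hierarchy and filters differently.
SELECT issues.*
FROM issues
JOIN projects ON projects.id = issues.project_id
WHERE projects.namespace_id IN (1, 2, 3)      -- projects in the group hierarchy
  AND issues.milestone_id = 42                -- optional filter: milestone
  AND issues.author_id = 7                    -- optional filter: author
  AND issues.title ILIKE '%flaky test%'       -- optional filter: free-text (simplified)
ORDER BY issues.created_at DESC
LIMIT 20;
```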
PostgreSQL 11 sharding with foreign data wrappers and partitioning
This document captures our exploratory testing around using foreign data wrappers in combination with partitioning. The idea is to implement partitions as foreign tables and have other PostgreSQL clusters act as shards and hold a subset of the data.
Background
With PostgreSQL 11 declarative partitioning, we can slice tables horizontally. That is, we keep working with the top-level table as usual, but underneath we organize the data in multiple partitions. Those can be thought of as regular tables that are attached to the top-level table (much like table inheritance in PostgreSQL).
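As a rough illustration of the idea (not our production setup), the sketch below combines postgres_fdw with hash partitioning so that one partition is a foreign table whose data lives on another PostgreSQL cluster. Server, table, and credential names are placeholders, and a matching table would still need to exist on the remote shard.

```sql
-- Sketch only: one partition of a hash-partitioned table is a foreign table
-- backed by another PostgreSQL cluster acting as a shard. All names are placeholders.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER shard_1
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard-1.internal', dbname 'gitlabhq_shard_1');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER shard_1
    OPTIONS (user 'gitlab', password 'secret');

CREATE TABLE issues_sharded (
    id        bigint NOT NULL,
    shard_key int    NOT NULL
) PARTITION BY HASH (shard_key);

-- Rows falling into this hash bucket are read from and written to the remote shard.
CREATE FOREIGN TABLE issues_sharded_p0
    PARTITION OF issues_sharded
    FOR VALUES WITH (MODULUS 4, REMAINDER 0)
    SERVER shard_1;
```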
PostgreSQL yearly upgrade cadence
Starting with GitLab 16.0, we follow a yearly upgrade cadence for PostgreSQL:
- With every major GitLab version, we increase the minimum required PostgreSQL version to the next major version. Some examples:
  - In GitLab 17.0, PostgreSQL 14 will become the minimum supported PostgreSQL version.
  - In GitLab 18.0, PostgreSQL 16 will become the minimum supported PostgreSQL version.
- We announce the deprecation of the current minimum PostgreSQL version one year in advance, with the release of each major version of GitLab.
Sharding GitLab by top-level namespace
This document summarizes the idea of sharding GitLab by top-level namespace. It is meant to provide an idea of how product features would have to change with namespace sharding and to highlight the difficulties and complexities we anticipate with this approach, as well as touching on implementation details and known unknowns.
Nomenclature - what is a namespace?
Let’s start off by defining common nomenclature.
Sharding GitLab with CitusDB
This is a working document to outline the decision making process with respect to using CitusDB as a database sharding solution for GitLab on GitLab.com.
We evaluated the Citus Community offering as part of our efforts to explore CitusDB as a sharding solution. Citus Community is licensed under the GNU Affero General Public License v3.0 (GNU AGPLv3). GNU AGPLv3 is listed in our handbook as an unacceptable license requiring legal approval for use.
This is a placeholder article on how indexes impact performance on GitLab.com.
GitLab.com is powered by a large PostgreSQL database (“the database” in this doc) which is often used as a point of reference in terms of scale - after all, this is the largest installation of GitLab we have access to.
From a development perspective, it is often necessary to gather statistics and other insights from this database - for example to provide insights for query optimization during database review or when we need more insight into data distribution to inform a product or development decision.
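For illustration only, the queries below show the kind of statistics gathering this refers to, using standard PostgreSQL views; the table and column names are examples rather than a prescribed workflow.

```sql
-- Size and churn per table, useful context during database review:
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC
LIMIT 10;

-- Data distribution of a single column, to inform indexing or partitioning
-- choices (table and column names are examples):
SELECT attname, n_distinct, null_frac, most_common_vals
FROM pg_stats
WHERE tablename = 'issues' AND attname = 'state_id';
```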
Workload Analysis for GitLab.com
This document discusses several approaches to understand the database workload for GitLab.com better. It aims to provide a few more perspectives on database workload, in addition to already existing monitoring solutions.
Index bloat
In previous studies, we’ve established that GitLab.com suffers significantly from bloat in btree indexes. That is, some indexes tend to grow well beyond their ideal size over time: they take up more space than needed and become less efficient. The ideal size for an index is its most compact representation, which is the case when the index is freshly built; in many cases, regular updates cannot maintain this compact representation over time.
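For illustration (this is not a description of our actual tooling), the statements below show one way to measure how compact a btree index is and to rebuild it without blocking writes. The index name is an example, and REINDEX ... CONCURRENTLY requires PostgreSQL 12 or later.

```sql
-- Estimate how densely packed the leaf pages of an index are
-- (requires the pgstattuple extension; the index name is an example):
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('index_issues_on_project_id');

-- Rebuild a bloated index without blocking writes (PostgreSQL 12+):
REINDEX INDEX CONCURRENTLY index_issues_on_project_id;
```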