Working with the GitLab.com database for developers
A developer’s guide to working with the GitLab.com database
GitLab.com is powered by a large PostgreSQL database (“the database” in this doc) which is often used as a point of reference in terms of scale - after all, this is the largest installation of GitLab we have access to.
From a development perspective, it is often necessary to gather statistics and other insights from this database - for example to provide insights for query optimization during database review or when we need more insight into data distribution to inform a product or development decision.
This overview is aimed at developers wanting to access the database and also explains how to get this type of access in the first place.
Types of access
There are a couple of ways for developers to directly access the database:
Direct access to production
Direct database access to a production replica
This is the most straightforward type of access: You’ll have access to a psql console on an actual replica of the database (which typically serves live traffic). The console will be read-only and a 15s statement timeout must be adhered to.
How to get access: File an access-request and ask for production database console access (role: db-console) to receive permanent SSH access, or use Teleport to request temporary access (Recommended). See below for setting up SSH access.
Access to the “archive” replica
In this case, you’ll have access to a psql console on the “archive” replica. This is a replica of the database which does not serve live traffic and is intended for exploratory use (it is suitable for analytical queries, too). The console access is read-only and a 15 minute statement timeout applies. You can verify this by opening a database console and issuing the show statement_timeout command.
How to get access: File an access-request and ask for console access to the archive database in production (role: db-console-archive). See below for setting up SSH access.
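Because the console reads SQL from stdin, the archive replica also lends itself to one-off, non-interactive queries. A minimal sketch, assuming the gprd-psql-archive SSH alias described on this page; SSH_CMD and run_archive_query are inventions of this sketch, not official tooling:

```shell
# Sketch: a small wrapper for one-off read-only queries against the
# archive replica. SSH_CMD is overridable so the wrapper can be
# exercised without production access (an assumption of this sketch).
SSH_CMD="${SSH_CMD:-ssh gprd-psql-archive}"

run_archive_query() {
  # psql on the remote end reads SQL from stdin
  printf '%s\n' "$1" | $SSH_CMD
}

# Confirm the 15 minute timeout before starting long analytical queries
# (requires production access, hence commented out here):
# run_archive_query 'show statement_timeout;'
```

Setting SSH_CMD=cat lets you dry-run the wrapper locally and inspect the SQL it would send.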
Rails console access
A Rails console can be used to access the database, too. Use with extreme care as this is typically a read-write console.
How to get access: File an access-request and ask for Rails console access in production (role: rails-console) to receive permanent SSH access, or use Teleport to request temporary access (Recommended). See below for setting up SSH access.
Set up SSH configuration for direct access
To set up direct access, you will have to configure SSH to use the bastion instances as a proxy.
This is an example configuration for a user named joe (replace this with your GitLab email user) to access psql and Rails consoles in gprd:
# GCP production bastion host
Host lb-bastion.gprd.gitlab.com
    User joe

Host gprd-rails
    User joe-rails
    StrictHostKeyChecking no
    HostName console-01-sv-gprd.c.gitlab-production.internal
    ProxyCommand ssh lb-bastion.gprd.gitlab.com -W %h:%p

Host gprd-psql
    User joe-db
    StrictHostKeyChecking no
    HostName console-01-sv-gprd.c.gitlab-production.internal
    ProxyCommand ssh lb-bastion.gprd.gitlab.com -W %h:%p

Host gprd-psql-archive
    User joe-db-archive
    StrictHostKeyChecking no
    HostName console-01-sv-gprd.c.gitlab-production.internal
    ProxyCommand ssh lb-bastion.gprd.gitlab.com -W %h:%p
With this setup, the user joe could use the following commands:
ssh gprd-rails # opens a rails console
ssh gprd-psql # opens a psql console
ssh gprd-psql-archive # opens a psql console on a replica
Database Lab
Use postgres.ai to work with a thin clone of the database with psql
We have access to postgres.ai which gives us the ability to work with a thin clone of the database. This has the benefit of providing a fully isolated read-write database for a single user. Thin clones are inexpensive to create and can be used and destroyed as needed. This includes direct psql access to the thin clone.
With the extended setup, you’ll be able to use a psql console on a thin clone, too. This is the most flexible way of working with the database, as it even allows you to create new tables (e.g. for intermediate results) as needed.
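For instance, a thin-clone session might materialize intermediate results into a scratch table. A hedged sketch of the pattern; the table and query below are illustrative only, and the snippet simply prints the SQL you would paste into the clone’s psql prompt:

```shell
# Sketch of an exploratory thin-clone session. The clone is read-write
# and fully isolated, so scratch tables for intermediate results are
# safe to create and drop. Names below are illustrative.
THIN_CLONE_SQL=$(cat <<'SQL'
-- materialize intermediate results into a scratch table
CREATE TABLE tmp_project_counts AS
SELECT namespace_id, COUNT(*) AS project_count
FROM projects
GROUP BY namespace_id;

-- reuse the intermediate results
SELECT namespace_id, project_count
FROM tmp_project_counts
ORDER BY project_count DESC
LIMIT 10;

-- clean up when done
DROP TABLE tmp_project_counts;
SQL
)

# Print the SQL; paste it into the thin clone's psql prompt.
printf '%s\n' "$THIN_CLONE_SQL"
```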
Query plans and optimization
In order to evaluate query performance, we have the following tools available:
- Retrieve query plans through ChatOps on Slack
- Use the #database-lab Slack channel to work with a thin clone of the database
- Use postgres.ai (currently under evaluation) to fully immerse yourself in working with a thin clone to optimize queries
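Once you are on a thin clone, a full plan can be gathered with EXPLAIN; because the clone is isolated, it is safe to let ANALYZE actually execute the query. A minimal sketch (the query itself is a hypothetical example, not a known slow query), printing the statement to paste into psql:

```shell
# Sketch: collect a query plan on a thin clone. EXPLAIN (ANALYZE, BUFFERS)
# executes the query and reports timing plus buffer usage, which is safe
# on an isolated clone. The query below is illustrative only.
PLAN_SQL=$(cat <<'SQL'
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM users
WHERE username = 'some-username';
SQL
)

# Print the statement; paste it into the thin clone's psql prompt.
printf '%s\n' "$PLAN_SQL"
```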
Where do I get help with setting up access or any other questions?
Please reach out to #database on Slack and consider adding to this page afterwards.
Resources and Tips
Process
Various Product and Engineering processes are available to advocate for database improvements that may not fit into the feature development roadmaps.
- Leverage Planning as a team: according to this finding, more than 50% of known query performance issues were identified a long time ago. Leverage planning as a team to prioritize these issues with Product. The process is being implemented by Introducing Error Budgets to Stage Groups.
- Follow the instructions in Prioritize technical decisions when collaborating with Product in release planning, which is also mirrored in the Product handbook page.
- Consider Engineering Allocation for tech-debt issues, which is also mirrored in the Product handbook.
- Keep the Architecture process in mind when running into large scale fundamental design challenges.
Best practices
Common issues seen in the application codebase are N+1 queries, inappropriate use of CTEs, and read-only queries not leveraging read replicas.
- Guidance on N+1 Queries
- Use CTEs wisely
- Use read replicas when possible
Knowledge sharing
- Video recording - Techniques for researching inefficient DB queries. Highly recommended!
- Using Elasticsearch with PostgreSQL slow logs for troubleshooting.
- Slow queries chart to identify potentially suboptimal queries.
Tools
- Postgres.ai for testing with production-like data.
- Query plans and optimization.