Global Search Group

The Global Search team is focused on bringing world class search functionality to GitLab.com and self-managed instances.

Vision

The Global Search Group focuses on bringing world class search functionality to GitLab.com and self-managed instances.

This page covers processes and information specific to the Global Search group. See also the Global Search and Code Search direction pages.

Mission

The group is responsible for improving and expanding upon our current global search implementations using Elasticsearch, PostgreSQL, Zoekt, and Gitaly. Areas of responsibility will include global search functionality, UI, ingestion mechanisms, optimal indexing, administrative tools, and installation mechanisms for self-managed installations.

Additionally, we build and maintain critical AI context infrastructure, including:

AI Context Abstraction Layer: A unified interface for Retrieval Augmented Generation (RAG) across multiple vector databases (Elasticsearch, OpenSearch, PostgreSQL with pgvector), enabling AI features to work regardless of underlying storage
GitLab Zoekt: GitLab’s scalable exact code search service and file-based database system, with a flexible architecture that supports various AI context use cases beyond traditional search. It is built on top of the open-source code search engine Zoekt.

These systems will be fundamental to providing high-quality context for AI features via Retrieval Augmented Generation work, which includes:

Identifying and preparing new useful data for our AI-powered features in collaboration with feature teams and the AI Framework team
Storing vector embeddings of epics, issues, MRs, source code, and more
Providing retrieval APIs for those vector embeddings, metadata filtering, and ensuring permissions are enforced
Enabling fast, precise code search and context retrieval essential for AI context
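
As a rough illustration of the retrieval responsibilities listed above, the following Python sketch ranks stored embeddings by cosine similarity after applying a permission check and a metadata filter. Every name in it (the `Document` record, `retrieve`, the `allowed_project_ids` parameter, and the toy two-dimensional vectors) is hypothetical and chosen for illustration only; it is not the group's actual API, and the production systems rely on Elasticsearch, OpenSearch, or PostgreSQL with pgvector rather than an in-memory list.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical stored record: an embedding plus filterable metadata."""
    doc_id: str
    project_id: int
    doc_type: str  # for example "issue", "epic", or "merge_request"
    embedding: tuple


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, documents, allowed_project_ids, doc_type=None, limit=3):
    """Return the most similar documents the caller is allowed to see.

    Permissions (assumed here to be a set of readable project IDs) and the
    optional metadata filter are applied before ranking, so documents the
    caller cannot read never reach the similarity step.
    """
    candidates = [
        doc for doc in documents
        if doc.project_id in allowed_project_ids
        and (doc_type is None or doc.doc_type == doc_type)
    ]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine_similarity(query_embedding, doc.embedding),
        reverse=True,
    )
    return ranked[:limit]


# Toy data: the caller can read project 10 only.
DOCS = [
    Document("issue-1", 10, "issue", (1.0, 0.0)),
    Document("issue-2", 20, "issue", (0.9, 0.1)),
    Document("epic-1", 10, "epic", (0.0, 1.0)),
]
results = retrieve((1.0, 0.0), DOCS, allowed_project_ids={10})
```

In this toy run, `issue-2` is excluded because it belongs to a project outside `allowed_project_ids`, even though its embedding is close to the query. Filtering before ranking is the design choice being illustrated: permission enforcement is part of retrieval, not a post-processing step.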

This team doesn’t own custom searches for specific features, such as the “filter bar” on issues which is part of the Issue Tracking category owned by the Project Management group.

Team Members

The following team members are permanent members of the Global Search Group:

Name	Role
Changzheng Liu	Backend Engineering Manager, Global Search

Stable Counterparts

The following members of other functional teams are our stable counterparts:

Name	Role
Ashraf Khamis	Senior Technical Writer
Cleveland Bledsoe Jr	Senior Support Engineer
Brenda Nyaringita	Support Engineer (EMEA)

Shared Responsibilities

The Global Search team shares responsibilities with the AI Framework team in the area of Retrieval Augmented Generation (RAG). Specifically, we collaborate in the data preparation and information retrieval stages of the RAG process.

AI Context Infrastructure and Advanced Search data stores

Global Search data stores and interfaces diagram

The Global Search team maintains several key systems that power both traditional search and AI context capabilities:

Core Infrastructure Components

Elasticsearch: Powers Advanced Search functionality with full-text search, aggregations, and vector similarity search capabilities
GitLab Zoekt: GitLab’s scalable file-based database system providing exact code search with enterprise-scale performance (48+ TiB indexed on GitLab.com). Beyond code search, Zoekt’s flexible architecture serves as a foundation for various AI context use cases
AI Context Abstraction Layer: A unified Ruby gem interface enabling RAG across multiple vector databases (Elasticsearch, OpenSearch, PostgreSQL with pgvector), ensuring AI features work regardless of underlying storage solution

These systems work together to provide comprehensive search and AI context capabilities, from traditional keyword search to sophisticated vector similarity matching for AI features.

Advanced Search as an Enabling Framework

Beyond powering GitLab’s global search functionality, Advanced Search serves as a critical framework that enables other teams across GitLab to overcome the inherent limitations of PostgreSQL for complex search and analytics use cases. Teams leverage Advanced Search infrastructure to:

Scale beyond PostgreSQL constraints: Handle large-scale text search, aggregations, and analytics that would be prohibitively expensive or slow in PostgreSQL
Enable sophisticated filtering: Support complex multi-field queries, faceted search, and advanced filtering capabilities
Power analytics and insights: Generate aggregations, statistics, and insights from large datasets without impacting primary database performance
Support AI and ML workflows: Provide vector similarity search, embeddings storage, and retrieval capabilities essential for AI features

This framework approach allows feature teams to focus on their domain expertise while leveraging battle-tested, scalable search infrastructure maintained by the Global Search team.

A note on basic search

Basic search uses PostgreSQL for text searching and Gitaly for code searching. Both are significantly limited compared with Advanced Search.

Current state of Advanced Search scopes

There are many data types and search scopes already available via the Advanced Search interfaces. Below is a table that outlines the various available data types and the status of various functional elements, such as permissions, cross-group searching, and embeddings.

Data type / scope	Privacy / Permissions	Cross-namespace / cross-group searching	Keyword search	Similarity search & Embeddings	Metadata filtering
Code	Yes	Yes	Yes	In progress	Group, Project, Include/exclude archived, include/exclude forks, Language, Filename, Path, Extension
Issues	Yes	Yes	Yes	Yes	Group, Project, Status, Confidentiality, Labels, Include/exclude archived
Merge requests	Yes	Yes	Yes	No	Group, Project, Status, Include/exclude archived
Epics	Yes	Yes	Yes	No	Group, Project
Comments	Yes	Yes	Yes	No	Group, Project, Include/exclude archived
Users	Yes	Yes	Yes	No	Group, Project
Commits	Yes	Yes	Yes	No	Include/exclude archived
Milestones	Yes	Yes	Yes	No	Group, Project, Include/exclude archived
Project	Yes	Yes	Yes	No	Group
Wiki	Yes	Yes	Yes	No	Group, Project

Meetings

Whenever possible, we prefer to communicate asynchronously using issues, merge requests, and Slack. However, face-to-face meetings are useful for establishing a personal connection and addressing items that would be more efficiently discussed synchronously, such as blockers.

The Global Search Group meets weekly on Tuesdays at 14:00 UTC.
The Global Search Group also has an Open Discussion Hour on Thursdays at 12:30 UTC.

Work

We follow the general workflow and principles defined in Product Development Flow and Engineering Workflow. To bring an issue to our attention, please create an issue in the relevant project. Add the ~"group::global search" label and any other suitable labels. If it is an urgent issue, please reach out to the Product Manager or Engineering Manager listed in the Stable Counterparts section above.

Below are a few guidelines the team follows in the day-to-day work.

We use asynchronous communication with each other and with other GitLab teams via GitLab, Slack, Google Docs, etc.
We have weekly team meetings, 1-on-1 meetings, and virtual happy hours via Zoom to discuss various topics and create team bonding.
We encourage all backend engineers in our team to have their changes reviewed by someone else in our group. It’s great for knowledge sharing.
We organize our tasks under Epics and Issues. The Product Manager and Engineering Manager go through the backlog at the planning phase of each release and put issues into the next one or two milestones. The issues on the milestone board are sorted based on priority. The higher priority issues are placed on the top.
We apply the Deliverable label to the issues that we intend to close in a milestone before the milestone starts. Issues added during a milestone should not have the Deliverable label applied. We review these issues in the middle of the milestone, usually the first week of each month. We will remove the Deliverable label from the issues that are not likely to make it into the release.
We apply the Stretch label to the issues that we intend to start during a milestone but are not committing to closing.
We work with the UX team for features that need their design input by labeling the issues with a UX workflow label and adding the corresponding UX team counterpart as the assignee. We use workflow::problem validation and workflow::solution validation for user research and workflow::design for UI design and prototyping. Once the design is finished, workflow::ready for development label will be added as an indicator that development can start. For minor UX/UI changes, we contact our UX counterpart or the Product Design Manager to request a review for fast iterations.
We work with the Developer Experience team for issues that require input from a testing perspective by creating an RFH.
We work with the Technical Writing team for issues that need documentation change by labeling the issues with documentation and adding our counterpart in the Technical Writing team as assignee. Our technical writer helps us update the corresponding document. The documentation change normally happens together with the code change.
We work with our stable counterpart in the Security team for issues that need input from a security perspective. We suggest using team planning issues, for example, this one, for communication.
We work with the Support Engineering team by collaborating on issues directly. We invite our counterpart in the Support Engineering team to our team meeting every month to have direct communication.
When team members are ready for their next tasks, they will pick an issue from the milestone board and become the issue owner by assigning the issue to themselves. Team members should prioritize issues with the Deliverable label. The issue owner will be responsible for finding the solution to the issue. They can propose a solution by opening a Merge Request. They can also break down the issue into smaller sub-issues if it makes sense to take an iterative approach.
Before going out of office for an extended time, assign items still in review to the Engineering Manager. The Engineering Manager can reassign as needed.
When a team member reviews the work of an author who is out of office for an extended time, they are welcome to complete the requested changes themselves if they are comfortable with the remainder of the work.
We review and prioritize bugs every week. Bug reports often describe the problem without identifying its impact, so Product Management and QA share the responsibility of assessing every bug for priority, severity, and details. Severity uses an approximation of the Risk Matrix to identify potential risk and frequency. Priority is based on total impact over time. Occasionally, a bug of lower priority or severity will be added to a milestone when it relates to work currently scheduled.
1. Review all new bugs for content, priority, severity, and milestones
2. Review any bugs missing priority or severity
3. Prioritize bugs for the current milestone. 10% of scheduled work should be focused on bugs
4. Schedule bugs for future milestones based on capacity, severity, priority, and relationship to any scheduled work

Breaking changes process

Before a major milestone starts, we prepare an epic with all the breaking change issues linked. As usual, we work to get approvals but keep each MR in draft to prevent it from merging before the major milestone. If an MR is independent, it can target master directly. If not, we can create a sequence of MRs in which each one targets the branch of the previous one. As soon as the first one merges, the next will automatically target master.

Every MR that was created before the breaking change milestone should have this or a similar warning in the description: :warning: This MR must be kept as a draft and cannot be merged until **DATE** :warning:

Bugfix backport process

We review the bugfix merge requests every week. To facilitate this process, we have created scoped labels: backport::required, backport::skip, and backport::complete.

The backport::skip label will be added to merge requests if no backport is needed.
The backport::required label will be added to the merge requests that need to be backported to a previous release in the initial review. The DRI will follow the patch release process to backport the fix to a previous release. Once the backport is done, the backport::complete label will be added to indicate the whole process is complete.
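
The label flow above can be sketched as a small Python helper. This is only an illustration: the function names (`scope_of`, `apply_scoped_label`, `triage`, `mark_backported`) are hypothetical and do not correspond to any GitLab API. The sketch relies on the behavior of GitLab scoped labels, where applying a label removes any other label that shares the same scope.

```python
def scope_of(label):
    """Return the scope of a scoped label ("backport::skip" -> "backport"), else None."""
    return label.rsplit("::", 1)[0] if "::" in label else None


def apply_scoped_label(labels, new_label):
    """Add new_label and drop any existing label that shares its scope."""
    scope = scope_of(new_label)
    kept = {label for label in labels if scope is None or scope_of(label) != scope}
    return kept | {new_label}


def triage(labels, needs_backport):
    """Initial review: mark the merge request as requiring or skipping a backport."""
    label = "backport::required" if needs_backport else "backport::skip"
    return apply_scoped_label(labels, label)


def mark_backported(labels):
    """Final step: only a merge request labelled backport::required can be completed."""
    if "backport::required" not in labels:
        raise ValueError("merge request is not labelled backport::required")
    return apply_scoped_label(labels, "backport::complete")


# A bugfix MR is triaged as needing a backport, then completed by the DRI.
labels = triage({"type::bug"}, needs_backport=True)
labels = mark_backported(labels)
```

After these two steps, `labels` contains `type::bug` and `backport::complete`, and `backport::required` has been replaced rather than left alongside the completion label.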

Advanced Global Search Rollout on GitLab.com

The team has been actively working on enabling Elasticsearch powered Advanced Search on GitLab.com. Based on our analysis, we set our first target to roll this feature out for all the paid groups on GitLab.com. You can find more details about the timeline and progress in the links below.

Severity Labels for Search Issues (`~advanced search`, `~global search`)

Type of Operation	`~severity::1` - Blocker	`~severity::2` - Critical	`~severity::3` - Major	`~severity::4` - Low
Recall Record, Global	Above 10 seconds to timing out	Between 7 and 10 seconds	Between 4 and 7 seconds	Between 2 and 4 seconds
Time until inserted record is recallable	Above 15 minutes	Between 10 and 15 minutes	Between 5 and 10 minutes	Between 3 and 5 minutes

The two types of operations we detail severity metrics for above are:

Recall Record, Global: This is the time it takes to recall a record using a globally scoped search of GitLab.com. Records could be entities such as projects, users, groups, etc.
Time until inserted record is recallable: This is the elapsed time between adding a new record and having that new record be recallable via a search. This process depends on many underlying technologies such as the Go indexer, Sidekiq queues, and the Elasticsearch database.
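
To make the thresholds in the table easier to apply, here is a minimal Python sketch that maps a measurement to a severity label. The helper name `severity_for` and the two constant names are hypothetical. The table does not say which band a boundary value belongs to, so the sketch assumes that each band includes its lower bound, that severity 1 starts strictly above its threshold, and that values below the lowest band get no severity label.

```python
# (lower bound, severity) pairs taken from the table above, highest severity first.
RECALL_SECONDS = [(10, 1), (7, 2), (4, 3), (2, 4)]
INDEXING_MINUTES = [(15, 1), (10, 2), (5, 3), (3, 4)]


def severity_for(value, thresholds):
    """Return the severity label for a measurement, or None if it is below all bands.

    Assumption: severity 1 applies strictly above its threshold ("Above ..."),
    while the other bands include their lower bound.
    """
    top_bound, top_severity = thresholds[0]
    if value > top_bound:
        return f"severity::{top_severity}"
    for lower_bound, severity in thresholds[1:]:
        if value >= lower_bound:
            return f"severity::{severity}"
    return None


recall_label = severity_for(8, RECALL_SECONDS)       # global recall took 8 seconds
indexing_label = severity_for(4, INDEXING_MINUTES)   # record recallable after 4 minutes
```

With these inputs, `recall_label` is `severity::2` and `indexing_label` is `severity::4`.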

Weighting for Search Issues

We use the Fibonacci rating system to assign weights to Search issues. Below are a few guidelines when setting issue weight:

Issues that include ~backend and ~frontend work should have the weights added for a total weight representative of the work effort.
Spike issues are assigned a weight to help timebox the effort.
Bugs will not be given a weight.
Any issue weighted over 5 should be broken down into smaller iterative steps if the issue does not contain ~backend and ~frontend work.

Weight	Description
0	No effort or trivial effort (example: Documentation typo or Feature Flag Rollout)
1	Low effort (No Database migrations or Advanced Search migrations)
2	Low-Medium effort
3	Medium effort
5	High effort
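
The weighting guidelines above can be expressed as a short Python sketch. The function names `combined_weight` and `needs_breakdown` are hypothetical, and labels are represented as plain strings without the `~` prefix. Bugs are left out because they are not given a weight.

```python
def combined_weight(backend_weight, frontend_weight):
    """Issues with both ~backend and ~frontend work carry the sum of both weights."""
    return backend_weight + frontend_weight


def needs_breakdown(weight, labels):
    """An issue weighted over 5 should be split unless it spans backend and frontend."""
    spans_both = {"backend", "frontend"} <= set(labels)
    return weight > 5 and not spans_both


# A backend weight of 3 plus a frontend weight of 5 gives a total of 8.
total = combined_weight(3, 5)
split_full_stack = needs_breakdown(total, {"backend", "frontend"})
split_backend_only = needs_breakdown(total, {"backend"})
```

Here `total` is 8. The full-stack issue is exempt from being broken down (`split_full_stack` is `False`), while a backend-only issue of the same weight should be split (`split_backend_only` is `True`).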

MR reviews

We have the following guidelines for doing reviews on Global Search Team MRs:

The MR author is responsible for deciding if the initial or maintainer reviews should be done by a Global Search Team member and can indicate that in a comment or by assigning the reviewers.
Draft status indicates that the MR is not ready to be merged, but the author could decide to assign a reviewer while in draft mode. Unless a review is urgent, the author should wait for the pipeline to pass before assigning a reviewer.
We use Conventional Comments to communicate effectively in review comments.
The merge request author resolves only the threads they feel they have fully addressed and in which all discussion has concluded. Anything else is resolved by the reviewer. When a merge request has many threads, it is helpful for the reviewer to go back to open threads to pick up where the previous discussions left off.

Oncall escalation coverage

As the Global Search Team requires special domain knowledge, such as Elasticsearch, we borrow team members with this domain knowledge from other groups to cover the on-call escalation when we are understaffed, especially during the holiday seasons. In general, we follow the Tier 2 On-Call Program for escalations requiring domain expertise. The Elasticsearch domain experts, identified by domain_expertise on their profile, may be contacted when SRE and Tier 2 on-call engineers cannot resolve a production incident. We don’t expect the domain experts to work outside their normal working hours. In case of emergency, we follow the rules and best practices outlined in our Incident Management handbook. To help team members catch up on the latest development status and resolve potential incidents, we have created a Global Search Incident Management document as a reference.

Onboard domain experts from other groups to cover production incident escalation

When onboarding domain experts from other groups to help cover production incident escalation, we may consider the following actions:

Suggest the team member add elasticsearch as their domain_expertise in their team member profile
Add the team member to the Slack group global-search-team, which SREs and other on-call engineers can use to contact domain experts in case of emergency
Create the access request to grant the team member access to the Elasticsearch cluster
Schedule walk-through sessions with the team member to go over the latest architecture and development status

Offboard domain experts from production incident escalation coverage

Remove the team member from the Slack group global-search-team
Revoke the team member’s access to the Elasticsearch cluster

Common Links

Global Search Team Milestone Board
Global Search Team Workflow Board
Global Search Team Epics
Global Search team slack channel (internal) #g_global_search
Global Search Roadmap
Bug Review Board

JTBD

We utilize the Jobs to be Done (JTBD) framework to better understand our customers’ and users’ needs. You can view the current list of our JTBD here.

Performance Testing

We are exploring Rally for performance testing the Elasticsearch cluster. Workload data is determined using Kibana and stored in a Google Sheet (internal).

Resources

Documentation

Search and Advanced Search

AI Context Infrastructure

Zoekt Design Document - Comprehensive architecture and implementation details
GDK Zoekt Setup Instructions
AI Context Abstraction Layer Design Document - Unified RAG interface architecture
AI Context Abstraction Layer Source Code - Ruby gem implementation

Blog Posts

Product Demos

Advanced Global Search Rollout on GitLab.com

Steps and Enhancements 2019-11-05: Search security rapid action started. Advanced Global Search went …

Global Search - JTBD

The jobs-to-be-done that the Global Search group is solving for.

Last modified May 22, 2026: Transition Software Engineer in Test (SET) to Backend Engineer, Quality Department and Engineering Productivity departments clean-up (8af74dc3)
