Global Search Group
The Global Search team is focused on bringing world class search functionality to GitLab.com and self-managed instances.
Vision
The Global Search Group focuses on bringing world class search functionality to GitLab.com and self-managed instances.
This page covers processes and information specific to the Global Search group. See also the Global Search and Code Search direction pages.
Mission
The group is responsible for improving and expanding upon our current global search implementations using Elasticsearch, PostgreSQL, and Gitaly. Areas of responsibility will include global search functionality, UI, ingestion mechanisms, optimal indexing, administrative tools, and installation mechanisms for self-managed installations.
Additionally, we will support AI features via Retrieval Augmented Generation work which includes:
- Identifying and preparing new useful data for our AI-powered features in collaboration with feature teams and the AI Framework team
- Storing vector embeddings of epics, issues, MRs, source code, and more
- Providing retrieval APIs for those vector embeddings, metadata filtering, and ensuring permissions are enforced
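As a rough illustration of the retrieval responsibility above, the sketch below ranks stored embeddings by cosine similarity after applying a metadata filter and a permission check. Everything in it is hypothetical: the `Document` record, the `retrieve` function, and the `allowed_project_ids` parameter are assumptions made for this example and do not describe GitLab's actual APIs or storage.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical stored record: an embedding plus filterable metadata."""
    doc_id: str
    kind: str          # e.g. "issue", "epic", "merge_request"
    project_id: int
    embedding: tuple   # vector of floats


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, documents, allowed_project_ids, kind=None, top_k=3):
    """Return the top_k most similar documents the caller may see.

    Permissions and metadata are applied before ranking, so a record the
    caller cannot access is never scored or returned.
    """
    candidates = [
        d for d in documents
        if d.project_id in allowed_project_ids and (kind is None or d.kind == kind)
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_embedding, d.embedding),
        reverse=True,
    )
    return ranked[:top_k]
```

Filtering before ranking is a design choice in this sketch: it keeps inaccessible records out of the result set entirely rather than trimming them afterwards.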
This team doesn’t own custom searches for specific features, such as the “filter bar” on issues which is part of the Issue Tracking category owned by the Project Management group.
Team Members
The following team members are permanent members of the Global Search Group:
Stable Counterparts
The following members of other functional teams are our stable counterparts:
Shared Responsibilities
The Global Search team shares responsibilities with the AI Framework team in the area of Retrieval Augmented Generation (RAG). Specifically, we collaborate in the data preparation and information retrieval stages of the RAG process.
Meetings
Whenever possible, we prefer to communicate asynchronously using issues, merge requests, and Slack. However, face-to-face meetings are useful for establishing a personal connection and addressing items that would be more efficiently discussed synchronously, such as blockers.
- The Global Search Group meets weekly on Tuesdays at 14:00 UTC.
- The Global Search Group also has an Open Discussion Hour on Thursdays at 12:30 UTC.
Work
We follow the general workflow and principles defined in Product Development Flow and Engineering Workflow. To bring an issue to our attention, please create an issue in the relevant project. Add the ~"group::global search" label and any other suitable labels. If it is an urgent issue, please reach out to the Product Manager or Engineering Manager listed in the Stable Counterparts section above.
Below are a few guidelines the team follows in the day-to-day work.
- We use asynchronous communication with each other and with other GitLab teams via GitLab, Slack, Google Docs, etc.
- We have weekly team meetings, 1-on-1 meetings, and virtual happy hours via Zoom to discuss various topics and create team bonding.
- We encourage all backend engineers in our team to have their changes reviewed by someone else in our group. It’s great for knowledge sharing.
- We organize our tasks under Epics and Issues. The Product Manager and Engineering Manager go through the backlog at the planning phase of each release and put issues into the next one or two milestones. The issues on the milestone board are sorted based on priority. The higher priority issues are placed on the top.
- We apply the Deliverable label to the issues that we intend to close in a milestone before the milestone starts. Issues added during a milestone should not have the Deliverable label applied. We review these issues in the middle of the milestone, usually the first week of each month. We will remove the Deliverable label from the issues that are not likely to make it into the release.
- We apply the Stretch label to the issues that we intend to start during a milestone but are not committing to closing.
- We work with the UX team for features that need their design input by labeling the issues with a UX workflow label and adding the corresponding UX team counterpart as the assignee. We use workflow::problem validation and workflow::solution validation for user research and workflow::design for UI design and prototyping. Once the design is finished, the workflow::ready for development label will be added as an indicator that development can start. For minor UX/UI changes, we contact our UX counterpart or the Product Design Manager to request a review for fast iterations.
- We work with the Quality team for issues that require input from a testing perspective by labeling the issues with workflow::planning breakdown and adding the SET counterpart as an assignee. Once SET reviews the issue, they acknowledge back with the label quad-planning::complete-action or quad-planning::complete-no-action.
- We work with the Technical Writing team for issues that need documentation changes by labeling the issues with documentation and adding our counterpart in the Technical Writing team as assignee. Our technical writer helps us update the corresponding document. The documentation change normally happens together with the code change.
- We work with our stable counterpart in the Security team for issues that need input from a security perspective. We suggest using team planning issues, for example, this one, for communication.
- We work with the Support Engineering team by collaborating on issues directly. We invite our counterpart in the Support Engineering team to our team meeting every month to have direct communication.
- When team members are ready for their next tasks, they will pick an issue from the milestone board and become the issue owner by assigning the issue to themselves. Team members should prioritize issues with the Deliverable label. The issue owner will be responsible for finding the solution to the issue. They can propose a solution by opening a Merge Request. They can also break down the issue into smaller sub-issues if it makes sense to take an iterative approach.
- Before going out of office for an extended time, assign items still in review to the Engineering Manager. The Engineering Manager can reassign as needed.
- Whenever a team member reviews the work of an author who is out of office for an extended time, they are welcome to complete the requested changes themselves if they are comfortable with the remainder of the work.
- We review and prioritize bugs every week. It is common for bugs to represent the problem without identifying the impact, so Product Management and QA share the responsibility of assessing every bug for priority, severity, and details. Severity uses an approximation of the Risk Matrix to identify potential risk and frequency. Priority is based on total impact over time. Occasionally, something of a lower priority or severity will be added to a milestone when it relates to work currently scheduled.
- Review all new bugs for content, priority, severity, and milestones
- Review any bugs missing priority or severity
- Prioritize bugs for the current milestone. 10% of scheduled work should be focused on bugs
- Schedule bugs for future milestones based on capacity, severity, priority, and relationship to any scheduled work
Breaking changes process
Before a major milestone starts, we prepare an epic with all the breaking change issues linked. As usual, we work to get approvals but keep the MR in draft to prevent it from merging before the major milestone. If an MR is independent, it can target master directly. If not, we can have a sequence of MRs, each targeting the branch of the previous one. As soon as the first one merges, the next will automatically target master.
Every MR that was created before the breaking change milestone should have this or a similar warning in the description: :warning: This MR must be kept as a draft and cannot be merged until **DATE** :warning:
Bugfix backport process
We review the bugfix merge requests every week. To facilitate this process, we have created scoped labels: backport::required, backport::skip, and backport::complete.
- The backport::skip label will be added to merge requests if no backport is needed.
- The backport::required label will be added to the merge requests that need to be backported to a previous release in the initial review. The DRI will follow the patch release process to backport the fix to a previous release. Once the backport is done, the backport::complete label will be added to indicate the whole process is complete.
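The label flow above can be read as a small state machine. The sketch below is only an illustration under assumptions: the `ALLOWED_TRANSITIONS` table and `next_backport_label` helper are hypothetical names, and treating backport::skip and backport::complete as final states is an assumption that the process description does not spell out.

```python
# Hypothetical model of the scoped backport labels described above.
# None stands for a merge request that has not been triaged yet.
ALLOWED_TRANSITIONS = {
    None: {"backport::required", "backport::skip"},
    "backport::required": {"backport::complete"},
    "backport::skip": set(),       # assumed final: no backport needed
    "backport::complete": set(),   # assumed final: backport is done
}


def next_backport_label(current, proposed):
    """Return the proposed label if the move is allowed, else raise ValueError."""
    if proposed not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {proposed!r}")
    return proposed
```

For example, `next_backport_label(None, "backport::required")` followed by `next_backport_label("backport::required", "backport::complete")` mirrors the path of a fix that is triaged and then backported.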
Advanced Global Search Rollout on GitLab.com
The team has been actively working on enabling Elasticsearch powered Advanced Search on GitLab.com. Based on our analysis, we set our first target to roll this feature out for all the paid groups on GitLab.com. You can find more details about the timeline and progress in the links below.
Severity Labels for Search Issues (~advanced search, ~global search)
| Type of Operation | ~severity::1 - Blocker | ~severity::2 - Critical | ~severity::3 - Major | ~severity::4 - Low |
| --- | --- | --- | --- | --- |
| Recall Record, Global | Above 10 seconds to timing out | Between 7 and 10 seconds | Between 4 and 7 seconds | Between 2 and 4 seconds |
| Time until inserted record is recallable | Above 15 minutes | Between 15 and 10 minutes | Between 10 and 5 minutes | Between 3 and 5 minutes |
The two types of operations we detail severity metrics for above are:
- Recall Record, Global: This is the time it takes to recall a record using a globally scoped search of GitLab.com. Records could be entities such as projects, users, groups, etc.
- Time until inserted record is recallable: This is the elapsed time between adding a new record and having that new record be recallable via a search. This process depends on many underlying technologies such as the Go indexer, Sidekiq queues, and the Elasticsearch database.
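To make the thresholds concrete, here is a minimal sketch that maps a measured value to a severity label. The `classify` helper and the two threshold lists are hypothetical names introduced for this example. The table does not say which band a boundary value falls into, so the sketch assumes a value exactly on a boundary belongs to the less severe band, and that anything below the lowest threshold gets no severity label.

```python
# (lower bound, label) pairs, ordered from most to least severe.
# Recall thresholds are in seconds, indexing-delay thresholds in minutes.
RECALL_SECONDS = [
    (10, "severity::1"),
    (7, "severity::2"),
    (4, "severity::3"),
    (2, "severity::4"),
]
INDEXING_MINUTES = [
    (15, "severity::1"),
    (10, "severity::2"),
    (5, "severity::3"),
    (3, "severity::4"),
]


def classify(value, thresholds):
    """Return the first label whose lower bound the value exceeds, or None."""
    for lower_bound, label in thresholds:
        if value > lower_bound:  # assumption: boundary values fall in the lower band
            return label
    return None
```

For instance, `classify(8, RECALL_SECONDS)` returns `"severity::2"`, and `classify(4, INDEXING_MINUTES)` returns `"severity::4"`.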
Weighting for Search Issues
We use the Fibonacci rating system to assign weights to Search issues. Below are a few guidelines when setting issue weight:
- Issues that include ~backend and ~frontend work should have the weights added for a total weight representative of the work effort.
- Spike issues are assigned a weight to help timebox the effort.
- Bugs will not be given a weight.
- Any issue weighted over 5 should be broken down into smaller iterative steps if the issue does not contain ~backend and ~frontend work.
| Weight | Description |
| --- | --- |
| 0 | No effort or trivial effort (example: Documentation typo or Feature Flag Rollout) |
| 1 | Low effort (No Database migrations or Advanced Search migrations) |
| 2 | Low-Medium effort |
| 3 | Medium effort |
| 5 | High effort |
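The weighting guidelines can be sketched as two small helpers. Both function names are hypothetical, and the sketch reads "does not contain ~backend and ~frontend work" as "does not contain both kinds of work", which is an interpretation rather than something the guidelines state outright.

```python
def issue_weight(backend=None, frontend=None, is_bug=False):
    """Combine per-discipline estimates into one issue weight.

    Bugs are not weighted, so they return None. Otherwise the backend and
    frontend estimates that are present are added together.
    """
    if is_bug:
        return None
    parts = [w for w in (backend, frontend) if w is not None]
    return sum(parts) if parts else None


def needs_breakdown(weight, has_backend, has_frontend):
    """True when an issue over weight 5 should be split into smaller steps.

    Assumption: issues that combine backend and frontend work are exempt,
    because their total is the sum of two separate estimates.
    """
    if weight is None:
        return False
    return weight > 5 and not (has_backend and has_frontend)
```

With these helpers, an issue estimated at 3 for backend and 5 for frontend gets a weight of 8 and is not flagged, while a backend-only issue estimated at 8 is flagged for breakdown.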
MR reviews
We have the following guidelines for doing reviews on Global Search Team MRs:
- The MR author is responsible for deciding if the initial or maintainer reviews should be done by a Global Search Team member and can indicate that in a comment or by assigning the reviewers.
- Draft status indicates that the MR is not ready to be merged, but the author could decide to assign a reviewer while in draft mode. Unless a review is urgent, the author should wait for the pipeline to pass before assigning a reviewer.
- We use Conventional Comments to communicate effectively in review comments.
- The merge request author resolves only the threads they feel they have fully addressed and where all discussion has concluded. Any other thread is resolved by the reviewer. When a merge request has many threads, it is helpful for the reviewer to go back to open threads to pick up where the previous discussions left off.
Oncall escalation coverage
As the Global Search Team requires special domain knowledge, such as Elasticsearch, we borrow team members with this domain knowledge from other groups to cover the on-call escalation when we are understaffed, especially during the holiday seasons. In general, we will follow the dev on-call process. The Elasticsearch domain experts, identified by domain_expertise on their profile, may be contacted when SRE and dev on-call engineers cannot resolve production incidents. We don’t expect the domain experts to work outside their normal working hours. In case of emergency, we will follow the rules and best practices outlined in our Incident Management handbook. To assist team members in catching up on the latest development status and resolving potential incidents, we have created a Global Search Incident Management document as a reference.
Onboard domain experts from other groups to cover production incident escalation
When onboarding domain experts from other groups to help cover production incident escalation, we may consider the following actions:
- Suggest the team member add elasticsearch as their domain_expertise in their team member profile
- Add the team member to the Slack group global-search-team which can be used by SREs and other on-call engineers to contact in case of emergency
- Create the access request for the team member to grant them access to the Elasticsearch cluster
- Schedule walk-through sessions with the team member to go over the latest architecture and development status
Offboard domain experts from production incident escalation coverage
- Remove the team member from the Slack group global-search-team
- Revoke the team member’s access to the Elasticsearch cluster
Common Links
JTBD
We utilize the Jobs to be Done (JTBD) framework to better understand our customers’ and users’ needs. You can view the current list of our JTBD here.
We are exploring Rally for performance testing the Elasticsearch cluster. Workload data is determined using Kibana and stored in a Google Sheet (internal).
Resources
Documentation
Blog Posts
Product Demos
Dashboards