Verify Stage

The Verify stage development handbook page.

Vision

We enable global software organizations and teams to make great decisions with smart feedback loops by delivering speedy, reliable pipelines in a comprehensive CI platform that embodies Operations for All.

Technical Roadmap

FY25 to FY26

There are 3 Core Themes that continue to be a focus for GitLab CI:

  1. Scalability
  2. Availability
  3. Sustainability & Maintainability

We aim to drive enhancements that benefit GitLab.com, Self-Managed and Dedicated customers.

  1. With accelerated efforts, we aim to complete CI Data Partitioning for our top 6 tables in the CI Database in FY25. This not only impacts CI but also addresses a critical availability need affecting all of GitLab.com. This is a cross-stage effort, with backend engineers across Verify contributing in order to parallelize the ongoing work. (ETA: Q3)
  2. CI Data Retention: with the continued growth in our CI database tables, Verify and Infrastructure teams will work on removing data upon analysis of disk usage. This includes removal of both table records and indexes. Engineering will also collaborate with Product to implement features that allow Self-Managed and Dedicated customers to configure their own CI data retention policies. (ETA: Analysis in Q3, Implementation in Q4 until FY26-Q1/Q2)
  3. CI Data Management (TBD): Determine the data management tools that provide the greatest flexibility to our customers to manage their own CI data (for example, removal of old CI builds or artifacts that consume disk space). We will consult with Product and UX to understand the tools and feature sets that are most requested. (ETA: FY26-Q3)
  4. CI Minutes / Compute Units support: Build better tooling for our Support and SRE teams when responding to incidents that require us to restore CI Minutes for customer namespaces on GitLab.com, and ensure domain knowledge is shared across the Verify and Fulfillment teams. (ETA: Q4)
  5. Pipeline creation speed improvements: benchmarking and instrumentation will be our focus in order to identify the bottlenecks and improve pipeline creation speeds. This will be critical for driving Error Budget improvements. (ETA: Ongoing efforts to FY26)
  6. Cells 1.0 Database Support: the goal is to complete the Verify dependencies by Q4.

The product roadmap outlines the expected deliverables for FY25.

3 Year Vision

  1. Continue to iterate on the Verify stage technical debt roadmap. How do we represent the highest priorities in a SSOT like an issue board?
  2. Tracking Unplanned Work and making that more visible, to account for this during Engineering capacity or headcount planning. (For example, incident response, requests for help issues, triaging questions on Slack)
  3. Pipeline speed improvements - while benchmarking and instrumentation have been a focus, we have not:
    1. Implemented distributed tracing on our CI workers
    2. Prioritized further work in CI/CD Build Speed working group
    3. Built more observability related to Pipeline speed improvements.
  4. CI Events - this is deferred from FY25 due to capacity constraints in Verify.
  5. Better onboarding for contributors who are not CI subject domain experts - this includes improvements to code readability and accessibility, or better documentation and onboarding material.

FY24

The Verify Pipeline teams focused on the following Engineering-led initiatives, in addition to our deliverables for the FY24 Yearlies:

  1. CI Data Partitioning
  2. Pipeline speed improvements - including analysis of pipeline performance
  3. Review of the data retention strategy of CI data on GitLab.com
  4. Security vulnerabilities and infradev issues related to SaaS availability
  5. S1/S2 bug burndown of Categories that do not have planned feature development for FY24.
    1. Note that this also includes the Continuous Integration category, which has the biggest backlog of bugs in Verify. While it may be considered to be “Maintenance” (no new feature development planned), this work remains critical in ensuring we keep GitLab CI performant and reliable.
    2. Pipeline Execution owns the Continuous Integration category. The team is also the DRI for CI Data Partitioning and Pipeline speed improvement efforts.

FY23

The Verify stage focused on reliability and scalability of GitLab CI, which was critical for the availability of GitLab.com. This included addressing database performance, security vulnerabilities, performance improvements and relevant technical debt. This ensured GitLab remained secure, compliant and performant, and that our SaaS offering was able to maintain the SLAs of GitLab.com.

Mission

As engineers in Verify we know our customers because we are our customers, and we are constantly striving to make our platform better for everyone. We do this through iteration, dogfooding, and being involved in our open source community. We innovate, we collaborate, and we challenge assumptions to arrive at great results.

We take ownership of the things we build, with a focus on stability and availability. We do this by having a deep technical understanding of the operation and performance characteristics of our platform, and a proactive perspective to future growth.

Who we are

The Verify stage is made up of 4 groups:

  1. Verify:Pipeline Authoring

  2. Verify:Pipeline Execution

  3. Verify:Runner

  4. Verify:CI Platform

Verify:Pipeline Authoring

Name Role

Verify:Pipeline Execution

Name Role
Caroline Simpson Fullstack Engineering Manager, Verify:Pipeline Execution
Allison Browne Senior Backend Engineer, Verify:Pipeline Execution
Daniel Prause Backend Engineer
Hordur Freyr Yngvason Senior Backend Engineer, Verify:Pipeline Execution
Jose Ivan Vargas Senior Frontend Engineer, Verify:Pipeline Execution
Max Fan Senior Backend Engineer, Verify:Pipeline Execution
Panos Kanellidis Senior Backend Engineer, Verify:Pipeline Execution
Payton Burdette Senior Frontend Engineer, Verify:Pipeline Execution
Vlad Wolanyk Backend Engineer

Verify:Runner

Name Role
Nicole Williams Senior Engineering Manager, Verify:Runner
Adrien Kohlbecker Senior Backend Engineer, Verify:Runner
Arran Walker Senior Backend Engineer, Verify:Runner
Axel von Bertoldi Senior Backend Engineer, Verify:Runner
Cam Swords Staff Backend Engineer, Verify:Runner
Davis Bickford Backend Engineer, Verify:Runner
Georgi Georgiev Senior Backend Engineer, Verify:Runner
Hannes Hörl Backend Engineer, Verify:Runner
Joe Shaw Senior Backend Engineer, Verify:Runner
Joe Burnett Staff Backend Engineer, Verify:Runner
Miguel Rincon Staff Frontend Engineer, Verify:Runner
Pedro Pombeiro Senior Backend Engineer, Verify:Runner
Romuald Atchadé Backend Engineer, Verify:Runner
Tomasz Maczukin Senior Backend Engineer, Verify:Runner

Verify:CI Platform

Name Role
Cheryl Li Senior Manager, Engineering, Verify
Senior Backend Engineer Senior Backend Engineer, Verify:CI Platform
Marius Bobin Senior Backend Engineer, Verify:CI Platform
Tianwen Chen Senior Backend Engineer, Verify:CI Platform

Verify Engineering Leaders

Name Role
Cheryl Li Senior Manager, Engineering, Verify
Avielle Wolfe Senior Backend Engineer, Verify:Pipeline Authoring
Briley Sandlin Senior Frontend Engineer, Verify:Pipeline Authoring
Caroline Simpson Fullstack Engineering Manager, Verify:Pipeline Execution
Fabio Pitino Principal Engineer, Verify
Furkan Ayhan Senior Backend Engineer, Verify:Pipeline Authoring
Kasia Misirli Backend Engineer, Verify:Pipeline Authoring
Laura Montemayor Senior Backend Engineer, Verify:Pipeline Authoring
Rajendra Kadam Senior Backend Engineer, Verify:Pipeline Authoring

Stable Counterparts

Name Role
Adrien Kohlbecker Senior Backend Engineer, Verify:Runner
Arran Walker Senior Backend Engineer, Verify:Runner
Axel von Bertoldi Senior Backend Engineer, Verify:Runner
Cheryl Li Senior Manager, Engineering, Verify
Cam Swords Staff Backend Engineer, Verify:Runner
Darren Eastman Principal Product Manager, Verify:Runner
Davis Bickford Backend Engineer, Verify:Runner
Dov Hershkovitch Senior Product Manager, Verify:Pipeline Authoring
Georgi Georgiev Senior Backend Engineer, Verify:Runner
Hannes Hörl Backend Engineer, Verify:Runner
Jackie Porter Director of Product Management, Verify & Package
Joe Shaw Senior Backend Engineer, Verify:Runner
Joe Burnett Staff Backend Engineer, Verify:Runner
Joy Roodnick Software Engineer in Test, Test Engineering, Verify:Runner group, Fulfillment section
Marcel Amirault Senior Technical Writer, Verify (Pipeline Execution, Pipeline Authoring)
Miguel Rincon Staff Frontend Engineer, Verify:Runner
Nicole Williams Senior Engineering Manager, Verify:Runner
Pedro Pombeiro Senior Backend Engineer, Verify:Runner
Romuald Atchadé Backend Engineer, Verify:Runner
Tiffany Rea Senior Software Engineer in Test, CI:Verify
Tomasz Maczukin Senior Backend Engineer, Verify:Runner

How we work

Jobs to be done (JTBD)

A Job to be Done (JTBD) is a framework, or lens, for viewing products and solutions in terms of the jobs customers are trying to achieve.

Developer Onboarding in Verify

Welcome to the team! Whether you’re joining GitLab as a new hire, transferring internally, or ramping up on the CI domain knowledge to tackle issues in our area, you’ll be assigned an onboarding/shadowing buddy so you can have someone to work with as you’re getting familiarized with our codebase, our tech stack and general development processes on your Verify team.

Read over this page as a starting point and feel free to set up regular sync or async conversations with your buddy. We recommend setting up weekly touch points, at a minimum, and joining our regular team syncs to learn more about how we work. (Reach out to our Engineering Managers for an invite to those recurring meetings.) Please also schedule a few coffee chats to meet some members of our team. You will be assigned a team-specific developer onboarding issue (for example, the Pipeline Execution Developer onboarding checklist) to go through. It contains admin tasks to complete (as a new team member, if relevant), and also links to technical documentation, meeting agendas, and recordings.

Issues labeled with ~onboarding are smaller issues to help you get onboarded into the CI feature area. We typically work Kanban, so if there aren’t any ~onboarding issues in the current milestone, reach out to the Product Manager and/or Engineering Managers to see which issues you can start on as part of your onboarding period.

In May 2021, we introduced the CI Shadow Program, which we are trialing as a way to onboard existing GitLab team members from other Engineering teams to the CI domain and contribute to CI features.

Onboarding Buddies

Onboarding buddies are assigned to new hires to ensure their first few months of onboarding go smoothly. It’s recommended that onboarding buddies set up weekly check-ins, whether that’s async (such as a Slack DM) or sync (such as recurring coffee chats).

Reviewing Merge Requests

In addition to helping the new hire/transfer through any issues with their setup or assigned tasks, it’s recommended that onboarding buddies add the new hire/transfer as an additional reviewer on any MRs the onboarding buddy has been requested to review. Ideally this takes place after they’ve been working in Verify for at least 3 months, and as mutually agreed upon between both parties. This step further builds the new hire/transfer’s CI knowledge and allows for CI domain expertise to be shared amongst all engineers in Verify.

Similar to our reviewer mentorship programs, the new hire/transfer will review the merge request as if they’re being asked to perform the code review. Once complete, they’ll assign the MR back to their onboarding buddy. It is expected that the onboarding buddy will also complete the code review, then provide the new hire/transfer feedback about their code review. Ideally this takes place at their next check-in, where notes are captured in a shared Google Doc or a GitLab issue for ease-of-collaboration.

API development

Our API exists in two formats (REST and GraphQL), which should allow the same degree of querying. In Verify, we are GraphQL-first, which means that we develop new user-facing features using GraphQL by default. We will refactor older REST user-facing features to support GraphQL wherever possible. In some instances, it might make more sense to keep or even develop a new feature using REST. For example, REST is better at handling files than GraphQL, so it might be better to preserve this functionality in REST. We allow each team to decide when they think they should go with REST, but eventually the goal is to have everything in GraphQL.
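
To illustrate what "the same degree of querying" can look like, here is a minimal sketch that builds the same request (the latest pipelines of a project) in both formats. It only constructs the URL and payload and sends nothing. The helper names, the example project path, and the choice of fields (iid, status) are assumptions made for illustration, not part of this page.

```python
from urllib.parse import quote, urlencode

# Assumption: the public GitLab.com instance; a Self-Managed host would differ.
GITLAB_URL = "https://gitlab.com"


def rest_pipelines_url(project_path, limit=5):
    """Build the REST URL that lists a project's pipelines."""
    # REST addresses a project by numeric ID or by URL-encoded full path.
    project_id = quote(project_path, safe="")
    params = urlencode({"per_page": limit})
    return f"{GITLAB_URL}/api/v4/projects/{project_id}/pipelines?{params}"


def graphql_pipelines_request(project_path, limit=5):
    """Build the GraphQL endpoint and JSON payload for the same listing."""
    query = """
    query($fullPath: ID!, $first: Int) {
      project(fullPath: $fullPath) {
        pipelines(first: $first) {
          nodes { iid status }
        }
      }
    }
    """
    payload = {
        "query": query,
        "variables": {"fullPath": project_path, "first": limit},
    }
    return f"{GITLAB_URL}/api/graphql", payload


if __name__ == "__main__":
    print(rest_pipelines_url("gitlab-org/gitlab"))
    endpoint, payload = graphql_pipelines_request("gitlab-org/gitlab")
    print(endpoint, payload["variables"])
```

The GraphQL form lets the client name exactly the fields it needs, which is one reason a GraphQL-first default suits user-facing features; file transfer is not shown because, as noted above, it tends to fit REST better.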

Shared issues

In the Verify team we lean in to the GitLab mission, “everyone can contribute”! To help balance this workload out across the groups, we use the Verify candidate label. Every issue with this label is a good candidate to be worked on by any group in the Verify stage. This applies to both frontend and backend issues. Prioritization is still determined by the Product Manager, such as ensuring any deliverables in the engineer’s own group still take priority, but engineers are encouraged to pick work up from this board. This helps us break down silos, balance the workload, and prevent disruptive re-allocations.

To help with prioritizing within the list of available Verify candidate issues, it’s recommended to reference the issue types in the Product Priorities list, noting any severities applied on the issues as well.

Shared technical debt

In the Verify stage, prioritizing our technical debt so that we can move faster is a top priority. In August 2024, the Pipeline teams in the Verify stage created a board intended to help us prioritize technical debt and align team members on the most critical technical debt work to focus on. An engineer DRI from each team will work closely with their EM to align on which issues should be advocated for the most at a given time.

Issue Health Status Definitions & Async Issue Updates

Across Verify we value Transparency, we live our values of Inclusion, and we expect Efficiency by dogfooding using Issue Health Statuses and providing regular updates on active issues. Each team in the Verify Stage will define the cadence of updates and specific definition of the statuses, but generally the expectation is a weekly update on in progress issues with the following Health Statuses:

  • On Track
  • Needs Attention
  • At Risk

These updates are an opportunity for the engineer to add detail to the status and are not expected to provide a justification for why something is behind or will miss a milestone. We encourage blameless problem solving and kindness at all times.

Merge Requests in Verify

In the Verify stage, we value MR Rate as a shared performance indicator for team collaboration, iteration, and customer results. The entire team is responsible for iterative scope in issues. This starts with product management creating a clear problem statement connected to user insights. UX then adds interaction specifications and acceptance criteria, which are then considered and weighed by the engineering team. Teams are encouraged to iterate on scope so as to deliver the smallest thing possible.

By considering MR Rate as a measure of throughput, product management is focused on creating decomposed pieces of scope to improve the user experience. This encourages the UX and engineering teammates to provide simpler ways to solve the same problem, ultimately improving the throughput of the entire team.

Since April 2023, changes to Verify code require approval from a Verify maintainer because the Continuous Integration platform overall is a critical GitLab feature. In order to track the quality of the approval process, we ask Verify maintainers to apply one of the following labels to a merge request changing Verify code:

  • ~"verify-review::impacted" for merge requests where the maintainer was able to identify near miss bugs, inefficiencies and tech debt.
  • ~"verify-review::not impacted" for merge requests where the change was trivial or no issues were found by the Verify maintainer.

Pipeline Authoring and Pipeline Execution Collaboration

Pipeline Authoring and Pipeline Execution are closely related but they also represent different stages in the cycle of a user’s interaction with a pipeline. At a very high-level, this image illustrates the main focus of each group and how they can both support a better pipeline experience.

[Image: Verify Groups]

Async Work Week

We have quarterly async work weeks in Verify that start on the first Monday of the quarter.

Some of the noted benefits include reduced time spent in sync meetings, allowing for more focus that aligns with our async-first communication and our Diversity and Inclusion value to bias towards more async communication. However, this doesn’t preclude us from having any meetings; it’s up to the respective meeting attendees to decide accordingly. Exceptions might include: high priority issues and initiatives, social calls, coffee catch-ups. This also does not mean that you should not default to async-first at other times. Having regularly scheduled async weeks ensures that our processes do not become dependent on synchronous meetings.

Verify Engineering - Async Updates

Current (2022 onward)

As of June 2022, async issue updates are created weekly at the stage level and for each of the groups within the stage, following the Ops section process of async updates. Contributions will be added by Principal+ Engineers, Engineering Managers, and the Senior Engineering Manager of the Verify stage.

2020-2021

Every two weeks, the Verify Engineering Update Newsletter is sent out to an opt-in subscriber list. The purpose of the email is to share recent highlights from the Verify stage so folks will have a better idea of what is happening on other teams, and provide new opportunities for learning and collaboration.

Everyone is welcome to sign up or view previous issues on the newsletter page (link no longer available).

Each issue of the newsletter is planned using individual issues linked in the newsletter epic. Content is generally contributed by managers, but everyone is encouraged to contribute topics for the newsletter.

Verify Technical Discussions

Verify Technical Discussions is a Zoom meeting hosted monthly by the team members in the Verify stage. Everyone is invited; however, participation from the Verify stage members is especially encouraged.

During the meeting we discuss a variety of technical aspects related to the Verify stage roadmap. Folks are also encouraged to share challenges they’re facing working on problems in the CI domain.

Everyone can add their points to the agenda document.

Below you can find a table with links to recordings of Verify Technical Discussions and Technical Deep Dives.

Current Inventory of Recordings

The Verify Technical Discussions are automatically recorded and added to Google Drive (internal).

Uploaded Recordings to YouTube

Date Title Recording
2021-01-21 Technical Discussions - Pipeline Editor and database storage Recording
2021-01-07 Technical Discussions - Next iteration of CI/CD Pipeline DAG Recording
2020-12-10 Technical Deep Dive - Observability at GitLab with demos Recording
2020-11-19 Technical Deep Dive - Cloud Native Build Logs feature overview Recording
2020-05-08 Technical Deep Dive - Using Prometheus with GitLab Compose Kit Recording

Weekly Triage Reports

The Product Manager, Engineering Manager(s), and Designer for each group are responsible for triaging and scheduling feature proposals and bugs as part of the Weekly Triage Report. Product Managers schedule issues by assigning them to a Milestone or the Backlog. For bug triage, Engineering Managers and Product Managers work together to ensure that the correct severity and priority labels have been applied. Since Product Managers are the DRIs for prioritization, they will validate and/or apply the initial priority label to bugs. Engineering Managers are responsible for adding or updating the severity labels for bugs listed in the triage report, and for helping Product Management understand the criticality and technical feasibility of the bug, as needed.

While SLOs for resolving bugs are tied to severities, it is up to Product Management to set priorities for the group with an appropriate target resolution time. For example, criteria such as the volume of severity::2 level bugs may make it appropriate for the Product Manager to adjust the priority accordingly to reflect the expected time to resolution.

Availability, security, performance, and bug triage process

In the Verify Stage, we aim to solve new availability, security, and performance issues within the SLO of the assigned severity. These types of issues are the top priority, followed by bugs and technical debt, according to our severity SLO chart.

Availability and performance issues (commonly referred to as infradev) are also triaged in the Infra/Dev Triage Board.

Supporting Community Contributions

We believe in supporting our open source community. We aim to support two main measures of success:

  1. Merged MRs from community contributions
  2. MRARR

Each team in the Verify Stage follows roughly the same process to ensure the community is effectively supported and free to add features or fixes to the product. How we manage the Community Contribution MRs is spread across three main areas: processing the contributions, reviewing the contributions, and merging the contributions.

Process

Code contributions to Verify typically occur in three flavors:

  1. Free users, open source contributions from already scoped issues
  2. Paid users, open source contributions not from scoped issues
  3. Paid users, proprietary contributions not from scoped issues

Contributions from both free and paid users are equally important and will follow our GitLab Contribution Guidelines. We strive to make this process as frictionless as possible between our users and the Engineering teams in Verify, especially during the reviewing of contributions.

Reviewing contributions

Once a contribution has been created, the Engineering Manager assigns an engineer to manage and review it with the Community Contributor. Reviewing contributions will follow the definition of done, style guidelines, and other practices of development. As the DRI of the review, the assigned Verify engineer will work with the Community Contributor to resolve any outstanding items. The MR is then passed to a Maintainer with relevant domain expertise for final review prior to being merged.

Contributions from Partners

Our partners are an important part of our ecosystem at GitLab. Their contributions should be reviewed with the same GitLab Contribution Guidelines as community MR contributions, and should align with the Verify contribution guidelines for working in the Verify areas of the codebase.

Merging the Contribution

The Maintainer of the codebase will be the DRI of merging the contribution into the Verify product.

SLOs for Community Contributions

Refined issues are considered a priority for review and merging. Refined issues are defined as issues in workflow::ready for development, with direction labels, that either have technical proposals or are weighted. Issues that are not labeled with workflow::ready for development and direction labels are considered non-refined, are lower priority for review, and therefore have longer merge SLOs. SLOs for these two types of issues are defined in the table below:

Types of issues & users | Time Frame for Review SLO | Time Frame for Merge SLO
All users contributing to refined issues or bugs of severity S2 or S1 | 30 days | The next release (60 days)
Paying users from non-refined issues or bugs of severity S3 | 60 days | Within the next 3 releases (approx one quarter or 90 days)
Non-paying users from non-refined issues or bugs of severity S4 | 90 days | Anything outside the next 3 releases (more than one quarter or 120 days)

In order to prevent the inflow of Community Contributions overwhelming engineers on the team and impacting their ability to work on planned issues, there is a WIP limit of 5 assigned Community Contribution MRs per reviewing engineer. This helps limit the amount of context switching a single engineer is doing and prevents them from being overwhelmed with reviews.
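
The rules in this section can be sketched as a small lookup. This is an illustrative reading, not tooling the team uses: the function names are hypothetical, direction labels are assumed to start with "direction", and the way overlapping table rows are resolved (refinement or an S1/S2 bug first, then paying status) is an assumption, since the table does not define every combination.

```python
from dataclasses import dataclass

# Label name taken from the text above.
READY_LABEL = "workflow::ready for development"
# WIP limit of assigned Community Contribution MRs per reviewing engineer.
WIP_LIMIT = 5


@dataclass(frozen=True)
class Slo:
    review_days: int
    merge_days: int


def is_refined(labels, weight=None, has_technical_proposal=False):
    """Refined: ready for development, a direction label, and a proposal or weight."""
    has_ready = READY_LABEL in labels
    # Assumption: any label starting with "direction" counts as a direction label.
    has_direction = any(label.startswith("direction") for label in labels)
    return has_ready and has_direction and (has_technical_proposal or weight is not None)


def slo_for(refined, paying, bug_severity=None):
    """Pick a row of the SLO table; the precedence between rows is an assumption."""
    if refined or bug_severity in (1, 2):
        return Slo(review_days=30, merge_days=60)
    if paying:
        return Slo(review_days=60, merge_days=90)
    # The table says "more than" 120 days here, so 120 is a lower bound, not a deadline.
    return Slo(review_days=90, merge_days=120)


def can_take_review(assigned_count):
    """True while the engineer is under the Community Contribution WIP limit."""
    return assigned_count < WIP_LIMIT


if __name__ == "__main__":
    labels = [READY_LABEL, "direction"]
    refined = is_refined(labels, weight=2)
    print(refined, slo_for(refined, paying=False))
    print(can_take_review(4), can_take_review(5))
```

Cases the table leaves open, such as a non-paying user fixing an S3 bug, fall through to the paying/non-paying split in this sketch; a real implementation would need the team to confirm that precedence.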

Managing urgent priorities

In the event of escalations or high priority tasks that come up, we will adopt a follow-the-sun rotation to ensure that the task is completed as quickly as possible. Examples of these urgent tasks include (but are not limited to): critical customer fixes, high severity security issues or corrective actions related to a high severity incident, with this level of urgency to be confirmed by the team’s leadership (e.g. Engineering Manager and Product Manager). The follow-the-sun rotation requires an engineer to focus on that task within their working hours, then transitions to the next engineer once they come online, and engineers will arrange handoff of work to one another as needed (for example, when collaborating on an MR together as the contributor and reviewer, respectively). This effort can also be a cross-stage effort, which involves collaboration amongst engineers across Verify stage groups.

As part of our GitLab values, we strive to be inclusive to those in regions with fewer employees. As a result, there is no expectation that people continuously work outside of their regular business hours. In the event we have limited people available in certain regions (such as APAC), we should look to escalate outside of the Verify stage to maintain focus on the escalated effort, by reaching out to engineers or other SMEs, and have managers help with this escalation path as needed.

Slack Channels


Project Plans
Verify:CI Platform Group
The GitLab Verify:CI Platform Group Handbook page
Verify:Pipeline Authoring Group
The GitLab team page for the Pipeline Authoring Group
Verify:Pipeline Execution Group
The GitLab team page for the Pipeline Execution Group.
Verify:Runner
The GitLab Runner team page.