Engineering

Engineering Direction

GitLab has a Three-Year Strategy, and we’re excited to see every member of the Engineering division contribute to achieving it. Whether you’re creating something new or improving something that already exists, we want you to feel empowered to bring your best ideas for influencing the product direction through improved scalability, usability, resilience, and system architectures. And when you feel like you need to expand your knowledge in a particular area, know that you’re supported in having the resources to learn and improve your skills.

Our focus is to make sure that GitLab is enterprise grade in all its abilities and to support the AI efforts required to successfully launch AI features to General Availability.

Making sure that GitLab is enterprise grade involves several teams collaborating on improving our disaster recovery and support offerings through ongoing work with GitLab Dedicated and Cells infrastructure. Our goal here is improved availability and service recovery.

Engineering Culture

Engineering culture at GitLab encompasses the processes, workflows, principles and priorities that all stem from our GitLab Values. All these things continuously strengthen our engineering craftsmanship and allow engineers to achieve engineering excellence, while growing and having a significant, positive impact on the product, people, and the company as a whole. Our engineering culture is carried and evolves primarily through knowledge sharing and collaboration. Everyone can be part of this process because at GitLab everyone can contribute.

Engineering Excellence

Engineering excellence can be defined as an intrinsic motivation to improve engineering efficiency and software quality, and to deliver better results while building software products. Engineering excellence is fueled by a strong engineering culture combined with a mission: to build better software that allows everyone to contribute.

Engineering Initiatives

Engineering is the primary advocate for the performance, availability, and security of the GitLab project. Product Management prioritizes 60% of engineering time, so everyone in the engineering function should participate in the Product Management prioritization process to ensure that our project stays ahead in these areas. Engineering prioritizes 40% of time on initiatives that improve the product, underlying platform, and foundational technologies we use.

Work in the 40% time budget should be coordinated and prioritized by the Engineering Manager of a team. Use the label Engineering Time for issues and MRs that are done as part of it so we can follow the work and the results across the engineering division.

  • Contributing to broad engineering initiatives and participating in working group-related tasks.
  • Reviewing fixes from our support team. These merge requests are tagged with the Support Team Contributions label. You can filter on open MRs.
  • Working on high priority issues as a result of issue triaging. This is our commitment to the community and we need to include some capacity to review MRs or work on defects raised by the community.
  • Improvements to the performance, stability and scalability of a feature or dependency including underlying infrastructure. Again, the Product team should be involved in the definition of these issues but Engineering may lead here by planning, prioritizing, and coordinating the recommended improvements.
  • Improvements and upgrades to our toolchain in order to boost efficiency.
  • Codebase improvements: Removing technical debt, updating or replacing outdated dependencies, and enhancing logging and monitoring capabilities.
  • Constructing proof-of-concept models for thorough exploration of new technologies, enhancements and new possibilities.
  • Work on improvements and feature enhancements to the product, in the sense of internal community contributions, that would increase our internal engineering productivity by focusing on ready-to-go items that are currently assigned a low priority in the backlog.

Technical Roadmaps

Some of the above examples for the 40% time budget can help in forming a long-term technical roadmap for your group, and determine how best to prioritize your technical work to support overall business goals. In addition to the examples above:

  • Ask yourself these questions
    • What are your most frequent sources of delays? (Could be long-standing tech debt you have to work past while developing, could be lack of reviewers for your domain, could be external to your team like with pipeline duration)
    • Do you have any consistently similar bugs or security issues that come in due to a certain area?
    • Has your team been talking about potentially refactoring any areas?
    • Is your team struggling with certain processes?
    • Have you had recent incidents that allude to a larger problem?
    • Are you getting frequent requests for help in some area?
    • Is your team frequently missing their deliverable commitments? What would help?
    • Does your area have performance (slow endpoints, inconsistent responses, intermittent errors) or scalability (the feature or area as-is will not scale) concerns?
    • Where do you see the biggest instability? Have you talked to operations and support about feedback for your area?
    • Do you have application or rate limits in the right places?
    • Have you burned down your security, corrective action, and infradev issues?
    • Is your error budget green?
    • Have your feature flags been removed from the codebase yet?
    • Do you have adequate unit test, integration test and E2E coverage?
    • Do you have adequate documentation for your features?
    • Do you have adequate telemetry, logging, and monitoring of your features?
    • Do you have adequate error handling and error codes that allow fast and easy diagnostics?
  • Gather data like this
    • Master:Broken issues
    • ~“severity::1” and ~“severity::2” bugs
    • Missed-Slo issues
    • Flaky test issues
    • ~“type::maintenance” issues
  • Think about the future state of your product
    • Where do you want your product to be this time next year?
    • What are the technical requirements to achieve that?
    • What are technical topics that would benefit from research/POCs?
    • What would make it easier for you to achieve that if it was no longer a factor?
    • What would be the performance and/or business impact once you address these issues?
    • How would you evolve your team processes to regularly review your technical roadmap?

Technical roadmap process

Engineering Managers (EMs) are responsible for collaboratively developing their team’s technical roadmap backlog. All items should be documented as epics and issues using the “Technical Roadmap” label.

Global initiatives will be defined and must be incorporated into each group’s roadmap and prioritization (e.g., allocating 40% of front-end capacity for Vue upgrade, completing all Cells issues for a specific area by milestone XYZ).

Prioritization of items should align with:

  1. General business goals
  2. Engineering vision
  3. Team capacity and expertise

Planning Guidelines:

  • Allocate 40% of the overall time budget for technical roadmap items in the normal milestone planning process.
  • Use the “Technical Roadmap” label for all related issues to facilitate tracking and coordination.
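
As a rough illustration of the 40% allocation, the sketch below splits a milestone's capacity between Engineering-prioritized and Product-prioritized work. The function name, the unit (working days), and the example team size are assumptions made for illustration only; they are not part of any GitLab tooling.

```python
def split_capacity(total_days: float, engineering_share: float = 0.40) -> dict[str, float]:
    """Split milestone capacity between Engineering- and Product-prioritized work.

    `engineering_share` defaults to the 40% time budget described above.
    """
    if not 0.0 <= engineering_share <= 1.0:
        raise ValueError("engineering_share must be between 0 and 1")
    engineering_days = total_days * engineering_share
    return {
        "engineering_time": engineering_days,
        "product_prioritized": total_days - engineering_days,
    }


# Hypothetical team: 5 engineers with 18 working days each in the milestone.
budget = split_capacity(total_days=5 * 18)
print(budget)
```

In practice the split is a planning guideline rather than an exact accounting rule, so a team might treat the result as a target to steer towards over several milestones.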

Key Steps:

  1. Identify and document technical debt and improvement opportunities
  2. Assess impact and effort for each item
  3. Prioritize based on business value and strategic alignment
  4. Integrate with existing iteration/milestone planning
  5. Regularly review and adjust the roadmap
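
Steps 2 and 3 above could be sketched as a simple impact-to-effort ranking. This is only one possible approach: the 1 to 5 scales, the `RoadmapItem` class, and the ratio-based ordering are hypothetical and are not prescribed by this process.

```python
from dataclasses import dataclass


@dataclass
class RoadmapItem:
    title: str
    impact: int  # hypothetical scale: 1 (low) to 5 (high)
    effort: int  # hypothetical scale: 1 (small) to 5 (large)


def prioritize(items: list[RoadmapItem]) -> list[RoadmapItem]:
    """Order items by impact-to-effort ratio, highest first.

    Items with equal ratios keep their input order because sorted() is stable.
    """
    for item in items:
        if item.effort < 1:
            raise ValueError(f"effort must be at least 1: {item.title}")
    return sorted(items, key=lambda item: item.impact / item.effort, reverse=True)


backlog = [
    RoadmapItem("Refactor slow endpoint", impact=5, effort=5),
    RoadmapItem("Remove stale feature flags", impact=3, effort=1),
    RoadmapItem("Upgrade outdated dependency", impact=4, effort=2),
]
for item in prioritize(backlog):
    print(item.title)
```

A ratio like this is only a starting point for discussion; business goals, engineering vision, and team capacity still decide the final order.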

This process ensures a balanced approach between feature development and technical improvements, promoting long-term sustainability and efficiency of the engineering organization.

Community Contributions

We have a 3-year goal of reaching 1,000 monthly contributors as a way to mature new stages, add customer-desired features that aren’t on our roadmap, and even translate our product into multiple languages.

Diversity

Diverse teams perform better. They provide a sense of belonging that leads to higher levels of trust, better decision making, and a larger talent pool. They also focus more on facts, process facts more carefully, and are more innovative. By hiring globally and increasing the numbers of women and underrepresented groups (URGs) in the Engineering division, we’re helping everyone bring their best selves to work.

Growing our team

Strategic hiring is a top priority, and we’re excited to continue hiring people who are passionate about our product and have the skills to make it the best DevSecOps tool in the market. Our current focus areas include reducing the amount of time between offer and start dates and hiring a diverse team (see above). We’re also implementing industry-standard approaches like structured, behavioral, and situational interviewing to help ensure a consistent interview process that helps to identify the best candidate for every role. We’re excited to have a recruiting org to partner with as we balance the time that managers spend recruiting against the time they spend investing in their current team members.

Expand customer focus through depth and stability

As expected, a large part of our focus is on improving our product.

For Enterprise customers, we’re refining our product to meet the levels of security and reliability that customers rightfully demand from SaaS platforms (SaaS Reliability). We’re also providing more robust utilization metrics to help them discover features relevant to their own DevOps transformations (Usage Reporting) and offering the ability to purchase and manage licenses without spending time contacting Sales or Support (E-Commerce and Cloud Licensing). Lastly, in response to Enterprise customer requests, we’re adding features to support Suggested Reviewers, better portfolio management through Work Items, and Audit Events that provide additional visibility into user passive actions.

For Free Users, we’re becoming more efficient with our open core offering, so that we can continue to support and give back to students, startups, educational institutions, open source projects, GitLab contributors, and nonprofits.

For Federal Agencies, we’re obtaining FedRAMP certification to strengthen confidence in the security standards required on our SaaS offering. This is a mandated prerequisite for United States federal agencies to use our product.

For Hosted Customers, we’re supporting feature parity between Self-Managed and GitLab Hosted environments through the Workspace initiative. We’re also launching GitLab Dedicated for customers who want the flexibility of cloud with the security and performance of a single-tenant environment.

For customers using CI/CD, we’re expanding the available types of Runners to include macOS, Linux/Docker, and Windows, and we’re autoscaling build agents.

Engineering Departments

There are five departments within the Engineering Division:

Workflows

GitLab in Production

People Management

Cross-Functional Prioritization

See the Cross-Functional Prioritization page for more information.

SaaS Availability Weekly Standup

To maintain high availability, Engineering runs a weekly SaaS Availability standup to:

  • Review high-severity (S1/S2) public-facing incidents
  • Review important SaaS metrics
  • Track progress of Corrective Actions
  • Track progress of Feature Change Locks

Infrastructure Items

Each week the Infrastructure team reports on incidents and key metrics. Updating these items at the top of the Engineering Allocation Meeting Agenda is the responsibility of the Engineering Manager for the General Squad in Reliability.

  1. Incident Review
    • Include any S1 incidents that have occurred since the previous meeting.
    • Include any incidents that required a status page update.
  2. SaaS Metrics Review
    1. Include screenshots of the following graphs in the agenda.

Development Items

For the core and expansion development departments, updates on current status of:

  1. Error budgets
  2. Reliability issues (infradev)
  3. Security issues

Groups under Feature Change Locks should update progress synchronously or asynchronously in the weekly agenda. The intention of this meeting is to communicate progress and to evaluate and prioritise escalations from infrastructure.

Feature Change Locks progress reports should appear in the following format in the weekly agenda:

FCL xxxx - [team name]

  • FCL planning issue: <issue link>
  • Incident Issue: <issue link>
  • Incident Review Issue: <issue link>
  • Incident Timeline: <link to Timeline tab of the Incident issue>
    • e.g. time to detection, time to initiate/complete rollback (as applicable), time to mitigation
  • Cause of Incident
  • Mitigation
  • Status of Planned/completed work associated with FCL

Feature Change Locks

A Feature Change Lock (FCL) is a process to improve the reliability and availability of GitLab.com. We will enact an FCL anytime there is an S1 or public-facing (status page) S2 incident on GitLab.com (including the License App, CustomersDot, and Versions) determined to be caused by an engineering department change. The team involved should be determined by the author, their line manager, and that manager’s other direct reports.

If the incident meets the above criteria, then the manager of the team is responsible for:

  • Forming the group of engineers working under the FCL. By default, it will be the whole team, but it could be a reduced group if there is not enough work for everyone.
  • Planning and executing the FCL.
  • Informing their manager (e.g. Senior Manager / Director) that the team will focus efforts towards an FCL.
  • Providing updates at the SaaS Availability Weekly Standup.

If the team believes there does not need to be an FCL, approval must be obtained from either the VP of Infrastructure or VP of Development.

Direct reports involved in an active borrow should be included if they were involved in the authorship or review of the change.

The purpose is to foster a sense of ownership and accountability amongst our teams, but this should not challenge our no-blame culture.
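
The trigger criteria stated at the start of this section can be expressed as a small predicate. This is a minimal sketch: the `Incident` class and its field names are hypothetical, and the sketch does not model the exemption that the VP of Infrastructure or VP of Development can approve.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    severity: int  # 1 for S1, 2 for S2, and so on
    public_facing: bool  # True if the incident required a status page update
    caused_by_engineering_change: bool


def requires_fcl(incident: Incident) -> bool:
    """Return True for an S1, or a public-facing S2, caused by an engineering change."""
    if not incident.caused_by_engineering_change:
        return False
    if incident.severity == 1:
        return True
    return incident.severity == 2 and incident.public_facing


print(requires_fcl(Incident(severity=2, public_facing=True, caused_by_engineering_change=True)))
print(requires_fcl(Incident(severity=2, public_facing=False, caused_by_engineering_change=True)))
```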

Timeline

Rough guidance on timeline is provided here to set expectations and urgency for an FCL. We want to balance moving urgently with doing thoughtful important work to improve reliability. Note that as times shift we can adjust accordingly. The DRI of an FCL should pull in the timeline where possible.

The following bulleted list provides a suggested timeline starting from incident to completion of the FCL. “Business day x” in this case refers to the xth business day after the incident.

  • Day 0: Incident
  • Business day 1: relevant Engineering Director collaborates with VP of Development and/or VP of Infrastructure or their designee to establish if FCL is required.
  • Business day 2: confirmation that an FCL is required for this incident and start planning.
  • Business days 3-4: planning time
  • Business days 5-9 (1 week): complete planned work
  • Business days 10-11: closing ceremony, retrospective and report back to standup
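
To turn the suggested timeline into calendar dates, a helper along these lines could be used. It is a sketch under stated assumptions: business days are taken to be Monday through Friday, public holidays are ignored, and the function and constant names are hypothetical.

```python
from datetime import date, timedelta


def business_day_after(start: date, n: int) -> date:
    """Return the nth business day after `start`, skipping Saturdays and Sundays."""
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


# (label, first business day, last business day), taken from the list above.
FCL_MILESTONES = [
    ("Establish whether an FCL is required", 1, 1),
    ("Confirm the FCL and start planning", 2, 2),
    ("Planning time", 3, 4),
    ("Complete planned work", 5, 9),
    ("Closing ceremony, retrospective, and report back", 10, 11),
]


def fcl_schedule(incident_day: date) -> list[tuple[str, date, date]]:
    """Map each milestone to its first and last calendar date."""
    return [
        (label, business_day_after(incident_day, first), business_day_after(incident_day, last))
        for label, first, last in FCL_MILESTONES
    ]


# Example: an incident on Friday, 2024-03-01.
for label, first, last in fcl_schedule(date(2024, 3, 1)):
    print(f"{first} to {last}: {label}")
```

Because the timeline is rough guidance, the dates produced this way are a starting point that the DRI can pull in where possible.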

Activities

During the FCL, the exclusive focus of the team(s) is reliability work, and any in-flight feature work has to be paused or re-assigned. Maintainer duties can still be done during this period and should keep other teams moving forward. Explicitly higher priority work such as security and data loss prevention should continue as well. The team(s) must:

  • Create a public Slack channel called #fcl-incident-[number], with the following members:
    • The Team’s Manager
    • The Author and their teammates
    • The Product Manager, the stage’s Product leader, and the section’s Product leader
    • All reviewer(s)
    • All maintainer(s)
    • Infrastructure Stable counterpart
    • The chain-of-command from the manager to the VP (Sr Manager, Sr/Director, VP, etc)
  • Create an FCL issue in the FCL Project with the information below in the description:
    • Name the issue: [Group Name] FCL for Incident ####
    • Links to the incident, original change, and Slack channel
    • FCL Timeline
    • List of work items
  • Complete the written Incident Review documentation within the Incident Issue as the first priority after the incident is resolved. The Incident Review must include completing all fields in the Incident Review section of the incident issue (see incident issue template). The incident issue should serve as the single source of truth for this information, unless a linked confidential issue is required. Completing it should create a common understanding of the problem space and set a shared direction for the work that needs to be completed.
  • Assess not only whether all procedures were followed but also how improvements to procedures could have prevented the incident
  • A work plan referencing all the Issues, Epics, and/or involved MRs must be created and used to identify the scope of work for the FCL. The work plan itself should be an Issue or Epic.
  • Daily - add an update comment in your FCL issue or epic using the template:
    • Exec-level summary
      • Target End Date
      • Highlights/lowlights
  • Add an agenda item in the SaaS Availability weekly standup and summarize status each week that the FCL remains open.
  • Hold a synchronous closing ceremony upon completing the FCL to review the retrospectives and celebrate the learnings.
    • All FCL stakeholders and participants shall attend or participate async. Managers of the groups participating in the FCL, including Sr. EMs and Directors should be invited.
    • Agenda includes reviewing FCL retrospective notes and sharing learnings about improving code change quality and reducing risk of availability.
    • Outcome includes handbook and GitLab Docs updates where applicable.

Scope of work during FCL

After the Incident Review is completed, the focus of the team(s) is on preventing similar problems from recurring and improving detection. This should include, but is not limited to:

  • Address immediate corrective actions to prevent incident reoccurrence in the short term
  • Introduce changes to reduce incident detection time (improve collected metrics, service level monitoring, which users are impacted)
  • Introduce changes to reduce mitigation time (improve rollout process through feature flags, and clean rollbacks)
  • Ensure that the incident is reproducible in environments outside of production (Detect issues in staging, increase end-to-end integration test coverage)
  • Improve development test coverage to detect problems (Harden unit testing, make it simpler to detect problems during reviews)
  • Create issues with general process improvements or asks for other teams

Examples of this work include, but are not limited to:

  • Fixing items from the Incident Review which are identified as causal or contributing to the incident.
  • Improving observability
  • Improving unit test coverage
  • Adding integration tests
  • Improving service level monitoring
  • Improving symmetry of pre-production environments
  • Improving the GitLab Performance Tool
  • Adding mock data to tests or environments
  • Making process improvements
  • Populating their backlog with further reliability work
  • Security work
  • Improve communication and workflows with other teams or counterparts

Any work for the specific team kicked off during this period must be completed, even if it takes longer than the duration of the FCL. Any work directly related to the incident should be kicked off and completed even if the FCL is over. Work paused due to the FCL should be the priority to resume after the FCL is over. Items created for other teams or on a global level don’t affect the end of the FCL.

A stable counterpart from Infrastructure will be available to review and consult on the work plan for Development Department FCLs. Infrastructure FCLs will be evaluated by an Infrastructure Director.

Engineering Performance Indicator process

The Product Analytics team is responsible for maintaining Engineering Performance Indicators. Work regarding KPI / RPI is tracked using the Product Analytics task intake tracker.

Manual verification

We manually verify that our code works as expected. Automated test coverage is essential, but manual verification provides a higher level of confidence that features behave as intended and bugs are fixed.

We manually verify issues when they are in the workflow::verification state. Generally, after you have manually verified something, you can close the associated issue. See the Product Development Flow to learn more about this issue state.

We manually verify in the staging environment whenever possible. In certain cases we may need to manually verify in the production environment.

If you need to test features that are built for GitLab Ultimate then you can get added to the issue-reproduce group on production and staging environments by asking in the #development Slack channel. These groups are on an Ultimate plan.

Critical Customer Escalations

We follow the process below when existing critical customer escalations require immediate scheduling of bug fixes or development effort.

Requirements for critical escalation

  • Customer is in critical escalation state
  • The issues escalated have critical business impact to the customer, determined by Customer Success and Support Engineering leadership
    • Failure to expedite scheduling may have cascading business impact to GitLab
  • Approval from a VP from Customer Success AND a Director of Support Engineering are required to expedite scheduling

Process

  • The issue priority is set to ~"priority::1" regardless of severity
  • The label ~"critical-customer-escalation" is applied to the issue
  • The issue is scheduled within 1 business day
    • For issues of type features, approval from the Product DRI is needed.
  • The DRI or their delegate provides daily process updates in the escalated customer Slack channel

DRI

  • If the issue is of type bug, the DRI is the Director of Development
  • If the issue is of type feature, the DRI is the Director of Product
  • If the issue requires Infrastructure work, the DRI is the Engineering Manager in Infrastructure

The DRI can use the customer critical merge requests process to expedite code review & merge.
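
The labelling and DRI rules above can be summarized in a small lookup. This is an illustrative sketch only: the function name and the returned dictionary shape are hypothetical, and the source does not say which DRI rule wins when an issue both has a type and requires Infrastructure work, so the sketch assumes the Infrastructure rule takes precedence.

```python
def escalation_plan(issue_type: str, requires_infrastructure: bool = False) -> dict:
    """Return the labels, DRI, and approval need for a critical customer escalation."""
    if issue_type not in ("bug", "feature"):
        raise ValueError("issue_type must be 'bug' or 'feature'")

    # ASSUMPTION: Infrastructure work takes precedence over the issue type.
    if requires_infrastructure:
        dri = "Engineering Manager in Infrastructure"
    elif issue_type == "bug":
        dri = "Director of Development"
    else:
        dri = "Director of Product"

    return {
        "labels": ["priority::1", "critical-customer-escalation"],
        "dri": dri,
        # Issues of type feature need approval from the Product DRI before scheduling.
        "needs_product_dri_approval": issue_type == "feature",
        "schedule_within_business_days": 1,
    }


print(escalation_plan("feature"))
```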

Pairing Engineers on priority::1/severity::1 Issues

In most cases, a single engineer and maintainer review are adequate to handle a priority::1/severity::1 issue. However, some issues are highly difficult or complicated. Engineers should treat these issues with a high sense of urgency. For a complicated priority::1/severity::1 issue, multiple engineers should be assigned based on the level of complexity. The issue description should include the team member and their responsibilities.

Team Member | Responsibility
Team Member 1 | Reproduce the Problem
Team Member 2 | Audit Code Base for other places where this may occur

If we have cases where three or five or X people are needed, Engineering Managers should feel the freedom to execute on a plan quickly.

Following this procedure will:

  • Decrease the time it takes to resolve priority::1/severity::1 issues
  • Allow for a smooth handover of the issue in case of OOO or End of the Work Day
  • Provide support for Engineers if they are stuck on a problem
  • Provide another set of eyes on topics with high urgency or on security-related fixes

Internal Engineering handbook

There are some engineering handbook topics that are internal only. These topics can be viewed by GitLab team members in the engineering section of the internal handbook.


Architecture

Complexity at Scale

As GitLab grows, through the introduction of new features and improvements on existing ones, so does its complexity. This effect is compounded by the care and feeding of a single codebase that supports the wide variety of environments in which it runs, from small self-managed instances to large installations such as GitLab.com. The company itself adds to this complexity from an organizational perspective: hundreds of employees worldwide contribute in one way or another to both the product and the company, using GitLab.com on a daily basis to do their job. Team members in Engineering are directly responsible for the codebase and its operation, for the infrastructure powering GitLab.com, and for the support of customers running self-managed instances. Likewise, team members in the Product organization chart the future of the product.

Core Development Department

Vision

Our goal is not merely to launch features, but to ensure they land successfully and provide real value to our customers. We strive to develop a best-in-class product that exceeds expectations across all user groups by meeting high-quality standards while ensuring reliability and maintaining an ease of operation and scalability to meet diverse customer needs. All team members should remain mindful of our target customers and the multiple platforms we support in everything we do.

Cross Functional Prioritization

Overview

The Cross-Functional Prioritization framework exists to give everyone a voice within the product development quad (PM, Development, Quality, and UX). By doing this, we are able to achieve and maintain an optimal balance of new features, security fixes, availability work, performance improvements, bug fixes, technical debt, etc. while providing transparency into prioritization and work status to internal and external stakeholders so they can advocate for their work items. Through this framework, team members will be able to drive conversations about what’s best for their quad and ensure there is alignment within each milestone.

CTO Leadership Team

The CTO Leadership Team is composed of the CTO’s direct reports and the Office of the CTO (OCTO).

Office of the CTO (OCTO)

The OCTO is composed of the CTO, the Engineering EBAs, the CTO’s People Business Partners, and the CTO’s Director of Strategy and Operations. This team works to amplify the CTO’s reach, vision, and mission. They work together to deliver programs and results across the entire Engineering Division.

Deployments and Releases

Overview and terminology

This page describes the deployment and release approach used to deliver changes to users. The overall process consists of two significant parts:

  1. Monthly self-managed release: GitLab version (XX.YY.0) published every month. From this monthly release, planned patches are scheduled twice a month and unplanned critical patches are created as needed.
  2. GitLab.com deployment: A Continuous Delivery process to deploy branches created from the master branch at regular intervals.

For more details on the individual processes and how to use them please see the Deployments page for GitLab.com changes and the Releases page for changes for self-managed users.

Developer Onboarding

Awesome! You're about to become a GitLab developer! Here you'll find everything you need to start developing.

Development

Engineering Career Development

The Three Components of Career Development

There are three important components of developing one’s career:

Structure

Team members who are (or want to be) on track for promotion should be engaged in a career coaching conversation with their manager. Some basic information about this process can be found in the People Ops handbook. Specific coaching plan templates are listed here to help start the conversation:

We want to build these documents around the career matrix for Engineering. Since this career matrix is still being developed, these documents are currently based on the job family requirements.

Engineering Communication

Communication

GitLab Engineering values clear, concise, transparent, asynchronous, and frequent communication. Here are our most important modes of communication:

As part of a fully-distributed organization such as GitLab, it is important to stay informed about engineering-led initiatives. We employ multimodal communication, which describes the minimum set of communication channels we’ll broadcast to.

The Engineering Division has a Google Group, engineering@gitlab.com (internal only), that all members of the division should join as part of the onboarding process. If this is not the case for you, reach out to your manager. As GitLab, the company, primarily communicates via Slack, use this list mainly for Access Control to Google Drive/Docs/Sheets/Slides.

Engineering Demo Process

Occasionally, it is useful to set up a demo on a regular cadence to ensure cross-functional iterative alignment. This is helpful for high-impact deliverables that require integration across multiple functional teams. This is in line with the seventh principle of the Agile Manifesto: “Working software is the primary measure of progress”.

Demo script

For multi-person groups or critical projects, we use a heavier weight grading process:

  1. The demo owner identifies the outcome of the demo based on the business criteria. This can be an engineering manager, a product manager or someone who is a business stakeholder of the outcome.
  2. The demo owner breaks down the outcome into smaller pieces, aligning with functional areas (tracks) and structured in procedural flow. This will later be captured as demo steps.
    • List each step, however small it might look to expose implicit dependencies.
  3. The demo owner identifies a functional team leader as a DRI for each demo track. The DRI for each track is responsible for demoing each track to completion.
  4. The demo owner collaborates with functional team leaders to populate the demo steps in a scorecard. Here is the demo scorecard template. To use this template:
    • Copy the template and rename it to the initiative/deliverable.
    • Clear the scores in the scorecard sheet.
    • Populate the demo tracks and demo steps.
    • Note: Here is an example of a populated demo scorecard.
  5. The demo owner identifies a demo grader to hold grading accountability. This can be the demo owner or someone who is familiar with the product domain and customers’ use cases. It is important that the demo grader is someone who can advocate for the success of our end users.

Demo scheduling

  1. Once the script is finalized, the demo owner schedules a recurring recorded meeting for the demo with target end date.
  2. Demo owner & demo grader must be present in every demo to ensure accountability. Assign delegates appropriately for one-off unavailability.
  3. Create an agenda document where each participant can take notes, in addition to the scorecard.
  4. The audience is the key business stakeholder of the demo deliverables & the product group team (Development, UX, Quality, Product).
  5. Meeting should be kept to 30 minutes. The emphasis should be on the product requirements & acceptance criteria.
  6. The demo gets kicked off and each demo track iterates each week on the progress until completion.
  7. Live streaming or uploading to GitLab Unfiltered channel is optional. Please abide by our SAFE guidelines if you choose to do so.

Demo grading

The demo grader grades each step during the demo meeting. To make it less subjective, we use a scale that is widely understood and communicated. Our scoring definitions are as follows:

Engineering Error Budgets

The error budget provides a clear, objective metric that determines how unreliable the service is allowed to be within a single quarter.
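
As a rough illustration of the idea, the sketch below converts an availability target into an allowed amount of unreliability for a quarter. It is a time-based simplification under stated assumptions: the 99.95% target, the 90-day quarter, and the function names are hypothetical, and the handbook's actual error budget calculation may be defined differently.

```python
def error_budget_minutes(slo_target: float, days_in_quarter: int = 90) -> float:
    """Return the minutes of unreliability allowed in a quarter for a given target.

    `slo_target` is a fraction, for example 0.9995 for 99.95%.
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    total_minutes = days_in_quarter * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, spent_minutes: float, days_in_quarter: int = 90) -> float:
    """Return the budget left after `spent_minutes`; a negative value means it is exceeded."""
    return error_budget_minutes(slo_target, days_in_quarter) - spent_minutes


# ASSUMPTION: 99.95% is an illustrative target, not a documented GitLab value.
print(round(error_budget_minutes(0.9995), 1))
print(round(budget_remaining(0.9995, spent_minutes=20.0), 1))
```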

Engineering Fellow Shadow

GitLab engineers: work with an Engineering Fellow for a week

Engineering Function Performance Indicators

Executive Summary

KPI | Health | Status
Engineering Handbook MR Rate | Okay | Above target
Engineering Team Member Retention | Okay | Above target, constant trend
Engineering Vacancy Time to Fill | Attention | Trending up; need to coach hiring managers to lean in while recruiting rebuilds

Key Performance Indicators

Engineering Handbook MR Rate

The handbook is essential to working remotely, to maintaining our transparency, and to recruiting successfully. Our processes are constantly evolving and we need a way to make sure the handbook is being updated at a regular cadence. This is measured by the number of merge requests over time that update handbook content related to the Engineering Division.

Engineering Hiring

Overview

Hiring is a cornerstone of success for our engineering organization, contributing to our growth and our ability to drive results for our customers. As such, it’s not just a responsibility but fundamental to every engineer’s contribution to GitLab. It should be deeply ingrained in every engineer’s role at GitLab, regardless of their seniority.

By actively participating in recruitment efforts, engineers help shape their team culture, elevate technical standards, and ensure a continuous influx of diverse perspectives and skillsets. Contributing to hiring efforts allows GitLab to grow responsibly and affects our collective success within Engineering.

Engineering IC Leadership

Engineering IC Leadership at GitLab: going beyond Senior level

At GitLab, it is expected that everyone is a manager of one. For Individual Contributors (IC) a new type of challenge begins with the Staff Engineer role. Engineering IC Leadership is an alternative career path to Engineering Management.

Just like moving into management, moving from Senior to Staff changes the day-to-day work and the expectations placed on ICs.

Engineering IC Leaders exert technical leverage in their scope of influence. Like any other leadership role, the focus should be on helping others to improve. Their impact multiplies with every person they help grow, and the company gets more value when they’re not investing time in doing things themselves.

Engineering Management

How Engineering Management Works at GitLab

At GitLab, we promote two paths for leadership in Engineering. While there is a healthy degree of overlap between these two ideas, it is helpful and efficient for us to specialize training and responsibility for each of:

While technical leadership tends to come naturally to software engineers, professional leadership can be more difficult to master.

Engineering Mentorship

Mentorship, Coaching and Engineering Programs

Line Managers and Senior Individual Contributors

The PlatoHQ Program has a total of 10 Engineering Managers and Senior ICs participating. The program consists of both self-learning via an online portal and 1-1 sessions with a mentor.

Senior Leaders in Engineering

The 7CTOs Program is run with 4 Senior Leaders in Engineering. The program consists of peer mentoring sessions (forums) and effective network building.

Engineering Projects
Name Location
AI Gateway gitlab-org/modelops/applied-ml/code-suggestions/ai-assist
AI Gateway Helm Chart gitlab-org/charts/ai-gateway-helm-chart
Analytics Helm Charts gitlab-org/analytics-section/product-analytics/helm-charts
Analytics Manager gitlab-org/analytics-section/analytics-manager
Analytics Stack gitlab-org/analytics-section/product-analytics/analytics-stack
Auto Build Docker Image cluster-integration/auto-build-image
Auto Deploy Docker Image cluster-integration/auto-deploy-image
Autoscaler Custom Executor driver for GitLab Runner gitlab-org/ci-cd/custom-executor-drivers/autoscaler
Buyer Experience gitlab-com/marketing/digital-experience/buyer-experience
GitLab Helm Repository charts/charts.gitlab.io
Chef configuration management gitlab-com/gl-infra/chef-repo
Cloud Deploy gitlab-org/cloud-deploy
Managed Cluster Applications Docker Image cluster-integration/cluster-applications
Cloud Native GitLab containers gitlab-org/build/CNG
Configuration Management gitlab-com/gl-infra/config-mgmt
Container Registry gitlab-org/container-registry
Cookbook Omnibus GitLab gitlab-org/cookbook-omnibus-gitlab
CustomersDot (Subscription Portal) gitlab-org/customers-gitlab-com
Data Infrastructure gitlab-data/data-image
GitLab Data Chatops gitlab-data/chatops
Declarative Policy gitlab-org/ruby/gems/declarative-policy
Pajamas Design System gitlab-org/gitlab-services/design.gitlab.com
Dev On-Call tool gitlab-com/dev-on-call
Devkit gitlab-org/analytics-section/product-analytics/devkit
discussion-automation gitlab-org/secure/pocs/discussion-automation
Distribution team issue tracker gitlab-org/distribution/team-tasks
dri gitlab-org/ruby/gems/dri
duo-ui gitlab-org/duo-ui
Duo Workflow Executor gitlab-org/duo-workflow/duo-workflow-executor
Duo Workflow Service gitlab-org/duo-workflow/duo-workflow-service
Engineering productivity infrastructure gitlab-org/quality/engineering-productivity-infrastructure
Dedicated Environment Automation gitlab-com/gl-infra/gitlab-dedicated
Fargate gitlab-org/ci-cd/custom-executor-drivers/fargate
Fleeting gitlab-org/fleeting/fleeting
Fleeting Plugin AWS gitlab-org/fleeting/fleeting-plugin-aws
Fleeting Plugin Azure gitlab-org/fleeting/fleeting-plugin-azure
Fleeting Plugin Google Compute gitlab-org/fleeting/fleeting-plugin-googlecompute
Fleeting Plugin Static gitlab-org/fleeting/fleeting-plugin-static
Git gitlab-org/git
Gitaly gitlab-org/gitaly
GitHost.io gitlab-com/githost
GitLab gitlab-org/gitlab
GitLab Advanced SAST gitlab-org/security-products/analyzers/gitlab-advanced-sast
GitLab Agent for Kubernetes cluster-integration/gitlab-agent
Helm chart for GitLab Agent for Kubernetes charts/gitlab-agent
GitLab Agent for Kubernetes CI image cluster-integration/gitlab-agent-ci-image
gitlab-build-images gitlab-org/gitlab-build-images
GitLab Cloud Native Helm Chart charts/gitlab
gitlab-cli gitlab-org/cli
GitLab Cloud Connector gitlab-org/cloud-connector/gitlab-cloud-connector
GitLab.com COGS gitlab-cog
GitLab.com - infrastructure Terraform files gitlab-com/gitlab-com-infrastructure
GitLab.com - runbooks gitlab-com/runbooks
GitLab Components components
GitLab Compose Kit gitlab-org/gitlab-compose-kit
GitLab Contributors gitlab-com/gitlab-contributors
GitLab.com - infrastructure node provisioning by role gitlab-cookbooks
GitLab Dangerfiles gitlab-org/ruby/gems/gitlab-dangerfiles
Data Utils gitlab-data/gitlab-data-utils
GitLab Design gitlab-org/gitlab-design
GitLab Development Kit gitlab-org/gitlab-development-kit
GitLab Docs gitlab-org/gitlab-docs
GitLab Elasticsearch Indexer gitlab-org/gitlab-elasticsearch-indexer
GitLab Environment Toolkit gitlab-org/gitlab-environment-toolkit
gitlab-eslint-config gitlab-org/gitlab-eslint-config
GitLab Experiment gitlab-org/ruby/gems/gitlab-experiment
GitLab Exporter gitlab-org/gitlab-exporter
GitLab Figma Plugin gitlab-org/gitlab-figma-plugin
GitLab FOSS gitlab-org/gitlab-foss
GitLab GLFM Markdown gitlab-org/ruby/gems/gitlab-glfm-markdown
gitlab-gollum-lib gitlab-org/gollum-lib
GitLab Ingress NGINX gitlab-org/cloud-native/charts/gitlab-ingress-nginx
GitLab Duo Plugin for JetBrains gitlab-org/editor-extensions/gitlab-jetbrains-plugin/
GitLab License gitlab/gitlab-license
GitLab Language Server gitlab-org/editor-extensions/gitlab-lsp/
GitLab MailRoom gitlab-org/ruby/gems/gitlab-mail_room
GitLab Markup gitlab-org/gitlab-markup
GitLab Observability backend gitlab-org/opstrace/opstrace
GitLab Observability UI gitlab-org/opstrace/opstrace-ui
GitLab Omnibus Builder gitlab-org/gitlab-omnibus-builder
GitLab Operator gitlab-org/cloud-native/gitlab-operator
GitLab Operator v2 gitlab-org/cloud-native/operator
GitLab Orchestrator gitlab-org/gitlab-orchestrator
GitLab Pages gitlab-org/gitlab-pages
GitLab Performance Tool gitlab-org/quality/performance
GitLab QA gitlab-org/gitlab-qa
Backup/restore procedures gitlab-restore
GitLab Roulette gitlab-org/gitlab-roulette
GitLab RSpec Profiling Statistics gitlab-org/rspec_profiling_stats
GitLab Runner gitlab-org/gitlab-runner
GitLab Shell gitlab-org/gitlab-shell
GitLab Sketch Plugin gitlab-org/gitlab-sketch-plugin
GitLab Styles gitlab-org/ruby/gems/gitlab-styles
GitLab SVGs gitlab-org/gitlab-svgs
techtask gl-technical-interviews/backend/template/techtask
GitLab Triage gitlab-org/ruby/gems/gitlab-triage
gitlab-ui gitlab-org/gitlab-ui
gitlab-vim gitlab-org/editor-extensions/gitlab.vim
GitLab Extension for Visual Studio gitlab-org/editor-extensions/gitlab-visual-studio-extension/
GitLab VS Code extension gitlab-org/gitlab-vscode-extension
gitlab-web-ide gitlab-org/gitlab-web-ide
gitlab-web-ide-vscode-fork gitlab-org/gitlab-web-ide-vscode-fork
GitLab Workhorse gitlab-org/gitlab-workhorse
GitLab Workspaces Proxy gitlab-org/remote-development/gitlab-workspaces-proxy
GitLab Zoekt Helm Chart gitlab-org/cloud-native/charts/gitlab-zoekt
GitLab Zoekt Indexer gitlab-org/gitlab-zoekt-indexer
ZOQL gitlab-org/ruby/gems/gitlab-zoql
GitLab Data gitlab-data/analytics
GitLab Kramdown gitlab-org/gitlab_kramdown
GitLab Quality Test Tooling gitlab-org/ruby/gems/gitlab_quality-test_tooling
CIS GitLab Benchmark Scanner gitlab-org/govern/compliance/engineering/cis/gitlabcis
gitlabktl gitlab-org/gitlabktl
Gitter webapp gitlab-org/gitter/webapp
GLQL Rust gitlab-org/gitlab-query-language/glql-rust
Go MimeDB gitlab-org/go-mimedb
grape-path-helpers gitlab-org/grape-path-helpers
Graphql Ruby gitlab-org/ruby/gems/graphql-ruby
GitLab Runner Infrastructure Toolkit gitlab-org/ci-cd/runner-tools/grit
Group Conversations gitlab-org/group-conversations
gRPC gitlab-org/ruby/gems/grpc
helm-install-image gitlab-org/cluster-integration/helm-install-image
HTTP Router gitlab-org/cells/http-router
Iglu gitlab-org/iglu
Infrastructure Management gitlab-com/gl-infra/infra-mgmt
GitLab.com - infrastructure issue tracker gitlab-com/infrastructure
internal-handbook internal-handbook/internal.gitlab.com
Distribution issue bot gitlab-org/distribution/issue-bot
k3s GitLab CI cluster-integration/test-utils/k3s-gitlab-ci
k8s-agent-qa gitlab-org/configure/k8s-agent-qa
Kubernetes workloads GitLab.com gitlab-com/gl-infra/k8s-workloads/gitlab-com/
Kubernetes workloads Helm release configuration gitlab-com/gl-infra/k8s-workloads/gitlab-helmfiles
Kubernetes deployments using Tanka gitlab-com/gl-infra/k8s-workloads/tanka-deployments
Kubernetes GitLab Demo gitlab-org/kubernetes-gitlab-demo
labkit gitlab-org/labkit
labkit-ruby gitlab-org/labkit-ruby
LicenseFinder gitlab-org/ruby/gems/LicenseFinder
Marketing Operations gitlab.com/gitlab-com/marketing/marketing-operations
Marketo Tools gitlab-com/marketo-tools
Nesting gitlab-org/fleeting/nesting
Omnibus GitLab gitlab-org/omnibus-gitlab
OS images for MacOS build cloud gitlab-org/ci-cd/shared-runners/images/macstadium/orka
Package Hunter gitlab-org/security-products/package-hunter
Package Hunter CLI gitlab-org/security-products/package-hunter-cli
pages.gitlab.io pages/pages.gitlab.io
Compensation Calculator gitlab.com/gitlab-com/people-group/peopleops-eng/compensation-calculator/
PeopleOps Employment Automation gitlab-people-engineering/employment-automation
Nominator gitlab.com/gitlab-com/people-group/peopleops-eng/nominatorbot
Splinter PTO gitlab.com/gitlab-com/people-group/peopleops-eng/splinter
pipeline-validation-service gitlab-org/modelops/anti-abuse/pipeline-validation-service
GitLab Project Templates gitlab-org/project-templates
prometheus-client-mmap gitlab-org/prometheus-client-mmap
Google Protobuf gitlab-org/ruby/gems/protobuf
Public Container Image Archive Registry gitlab-org/public-image-archive
Common CI for QA pipelines gitlab.com/gitlab-org/quality/pipeline-common
Quality SSH Tunnel gitlab.com/gitlab-org/quality/ssh-tunnel
IpynbDiff /gitlab-org/incubation-engineering/mlops/rb-ipynbdiff
Release CLI gitlab-org/release-cli
GitLab Release Tools gitlab-org/release-tools
GitLab Runner Releaser gitlab-org/ci-cd/runner-tools/releaser
GitLab Runner Releases gitlab-org/ci-cd/runner-tools/releases
RemoteOnly.org gitlab-com/www-remoteonly-org
Renovate GitLab Bot gitlab-org/frontend/renovate-gitlab-bot
Repository X-Ray gitlab-org/create-stage/code-creation/repository-x-ray
Rouge rouge-ruby/rouge
GitLab Runner Helm Chart gitlab-org/charts/gitlab-runner
Runner Incept gitlab-org/ci-cd/tests/runner-incept
GitLab Runner Operator for Kubernetes gitlab-org/gl-openshift/gitlab-runner-operator
GitLab Runner UBI Offline Build gitlab-org/ci-cd/gitlab-runner-ubi-images
Runway Provisioner gitlab-com/gl-infra/platform/runway/provisioner
Runway Reconciler gitlab-com/gl-infra/platform/runway/runwayctl
SAST Scanner Service gitlab.com/gitlab-org/secure/sast-scanner-service
Sec Section Dangerbot gitlab-org/security-products/danger-bot
sectypes gitlab-org/security-products/analyzers/sectypes
Dependency Scanning - Gemnasium analyzer gitlab-org/security-products/analyzers/gemnasium
Dependency Scanning - Gemnasium Database gitlab-org/security-products/gemnasium-db
Static Analysis - Bandit analyzer gitlab-org/security-products/analyzers/bandit
Static Analysis - Brakeman analyzer gitlab-org/security-products/analyzers/brakeman
Code Quality gitlab-org/ci-cd/codequality
Static Analysis - Eslint analyzer gitlab-org/security-products/analyzers/eslint
Static Analysis - Flawfinder analyzer gitlab-org/security-products/analyzers/flawfinder
Static Analysis - Gosec analyzer gitlab-org/security-products/analyzers/gosec
Static Analysis - KICS analyzer gitlab-org/security-products/analyzers/kics
Static Analysis - Kubesec analyzer gitlab-org/security-products/analyzers/kubesec
Static Analysis - MobSF analyzer gitlab-org/security-products/analyzers/mobsf
Static Analysis - Nodejs-Scan analyzer gitlab-org/security-products/analyzers/nodejs-scan
Static Analysis - PHPCS-Security-Audit analyzer gitlab-org/security-products/analyzers/phpcs-security-audit
Static Analysis - PMD Apex analyzer gitlab-org/security-products/analyzers/pmd-apex
Static Analysis - Semgrep Rules gitlab-org/security-products/sast-rules
Static Analysis - Security-Code-Scan analyzer gitlab-org/security-products/analyzers/security-code-scan
Static Analysis - Semgrep analyzer gitlab-org/security-products/analyzers/semgrep
Static Analysis - Sobelow analyzer gitlab-org/security-products/analyzers/sobelow
Static Analysis - Spotbugs analyzer gitlab-org/security-products/analyzers/spotbugs
Static Analysis - Tracking analyzer gitlab-org/security-products/post-analyzers/tracking-calculator
Secret Detection - Secrets analyzer gitlab-org/security-products/analyzers/secrets
Secure Analyzers gitlab-org/security-products/analyzers
API Security gitlab-org/security-products/analyzers/api-fuzzing-src
Browser-based DAST Engine gitlab-org/security-products/analyzers/browserker
Secure Command analyzer gitlab-org/security-products/analyzers/command
Secure Report analyzer gitlab-org/security-products/analyzers/report
Static Analysis - Ruleset analyzer gitlab-org/security-products/analyzers/ruleset
Container Scanning gitlab-org/security-products/analyzers/container-scanning
Coverage Fuzzer gitlab-org/security-products/analyzers/gitlab-cov-fuzz-src
Dynamic Application Security Testing (DAST) gitlab-org/security-products/dast
Dependency Scanning gitlab-org/security-products/analyzers/dependency-scanning
License Database - Advisory Processor gitlab-org/security-products/license-db/advisory-processor
License Database - Deployment gitlab-org/security-products/license-db/deployment
License Database - License Exporter gitlab-org/security-products/license-db/license-exporter
License Database - License Feeder gitlab-org/security-products/license-db/license-feeder
License Database - License Interfacer gitlab-org/security-products/license-db/license-interfacer
License Database - License Processor gitlab-org/security-products/license-db/license-processor
License Database - Schema gitlab-org/security-products/license-db/schema
Static Application Security Testing (SAST) gitlab-org/security-products/sast
Operational Container Scanning - Trivy K8S Wrapper gitlab-org/security-products/analyzers/trivy-k8s-wrapper
Security Report Schemas gitlab-org/security-products/security-report-schemas
Security Report Schemas Ruby gitlab-org/security-products/security-report-schemas-ruby
Semver Dialects gitlab-org/ruby/gems/semver_dialects
GitLab Sidekiq Reliable Fetcher gitlab-org/sidekiq-reliable-fetch
Siphon gitlab-org/analytics-section/siphon
Snowflake Spend dbt Package gitlab-data/snowflake_spend
Snowplow Micro Configuration gitlab-org/snowplow-micro-configuration
GitLab Status Page gitlab-org/status-page
Step Runner gitlab-org/step-runner
Switchboard gitlab-com/gl-infra/gitlab-dedicated/switchboard
takeoff gitlab-org/takeoff
Tamland gitlab-com/gl-infra/tamland
Tanuki Emoji gitlab-org/ruby/gems/tanuki_emoji
Tanukidesk gitlab-com/marketing/developer-relations/community-advocacy/tanukidesk
Taskscaler gitlab-org/fleeting/taskscaler
Images for using Terraform in GitLab CI gitlab-org/terraform-images
GitLab internal terraform modules gitlab-com/gl-infra/terraform-modules
GitLab Terraform Provider gitlab-org/terraform-provider-gitlab
test_file_finder gitlab-org/ruby/gems/test_file_finder
Topology Service gitlab-org/cells/topology-service
Topology Service Deployer gitlab-org/cells/topology-service-deployer
GitLab triage operations gitlab-org/quality/triage-ops
GitLab University gitlab-org/university
Accessibility gitlab-org/ci-cd/accessibility
version.gitlab.com gitlab-org/gitlab-services/version.gitlab.com
www-gitlab-com gitlab-com/www-gitlab-com

AI Gateway

AI Gateway for GitLab Duo features.

Engineering Secondments
Learn about GitLab's secondment program for external engineers.
Engineering Team Readmes
Engineering Workflow
This document explains the workflow for anyone working with issues in GitLab Inc.
Expansion Development Department

Vision

Scale and develop our diverse, global team to drive results that support our product and customer growth, while maintaining our values and unique way of working.

Mission

GitLab’s unique way of working asynchronously, handbook first, using the product we develop, and with clear focus on our values enables very high productivity. In delivering on growth, we maintain our values and ways of working while developing team members and increasing the diversity of our team. We focus on constantly improving usability and reliability of our product to reach maximum customer satisfaction. Community contributions and customer interactions rely on efficient and effective communication. We are a data-driven, customer experience first, open core organization delivering one secure, reliable, world leading DevOps platform.

Fast Boot

A Fast Boot is an event that gathers the members of a team or group in one physical location to work together and bond in order to accelerate the formation of the team or group so that they reach maximum productivity as early as possible.

History of the Fast Boot

  • The first Fast Boot took place in December 2018. The 13 members of Monitor Group gathered for 3 days to work and bond in Berlin. You can learn more by reading the original planning issue.
  • The second Fast Boot took place in April 2019. The 5 members of the Delivery team gathered in Utrecht to bond and to work on finalising the auto-deployment process. The planning issue contains the proposal for the Fast Boot, and the Delivery Fast Boot epic contains issues and links to recordings created during the Fast Boot.
  • The third Fast Boot took place in Vancouver in September 2019. It included 18 people from Product, Engineering, UX and Data from the Acquisition, Conversion, Expansion and Retention teams. The planning issue contains the proposal for Fast Boot, and outcomes are available in the Growth Fast Boot Page.

Why should you have a Fast Boot?

Right now, the Fast Boot is intended for new teams or for teams with a majority of new members who need to build their culture of shipping work. If your team fits this description, you can propose holding a Fast Boot to reduce ramp-up time and to establish and strengthen relationships between team members.

Frontend Group

Teams

Frontend domain experts

You can find engineers with expertise in various frontend domains on the engineering projects page under the following sections:

You can reach out to these experts to get help on:

  • discussing and defining the architecture of complex frontend features.
  • frontend technical topics like Vue, GraphQL, CSS, testing, tooling, etc.
  • proposing changes to the cross-domain frontend architecture via an RFC.
  • questions about the frontend for a product area like design management, merge requests, pipelines, etc.

Frontend group calls

The frontend group has scheduled weekly calls every Tuesday. Since 2021-06-01, these occur at three staggered, time zone friendly times, repeating every three weeks. During these calls, team members are encouraged to share information that is worth communicating to other members synchronously (e.g. a new documentation change, or new breaking changes added to master).

FY25 Engineering get-together subsidy

Background

As part of the FY25-Q2 Engagement Survey Results & Action Planning, we identified Team Member Development & Engagement as an area to focus on. One of the actions we took was to identify a way to provide Engineering get-togethers for an increased sense of belonging.

After looking at different possibilities based on budget we were able to provide a subsidy in FY25 to facilitate these get-togethers, both in an in-person format as well as virtually.

GitLab Plato HQ Mentoring Program

Program Overview

GitLab has partnered with Plato HQ for an external Mentoring Program. In this program GitLab team members select Mentors external to GitLab. Some of the other Mentoring programs we have here at GitLab are internal to GitLab. Minorities in Tech and Women in Sales are both made up of GitLab Mentors and GitLab Mentees. The external mentoring is what makes this approach to GitLab unique.

For more information on mentoring best practice, visit Mentoring.

GitLab Repositories

GitLab consists of many subprojects. A curated list of GitLab projects can be found at the GitLab Engineering projects page.

Creating a new project

When creating a new project, please follow these steps:

  1. Read and familiarize yourself with our stance on Dogfooding. Be aware that, as part of a product development organization that builds a tool for people like us, our default is to add features and tooling to the GitLab project. This is still true when the effort to do so is 2-5x. If you still feel you need to create a project outside of GitLab, you must follow this process to document the decision.

Guidelines for automation and access tokens
Guidelines for automation with project/group tokens or service accounts
Incident

Definition of an Incident

The definition of “incident” can vary widely among companies and industries. Here at GitLab, incidents are anomalous conditions that result in — or may lead to — service degradation, outages, or other disruptions. These events require human intervention to avert disruptions, communicate status, restore normal service, and identify future improvements. Incidents are always given immediate attention.

Incident Management

Incident Management is the process of responding to, mitigating, and documenting an incident. At GitLab, we approach Incident Management as a feedback loop with the following steps, with different teams adjusting them as needed:

Infrastructure
The Infrastructure Department is responsible for the availability, reliability, performance, and scalability of GitLab.com and other supporting services
Infrastructure and Quality department

Vision

Our vision is to be a world-class Infrastructure & Tools department that enables GitLab to meet & exceed our customers’ needs.

We:

  1. Build critical infrastructure, metrics & tools that enable GitLab Engineering & Product teams to do their best work efficiently and ship high-quality & reliable products to our customers.
  2. Are customer focused. We have an ambitious drive to attain high availability & reliability for SaaS platforms and self-managed customers.
  3. Provide and maintain best practice tools and methodologies that create a platform for engineering teams to do their work productively.
  4. Enable GitLab Engineering & Product teams to run services effectively using our tools, to meet business needs & SLOs.

Direction

Direction is set within the Infrastructure and Quality direction pages. With the ongoing consolidation of the departments, separate direction pages will become obsolete.

Infrastructure Platforms
The Infrastructure Platforms department is responsible for the availability, reliability, performance, and scalability of GitLab.com and other supporting services
Joint R&D OKR Process

R&D OKR Overview

This page provides an overview of the joint R&D OKR workflow. All departments within R&D, which includes the Product and Engineering Divisions, collaborate by following this guidance. For clarifications on the OKR process, team members can post in Slack #product or #engineering-fyi.

Timeline and process for OKRs

The OKR process is designed to tie in to the overall OKR process the company uses. That process is driven largely by the dates of the Key Review meetings, so the Product process keys off those dates as well. As a result, dates will not necessarily align with the start of a fiscal quarter.

Monitoring of GitLab.com

GitLab.com Service Availability

The calculation methodology for GitLab.com Service Availability is defined in the monitoring policy.

More details on the definitions of outage and degradation are on the incident-management page.

Historical Service Availability

Year Month Availability Comments
2024 September 99.85%
2024 August 100.00%
2024 July 99.99%
2024 June 99.99%
2024 May 100.00%
2024 April 99.96%
2024 March 100%
2024 February 99.86%
2024 January 100%
2023 December 99.99%
2023 November 99.99%
2023 October 99.89% Oct 30 Sev 1
2023 September 99.98%
2023 August 100%
2023 July 99.78% Two severity 1 incidents contributed to ~94% of service disruption. 2023-07-07, 2023-07-14
2023 June 100%
2023 May 99.92%
2023 April 99.98%
2023 March 99.99%
2023 February 99.98%
2023 January 99.80%
2022 December 100%
2022 November 99.86%
2022 October 100%
2022 September 99.98%
2022 August 99.92%
2022 July 99.95%
2022 June 99.96%
2022 May 99.99%
2022 April 99.98%
2022 March 99.91%
2022 February 99.87%
2022 January 99.95%
2021 December 99.96%
2021 November 99.71%
2021 October 99.98%
2021 September 99.85%
2021 August 99.86%
2021 July 99.78%
2021 June 99.84%
2021 May 99.85% does not include manual adjustment for PostgreSQL 12 Upgrade
2021 April 99.98%
2021 March 99.34%
2021 February 99.87%
2021 January 99.88%
2020 December 99.96%
2020 November 99.90%
2020 October 99.74%
2020 September 99.95%
2020 August 99.87%
2020 July 99.81%
2020 June 99.56%
2020 May 99.58%
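
To put these percentages in perspective, an availability figure can be converted into approximate minutes of disruption. The sketch below is illustrative only: it assumes availability is a plain fraction of wall-clock time in a calendar month, whereas the actual methodology is defined in the monitoring policy and may differ. The function name `downtime_minutes` is hypothetical.

```python
import calendar


def downtime_minutes(year: int, month: int, availability_pct: float) -> float:
    """Approximate minutes of disruption implied by a monthly availability figure.

    Assumption: availability is a simple fraction of wall-clock time in the month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# September 2024 has 30 days; 99.85% comes from the table above.
print(round(downtime_minutes(2024, 9, 99.85), 1))  # → 64.8
```

Under this assumption, 99.85% in a 30-day month corresponds to a little over an hour of disruption.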

These videos provide examples of how to quickly identify failures, defects, and problems related to servers, networks, databases, security, and performance.

On-Call

Expectations for On-Call

  • If you are on call, then you are expected to be available and ready to respond to PagerDuty pages as soon as possible, and within any response times set by our Service Level Agreements in the case of Customer Emergencies. If you have plans outside of your workspace during your on-call shift, this may require that you bring a laptop and reliable internet connection with you.
  • We take on-call seriously. There are escalation policies in place so that if a first responder does not respond in time, another team member is alerted. Such policies are not expected to be triggered under normal operations, and are intended to cover extreme and unforeseeable circumstances.
  • Because GitLab is an asynchronous workflow company, @mentions of On-Call individuals in Slack will be treated like normal messages, and no SLA for response will be associated with them.
  • Provide support to the release managers in the release process.
  • As noted in the main handbook, after being on-call, make sure that you take time off. Being available for issues and outages can be taxing, even if you had no pages. Resting after your on-call shift is critical for preventing burnout. Be sure to inform your team of the time you plan to take for time off.
  • During on-call duties, it is the team member’s responsibility to act in compliance with local rules and regulations. If ever in doubt, please reach out to your manager and/or aligned People Business Partner.

Customer Emergency On-Call Rotation

  • We do 7 days of 8-hour shifts in a follow-the-sun style, based on your location.
  • After 10 minutes, if the alert has not been acknowledged, support management is alerted. After a further 5 minutes, everyone on the customer on-call rotation is alerted.
  • All tickets that are raised as emergencies will receive the emergency SLA. The on-call engineer’s first action will be to determine if the situation qualifies as an emergency and work with the customer to find the best path forward.
  • After 30 minutes, if the customer has not responded to our initial contact with them, let them know that the emergency ticket will be closed and that you are opening a normal priority ticket on their behalf. Also let them know that they are welcome to open a new emergency ticket if necessary.
  • You can view the schedule and the escalation policy on PagerDuty. You can also opt to subscribe to your on-call schedule, which is updated daily.
  • After each shift, if there was an alert or incident, the on-call person will send a handoff email to the next on-call person explaining what happened and what is ongoing, pointing to the relevant issues and their progress.
  • If you need to reach the current on-call engineer and they’re not accessible on Slack (e.g. it’s a weekend, or the end of a shift), you can manually trigger a PagerDuty incident to get their attention, selecting Customer Support as the Impacted Service and assigning it to the relevant Support Engineer.
  • See the GitLab Support On-Call Guide for a more comprehensive guide to handling customer emergencies.
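
The escalation timing described in the list above can be sketched as a small lookup. This is an illustration, not the actual PagerDuty configuration: the function name and the group labels are hypothetical, and only the 10-minute and further 5-minute thresholds come from the text. Treating the thresholds as inclusive is also an assumption.

```python
def alerted_groups(minutes_unacknowledged: float) -> list[str]:
    """Return who has been alerted for an unacknowledged customer emergency page."""
    groups = ["on-call engineer"]  # paged immediately
    if minutes_unacknowledged >= 10:  # after 10 minutes without acknowledgement
        groups.append("support management")
    if minutes_unacknowledged >= 15:  # after a further 5 minutes
        groups.append("everyone on the customer on-call rotation")
    return groups


print(alerted_groups(12))  # → ['on-call engineer', 'support management']
```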

GitLab.com Reliability On-Call Rotation

Infrastructure Engineer On-Call

The Infrastructure department’s SREs provide 24x7 on-call coverage for the production environment. For details, please see incident-management.

Open Source at GitLab

We believe in Open Source

As a company, GitLab is dedicated to open source. Not only do we believe in it, but we use it, and we give back to it. Not just through GitLab, but through contributions to other open source projects.

The purpose of this page is to document how a GitLab employee can:

  • Create an open source project on behalf of GitLab
  • Contribute to a third-party open source project on behalf of GitLab
  • Use third-party open source code in a GitLab project

Growth Strategy

As an open source project, we want to stay healthy and be open for growth, but also be ready to accommodate a community 10x its current size. To achieve that, we’ve outlined a strategy that is a collaboration between multiple departments.

Performance

Performance Facets

We categorize performance into 3 facets:

  1. Backend
  2. Frontend
  3. Infrastructure

Backend performance

Backend performance is scoped to response time of API, Controllers and command line interfaces (e.g. git).

DRI: Tim Zallman, VP of Engineering, Core Development.

Performance Indicators:

Frontend performance

Frontend performance is scoped to response time of the visible pages and UI components of GitLab.

DRI: Tim Zallman, VP of Engineering, Core Development

Policies related to GitLab.com

The handbook pages nested under the “policies” directory are controlled documents, and follow a specific set of requirements to satisfy various regulatory obligations.

Avoid nesting non-controlled documentation at this location.

Quality Department
The Quality Department in Engineering Division
R&D Tax Credits

GitLab submits applications for R&D Tax Credits in a number of jurisdictions that implement reimbursement schemes for research and development. A subject-matter expert (SME) from engineering is appointed to each application to assist with data collection. A third-party tax agent prepares and submits the report. SMEs are usually Engineering Managers or Directors and located in, or with reasonable knowledge of, the jurisdiction under application.

Role of the SME

The role of the SME is twofold:

Recognition in Engineering

Engineering Quarterly Achievers

Quarterly, CTO Leadership will recognize Engineering team members who have excelled in a given quarter. Recognition includes:

  • an invitation to the Engineering Quarterly Achievers Chat
  • participation in the Engineering Quarterly Achiever’s Recognition Dinner: an expensed meal for yourself, friends, and family to celebrate your work. The meal must occur before the last day of the quarter following the announcement. Winners each quarter have until the last day of the quarter to submit for reimbursement. Winners may submit their receipt for the meal for reimbursement via Navan. Please see the instructions below.

In Navan:

  1. Click Add Transaction.
  2. Select Upload receipt (or select Type in details).
  3. Under the Expense Type field, select “Team events & meals”.
  4. Under Classification, select “FY25 Team Building”.
  5. Under the Description field, include this link: https://handbook.gitlab.com/handbook/engineering/recognition/#engineering-quarterly-achievers-recognition-dinner
  6. Click Submit (or Save & close if you need to come back to add more information).

Releases

Overview and terminology

This page describes the processes used to release packages to self-managed users.

Monthly self-managed release

A GitLab version (XX.YY.0) is published every month. From this monthly release, planned and unplanned critical patch releases are created as needed.

Our maintenance policy describes in detail the cadence of our major, minor, and patch releases for self-managed users. The yearly cadence for major releases was defined after an all-stakeholder discussion.

Self-managed overview

The self-managed release is a semver-versioned package containing changes from many successful deployments on GitLab.com. Users on GitLab.com therefore receive features and bug fixes earlier than users of self-managed installations.
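
As a minimal sketch of the XX.YY.0 convention described above, the following Python snippet parses a version string and tells a monthly release (patch component 0) apart from a patch release. The helper names and the sample version numbers are illustrative assumptions and are not part of GitLab's release tooling.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    """A MAJOR.MINOR.PATCH version; tuples compare component by component."""
    major: int
    minor: int
    patch: int


_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' string (hypothetical helper)."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def is_monthly_release(version: Version) -> bool:
    """A monthly release has the form XX.YY.0, so its patch component is 0."""
    return version.patch == 0
```

Because `Version` is a tuple of integers, comparisons are numeric rather than lexical, so `parse_version("17.10.0")` sorts after `parse_version("17.9.3")`.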

Root Cause Analysis

At GitLab, transparency is one of our core values, as it helps create an open and honest working environment and service, which in turn accelerates growth and innovation. We treat a root cause analysis (RCA) as an opportunity to be transparent within our organization and community by investigating what went well and what didn’t after working on a project, incident, or issue. This page defines an RCA, the benefits of completing one, and how to complete a successful RCA here at GitLab.

Starting new teams

Our product offering is growing rapidly. Occasionally we start new teams. Backend teams should map to our product categories. Backend teams also map 1:1 to product managers.

A dedicated team needs certain skills and a minimum size to be successful. But that doesn’t block us from taking on new work. This is how we iterate our team size and structure as a feature set grows:

  1. Existing Team: The existing PM schedules issues for the most appropriate existing engineering team
    • If there is a second PM for this new feature, they work through the first PM to preserve the 1:1 interface
  2. Shared Manager Team: Dedicated engineer(s) are identified on existing teams and given a specialty
    • The manager must do double-duty
    • Their title can reflect both specialties of their engineers e.g. Engineering Manager, Distribution & Package
    • Even if temporary, managing two teams is a valuable career opportunity for a manager looking to develop director-level skills
    • Each specialty can have its own process, for example: Capitalized team label, Planning meetings, Standups
  3. New Dedicated Team:
    • Engineering Manager
    • Senior/Staff Engineer
    • Two approved full-time vacancies
    • A dedicated PM

Team Construction

Generally, engineering teams at GitLab are fullstack: they are made up of Frontend, Backend, and Fullstack individual contributors with a single Engineering Manager.

Unplanned Upgrade Stop Workflow

An unplanned upgrade stop is disruptive for customers because it requires them to perform a rollback and additional maintenance work to complete the upgrade. Unplanned stops should be treated as incidents. The process below outlines the different stages of the incident resolution process and the steps to be taken by the corresponding teams and Directly Responsible Individuals (DRIs).

High-level workflow:

  1. Detect unplanned upgrade stop: Identify instances of unplanned upgrade stops.
  2. Resolve upgrade bug: Backport the fix or update the upgrade path to include the new stop.
  3. Perform Unplanned Upgrade Stop Root Cause Analysis: Understand why the stop occurred and prevent future incidents.

What is an unplanned upgrade stop?

An unplanned upgrade stop happens when we fail to communicate the necessity of this upgrade stop in our upgrade path. For more information, read what an unplanned upgrade stop is.
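
To illustrate why a missing stop matters, here is a minimal sketch of how an upgrade path could be derived from a list of required stops. The function name, the `(major, minor)` version representation, and the sample stop versions are all hypothetical; they do not reflect GitLab's actual upgrade path data or tooling.

```python
from typing import Iterable, List, Tuple

MinorVersion = Tuple[int, int]  # (major, minor), an assumed representation


def upgrade_path(
    current: MinorVersion,
    target: MinorVersion,
    required_stops: Iterable[MinorVersion],
) -> List[MinorVersion]:
    """Return the versions to install, in order, to move from current to target.

    Every required stop strictly between current and target is visited
    before the target itself.
    """
    if target < current:
        raise ValueError("downgrades are not supported in this sketch")
    if target == current:
        return []
    stops = sorted({s for s in required_stops if current < s < target})
    return stops + [target]


# Hypothetical documented stops.
documented = [(16, 3), (16, 7), (16, 11)]
path_before = upgrade_path((16, 0), (17, 5), documented)

# A newly discovered stop is added to the upgrade path.
path_after = upgrade_path((16, 0), (17, 5), documented + [(17, 3)])
```

In this sketch, `path_before` skips `(17, 3)` because that stop was never communicated; once the upgrade path is updated to include it, `path_after` visits it before reaching the target.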

Volunteer Coaches for URGs

Pilot Program Overview

This program allows team members at GitLab to volunteer and donate their time and technical skills (such as programming or Linux administration) to provide knowledge, support, and coaching to members of underrepresented groups (URGs) in the technology industry. The hope is that we can help people who have been denied opportunity, for whatever reason, and who want to get their first job in the technology industry.

This program is in pilot as of November 1, 2020. Please reach out to the contacts below if you are interested in taking part.

Last modified October 29, 2024: Fix broken links (455376ee)