Data Team - How We Work

GitLab Data Team Workflow

Quick Links

Practical guide to contributing to the Data Team Projects

Looking to get hands-on with data at GitLab? Check out our Practical Guide designed for all team members.

This guide complements our existing resources, but with a more practical focus on the step-by-step process of contributing to the Data Team’s projects.

How We Work

We’re happy to help you achieve your goals with Data. As a central shared service with finite time and capacity and with a responsibility to operate and develop the company’s central Enterprise Data, the Data Team must focus its time and energy on initiatives that will yield the greatest positive impact to the overall global organization towards improving customer results.

Work Categorization and Prioritization

The Data Team strives to spend the majority of its time developing and operating the Enterprise Data Platform and related systems, keeping fresh data flowing through the system, regularly expanding the breadth of data available for analysis, and delivering high-impact strategic projects. We categorize our work using the framework outlined below.

Work Category	Description	Prioritization Method
Production Maintenance	Activities required to maintain efficient and reliable data services, including triage, bug fixes, and patching to meet established Service Level Objectives.	Based on severity and impact
Data Team OKRs	The Data Team identifies strategic-level OKRs in collaboration with partner teams each quarter.	Prioritized through the Data Steering Committee and committed to during our quarterly planning process
Business Operations	Data work and business engagement that is requested on an ad-hoc basis throughout the quarter. Includes foundational work to mature the Enterprise Data Ecosystem.	Prioritized on an ongoing basis and committed to during our iteration planning process.

The allocation of capacity across Production Maintenance, Data Team OKRs, and Business Operations will be determined by each pillar on a quarter-by-quarter basis in the planning file. The target allocation varies by data team pillar and needs of the business. The allocation will consider the amount of Production Maintenance support needed from the pillar, strategic initiatives that the pillar needs to support, and business operations projects that require support. Within each Work Category, issues are prioritized independently and can use the scoped Priority Label with priorities 1, 2, or 3 as options.

We use scoped labels in GitLab to track and prioritize our issues across these work categories.

In addition to the above work categories focused on operating and developing the Enterprise Data Platform and related systems, the Enterprise Data Team spends time each quarter on learning and experimentation. This time is used to learn new skills, experiment with new technologies, and improve the data program. These learning and experimentation issues are prioritized between the team member and their manager while considering Individual Growth Plans and ways to improve the data program.

Project Intake

Here’s the process to follow to create a new Data issue:

Open a New Issue in the Data Team Analytics Project.
Choose a Template from the table below that best matches the work you are requesting.
Complete the templated Description as completely as possible.
Leave the Assignees blank. The Data Team will process your request as a part of our Daily Triage.

Backlog Management and Prioritization

The Data Team backlog consists of all issues that the Data team could work on, but have not yet started development work on. This includes issues in the workflow::2 - waiting for prioritization, workflow::3 - refinement, and workflow::4 - ready to develop stages. These issues have been validated during triage and determined to warrant development.

Backlog Definition

What constitutes the backlog:

This includes issues in the workflow::2 - waiting for prioritization, workflow::3 - refinement, and workflow::4 - ready to develop stages
Issues that have been triaged and scoped with sufficient detail
Work that has been determined to provide business value and warrants development resources

What is not part of the backlog:

Issues still in workflow::1 - triage & validation (not yet validated)
Issues in workflow::5 - development or beyond (development work has already started)
Issues marked as won’t-fix or closed during triage

Prioritization Framework

Strategic Projects (Data Team OKRs)

Characteristics:

Large-scale initiatives aligned with company objectives
Require significant cross-functional coordination
Typically span (multiple) quarter(s) or have substantial scope
Should be labeled as Data Team OKRs

Scheduling Process:

Scheduled quarterly through Data Steering Committee collaboration
Reviewed and prioritized during quarterly planning cycles
Require Opportunity Canvas documentation for evaluation
Subject to Data Steering Committee review and approval

Business Operations Projects

Characteristics:

Smaller, tactical improvements supporting day-to-day operations
Can typically be completed within shorter timeframes (weeks to months)
Support specific business partner needs or operational efficiency
May be reactive to urgent business requirements

Scheduling Process:

Scheduled on an ongoing basis by Data Team Members.
- Teams may use the quarter:: label, which indicates team-prioritized work for the current quarter. When available for new issues, team members should pick from quarterly-labeled issues first. This label is not a final commitment and issues may have this label added/removed throughout the quarter as new information/priorities arise.
Coordinated with relevant Business Partners when appropriate
Can be prioritized and picked up as team member capacity allows
Should align with team member expertise and current workload

Data Team Member Empowerment

Data Team Members are empowered to make scheduling decisions independently for Business Operations projects, following GitLab’s Manager of One principle. Team members should:

Assess their current availability and workload - Review ongoing commitments and capacity
Evaluate business impact and urgency - Consider both immediate needs and strategic value
Make autonomous decisions on what work to prioritize and pick up next
Communicate decisions transparently - Update issue assignments and stakeholders promptly
Consult with their manager when needed - Seek guidance for complex decisions or when additional support is required

Dynamic Prioritization and Workload Management

Priorities can shift based on changing business needs, urgent issues, or new strategic initiatives. Our approach to backlog management recognizes this reality and empowers team members to adapt dynamically while maintaining transparency and accountability.

When higher priority work emerges (either by identifying by themselves, by Business Partners or their manager), team members should pause ongoing initiatives in favor of more critical items. However, the approach to priority changes differs based on the type of work:

“Just Do It Now” Approach:**

For tasks that are simple and direct enough to be completed immediately with little disruption to other work or need guidance from other team members, team members are empowered to use the “just do it now” approach. This builds on our empowerment principles and provides a streamlined way to address straightforward issues, helping to reduce backlog buildup. Examples include quick documentation updates, simple configuration changes, or minor bug fixes.

KRs and Strategic Projects: For KRs, which are generally larger multi-quarter spanning work, priority changes should require broad alignment. KRs typically require contributions from multiple teams, and if one team member gets pulled into higher priority work, it must be discussed based on the overall impact to the KR and other team members’ work. Before pausing or deprioritizing KR-related work, team members should consult with their manager and relevant stakeholders to assess the broader implications.

Business Operations Projects: Prioritization flexibility works well for business operations initiatives. When higher priority work emerges, team members can pause ongoing Business Operations projects, move existing issues back to the backlog (workflow::2 - waiting for prioritization), inform stakeholders of the change and provide updated timelines, then unassign yourself from paused work and update labels appropriately.

This approach ensures that stakeholders always have visibility into what’s happening and why, even when priorities shift, while recognizing the different coordination requirements for different types of work.

Team members should avoid carrying too many simultaneous initiatives, but what constitutes “too many” depends on several factors.

Individual team member experience and capacity
Complexity and type of work
Current business priorities and deadlines
External dependencies and collaboration requirements

Rather than imposing rigid rules, we trust team members to use their judgment and consult with their manager when uncertain about their capacity. This approach aligns with GitLab’s Manager of One principle.

Assignment Meaning and Expectations

When a Data Team Member assigns themselves to an issue, it indicates:

Ownership of the issue through to completion
Responsibility for proactive communication and regular updates
Commitment to deliver within reasonable and communicated timeframes
Authority to make technical decisions related to the work

Backlog Visibility and Management

Labeling and Tracking

Proper labeling and assignment of issues serves as the foundation for our backlog visibility and team coordination.

The current backlog size becomes immediately visible through all issues marked with workflow::2 - waiting for prioritization, workflow::3 - refinement, and workflow::4 - ready to develop, including a size estimate. This gives us a clear picture of validated work that’s ready to be picked up, helping both team members identify their next priorities and managers understand the volume of committed work ahead. Similarly, active work is represented by issues assigned to team members and progressing through development stages, providing insight into what’s currently being delivered.

Labeling helps identifying potential bottlenecks before they become problems. When we can see team members with excessive concurrent assignments, it becomes possible to redistribute work, provide additional support, or help prioritize competing demands. This visibility is crucial for maintaining sustainable work practices and preventing burnout.

Monitoring and Metrics

Team leads and managers should regularly review:

Backlog size and aging of issues
Team member workload distribution
Time spent on strategic vs. operational work
Stakeholder satisfaction with response times

New Issue

Request Type	Issue Template To Choose
General Data Team Support	`[Request] Create Standard Data Team Issue` (note: this is the default, most commonly used template for P3 / tactical requests and Data Team support. If you are not sure where to start, start here)
OKR-Level Project Request	`[Request] Create Opportunity Canvas` (note: this template is required for large-scale, P2 work like OKRs to be reviewed and prioritized in the Data Steering Committee)
Add Data Source	`[Request] New Data Source`
Data Quality Issue	`[Report] Data Quality Problem`

Data Steering Committee

The Data Steering Committee includes representation from partner teams across GitLab (Marketing, Sales, Customer Success, Finance, IT, Support, Product, Engineering, People, Security, Legal) and is used to oversee and drive the strategic direction of GitLab data management and analytics initiatives, including project prioritization. The forum ensures that data is leveraged effectively to support business goals, improve decision-making processes, and drive innovation. It acts as a governing body to establish policies, standards, and best practices for data governance, data quality, data privacy, and data security.

In order for OKRs / projects to be prioritized through the Data Steering Committee, an opportunity canvas is required. An opportunity canvas is a specific issue template that contains detailed information about the work that is being requested, the expected business impact from that work, a rough estimate of the level of effort to accomplish the work, and known risks/dependencies. The opportunity canvas also includes a business value score based on our Value Calculator, which is one factor in prioritizing and ranking our backlog of work.

In the instance that a new project is raised mid-quarter and is proposed to be prioritized sooner than the next quarterly planning cycle, the forum will review the Opportunity Canvas for the new project, determine if the business value and impact warrants re-prioritization, and will come to a decision on the necessary trade-offs (i.e. to prioritize that new work, other planned work must be deprioritized).

Request to Expedite Responses

Requests to expedite responses, triage issues, or MR reviews are rare. Given the Data Team’s shared-service model, expediting one task necessarily de-prioritizes other work. To request an expedited response:

Confirm there is a valid reason for moving your request ahead of others.
Post a note to #data, along with a link to your Issue and a reason why you need an expedited response. Please do not DM an individual on the Data Team directly.
A member of the Data Team will respond within 1 business day.

Deciding What And How To Build

Not all data solutions require the same level of quality, scalability, and performance so we have defined a Data Development framework to help match required outcomes with level of investment. The Data Team works with all teams to build solutions appropriate to the need, but focuses on Trusted Data using Trusted Data Development.

Design Spike

Experimentation is a great approach to performing Explorational Data Development. Oftentimes, there can be multiple solutions to solve a business problem and we need a process to efficiently evaluate the different approaches and ultimately select a solution to promote to a Trusted Data solution. We use Design Spikes to facilitate experimenting. Design Spikes are particularly useful when the proposed solutions result in breaking changes or significant changes to the overall design, structure, and computing of the data tech stack.

The below steps should be followed when performing a Design Spike:

Calculate Value: Establish the Value the data solution can provide GitLab. Value can be measured in a variety of ways, ranging from efficiency, to increased Sales, to reduced compute.
Define Requirements: Create a Requirements document for the Design Spike to define the business and technical requirements the data solution must meet to be successful. Indicate whether each requirement is a Must Have or Nice to Have.
Experimentation Design: This is the most complex part of the Design Spike. We need to decide how to test the data solutions versus defined requirements. Oftentimes, successfully testing data solutions requires simulating production workloads. We need to define the minimal viable sample size of data to include in the design to achieve experimentation results that would be representative of the whole data set in scope for the solution.
Perform the experiment: This involves putting “fingers on the keyboard” and standing up a data solution in the Explorational data development environment. It also involves collecting the results of the experiment pursuant to what has been defined in the Define Requirements phase.
Assessment: After the experimentation is complete, we should perform an analysis of how well each data solution performed against the requirements from the Define Requirements phase and see how well each data solution performed against the Must Have or Nice to Have requirements. We should put this analysis into a document along with a recommended solution based on the results of the Design Spike. We can then submit this to the respective DRIs and Stakeholders on the project for review, feedback, and final decisions on how we will proceed with the use case.

Documentation

The Data Team, like the rest of GitLab, works hard to document as much as possible. We believe this framework for types of documentation from Divio is quite valuable. For the most part, what’s captured in the handbook are tutorials, how-to guides, and explanations, while reference documentation lives within in the primary analytics project. We have aspirations to tag our documentation with the appropriate function as well as clearly articulate the assumed audiences for each piece of documentation.

Documentation guideline

There are several types of documentation we use to capture the topics. Noted the criteria when to use which type of the documentation

(Public) Handbook - Items related to the operational model we used in the company and in the team, all explanations about tools, technologies and processes and why we are doing. Data that is publicly shareable, and does not expose GitLab or its customers to any harm or material impact (Green data, as per Data classification).
Internal handbook - Items which explain the same category as the public handbook, with the difference that the internal handbook contains internal information
Readme.md file - Specific information related to the code where the README.md file resides, which explains how to use that code. If more explanation is needed, a good practice is to either use and/or link to a Handbook article.
Runbooks - Context which explains how to solve the issue in production or how to sort out other problems. The vital thing is to understand that runbook is a guideline of problem-solving approach
WIKI - Items that require regular updates, similar to epics but they span for longer periods of time, e.g. overall Data Team engagement in the quote-to-cash projects.

Matrix with the explanation when to use which documentation type:

Example	Appropriate documentation type
GitLab Duo explanation	Handbook
Python/dbt/Snowflake guideline	Handbook
Description of the package inventory	Handbook
Explanation about the new pipeline/project	Internal Handbook
Data classification description	Internal Handbook
dbt Data lineage diagram	Internal Handbook
Technical explanation of how to run the project	README.md
Basic context about the project from the technical perspective	README.md
Solution of how to fix the Triage issue	Runbooks
Exploration article (ie. Design spike) of how to pseudonimize the data	Runbooks
Review current progress or effort on QtC projects	WIKI

Data Team Value Calculator

The Value Calculator provides a uniform and transparent mechanism for ranking and enables all work to be evaluated on equal terms. The value calculator approach is similar to the RICE Scoring Model for Product Managers and the Demand Metric Prioritization Model for Marketing.

The calculator below is based on the following Value Calculator spreadsheet. Please select the values below to define the value of new work.

Quarterly and Iteration Planning

Our planning process is called the Planning Drumbeat and it encompasses Quarterly Planning and Iteration Planning. The Planning Drumbeat is one of the most important activities the Data Team performs because it helps us align our work with the broader company, while remaining agile enough to manage shifting business priorities.

Quarterly KR Status Reporting

Beginning in FY25-Q1, the Data Team is using the GitLab Objectives and Key Results project to manage quarterly commitments.

Process:

Create KRs in the project for each of the committed Key Results.
In the KR description, add a link to the corresponding Epic from the GitLab Data Team project where the development work is being tracked.
Throughout the quarter, the DRI for the workstream should make the following updates (at a minimum, these updates should be added at the end of each month in the quarter for the KRs that have the Division:: scoped label applied to them; some teams may choose to make updates more frequently):
- Add a comment outlining what work has been completed, and what work is remaining to complete the KR.
- Update the % complete field on the KR.
- Update the Health Status field to indicate whether the KR is On Track, Needs Attention, or At Risk.

DataPulse

DataPulse is a monthly rhythm for team members to provide status updates on Key Results and Business Operations initiatives. The goal is for everyone to stay informed about project statuses, identify new opportunities, and maintain alignment across our various workstreams.

Our process will have three key components:

1. Async written updates about the status of the Key Result and Business Operations

Monthly updates, captured in the Key Result or Business Operations issues and epics (this part we are already doing and will not change):

Status of project work
Challenges and proposed solutions

2. Asynchronous Video Updates (Due 1 week before Monthly Q&A)

For Key Result and Business Operations DRIs Each DRI (and team) will record a brief, maximum 5-minute, video update covering:

Intro/context about the business problem that we addressing in the Key Result or Business Operations
Status update
Challenges and proposed solutions
Business stakeholder feedback and engagement
New opportunities identified

Within your KR and Business Operations teams, please develop a plan for these monthly updates. As the DRI, you’re accountable for aligning on this approach with your project team.

Recording Guidelines:

These videos don’t need to be perfect - aim for a conversational tone.
Consider using time in an existing meeting with the KR team to create this recording. No slides / materials are required.
Include mentions of team members who are part of your work.
Upload your video to our Data Team YouTube playlist one week before our scheduled Q&A session. Include the month and year in the title of the recording and add a link to the video in the Key Result or Business Operations epic/issue.
This is your chance to share your work with team members who you don’t collaborate with on a regular basis - take advantage of it!

3. Monthly Q&A Session (6am PST / 2pm PST options)

We’ll meet for a focused Q&A session to discuss the updates shared in the videos. This allows us to:

Dive deeper into specific areas of interest
Address questions that arise from the video updates
Collaborate on resolving challenges
Align on next steps

Introducing a new data source

Introducing a new data source requires a heavy lift of understanding that new data source, mapping field names to logic, documenting those, and understanding what issues are being delivered. Usually introducing a new data source is coupled with replicating an existing dashboard from the other data source. This helps verify that numbers are accurate and the original data source and the data team’s analysis are using the same definitions.

Incidents

Incidents are anomalous conditions that result in—or may lead to—service degradation, data quality issues, or system outages that require immediate human intervention to prevent disruptions or restore operational status. For data systems, this includes:

Data unavailability or inaccessibility
Data inaccuracy or corruption
Data freshness violations (stale/outdated data)
Unauthorized data exposure or leakage

Incident Criteria: Not all technical failures constitute incidents. The key criteria are:

Whether there is an SLO breach per SLOs standards
Whether there is immediate impact on downstream models, business operations or data consumers
Whether immediate action is needed to resolve the incident

The following events typically qualify as incidents; others may also apply:

Infrastructure outages: Snowflake, Tableau, or other critical platform unavailability
Source system failures: External data sources stopped or corrupted
Access issues: Multiple Users unable to can login into our Data Platform systems to reach essential
Source freshness failures SLO breaches
dbt model failures When (downstream) data models are impacted
dbt test failures Indicating data quality issues
Incorrect data Report shows incorrect data

Incident Criteria. Not all technical failures constitute incidents: The key criteria are:

Whether there is immediate impact on data, downstream models, business operations or data consumers
Whether there is an SLO breach per SLOs standards
Whether immediate action is needed to resolve the incident

For detailed incident definitions, severity levels, and creation procedures, see the Data Team Incident Management handbook page.

Process for working through incidents

Open an Incident issue using the “Incident Report” template. Alternatively, you can convert an existing issue into an incident if it meets the incident criteria. (Note: this can be done by adding a new line and typing /incident in the respective issue).
- Note: Incidents related to the Data Program are always created in the Analytics project to maintain centralized incident tracking and management.
Detail the relevant information with appropriate timestamps and severity level.
Tag and assign people on the Data Team and any other teams that need to be informed.
For infrastructure issues requiring SRE collaboration, also use incident.io while maintaining a corresponding GitLab incident.

Data Team Incidents can be reviewed in Incident Overview page within the Analytics project.

Workflow Summary

Stage (Label)	Responsible	An Item Is Added to This Stage When	Criteria to Progress to Next Stage
`workflow::1 - triage & validation`	Data Triager	A new request has been created	A clear problem statement & business value statement and expected outcome are included in the issue, and appropriate labels (Work Category, Champion, and Team) have been applied. The issue has a numerical weight applied by the Data Triager. If the work does not warrant development, a description for why the work won’t be done will be added, and the issue will be closed.
`workflow::2 - waiting for prioritization`	Data	The issue is scoped, sized and warrants development.	Work is prioritized and implementation timelines are agreed upon.
`workflow::3 - refinement`	Data, Business DRI	The issue is actively being refined	The technical solution have enough detail and clarity that (another) developer (other than the one doing the refinement) would be able to pick it up.
`workflow::4 - ready to develop`	Data	The issue is fully scoped & refined	Work is picked up for development
`workflow::5 - development`	Data	Development work has started	Item is actively being worked on.
`workflow::6 - review`	Data, Business DRI	Development work is ready for, or currently, being reviewed	All work is completed.
`workflow::X - blocked`	Data, Business DRI	Issue needs intervention that assignee can’t perform	Work is no longer blocked

Generally issues should move through this process linearly. Some templated issues will skip from triage to scheduling or scheduled.

Issue Pointing

Issue pointing captures the complexity of an issue, not the time it takes to complete an issue. That is why pointing is independent of who the issue assignee is.

Refer to the table below for point values and what they represent.
We size and point issues as a group.
Effective pointing requires more fleshed out issues, but that requirement shouldn’t keep people from creating issues.
When pointing work that happens outside of the Data Team projects, add points to the issue in the relevant Data Team project and ensure issues are cross-linked.

Weight	Description
Null	Meta and Discussions that don’t result in an MR
0	Should not be used.
1	The simplest possible change including documentation changes. We are confident there will be no side effects.
2	A simple change (minimal code changes), where we understand all of the requirements.
3	A typical change, with understood requirements but some complicating factors
5	A more complex change. Requirements are probably understood or there might be dependencies outside the data-team.
8	A complex change, that will involve much of the codebase or will require lots of input from others to determine the requirements.
13	It’s unlikely we would commit to this in an iteration, and the preference would be to further clarify requirements and/or break into smaller Issues.

Issue Labeling

Think of each of these groups of labels as ways of bucketing the work done.

All issues should get the following classes of labels assigned to them:

Team: The Data Team responsible for performing the work. This could be one of the following: Data Platform, Analytics Engineering, Data Science, BI, or Data Governance.
Champion: The team requesting the work. This may be a functional partner team or the Data Team itself.
Workflow: The current status of the work. All issues should start with the workflow::1 - triage label. The team performing the work is responsible for updating the workflow status as the work progresses.
Priority: The priority level of the work, categorized as follows:
- P1: Operational (highest urgency)
- P2: OKR-related work (objective-driven)
- P3: Other (lower priority or non-urgent tasks)

Effective in January 2025, we use a Bot on the Data Team project to check that the Team, Champion, Workflow, and Priority labels have been applied to issues after 14 days of being opened. The bot sends a reminder in the issue to add the missing labels. The first triage response for adding labels is the team member that opens the issue. The 2nd triage response for adding labels is the Data Analyst, Data Scientist, Analytics Engineer, and Data Engineer that are on triage. Issues that do not have Team, Champion, Workflow, and Priority labels applied after 30 days are automatically closed. If an issue is closed due to not having the required labels, team members have the option to reopen the closed issue and apply the labels to meet the issue refinement requirements.

Optional labels that are useful to communicate state or other priority:

What:
- Data: Data being touched (Salesforce, Zuora, Zendesk, GitLab.com, etc.)
- Tool: (Tableau, dbt, Stitch, Airflow, etc.)
- Pod: Data team pod that is scheduling the work
Business Logic Change: This label is applied for any business logic changes such as adding new dimensions, facts, marts, changing joins, adding new calculated fields.
Opportunity Canvas: This label is auto-applied on the Opportunity Canvas template, but can also be applied to work that has converted into a large-scale project. This label will be used to identify topics for discussion and prioritization at the monthly Data Steering Committee.

Epic Labeling

Similar to issue labeling, epic labeling helps the Data team categorize, quantify, and prioritize the projects in our backlog.

At a minimum, all epics should have a Team: label applied. This tags the epic for the Data Team that is primarily responsible for performing the work, and enables managers to review the backlog of projects for their respective teams. This is particularly helpful during quarterly planning.

Because the epic list cannot easily be filtered to parent epics only (and epics may be nested under other epics as sub-epics), we use an additional label to distinguish between epics that are being considered for OKR-level commitments (these should have the Opportunity Canvas label applied) vs. those that are used to group related issues under a general theme (these should not have the Opportunity Canvas label applied).

Merge Request Workflow

Ideally, your workflow should be as follows:

Confirm you have access to the analytics project. If not, request Developer access so you can create branches, merge requests, and issues.
Create an issue, open an existing issue, or assign yourself to an existing issue. The issue is assigned to the person(s) who will be doing the work.
Add appropriate labels to the issue (see above)
Open an MR from the issue using the “Create merge request” button. This automatically creates a unique branch based on the issue name. This marks the issue for closure once the MR is merged.
Push your work to the branch
Update the MR with an appropriate template. Our current templates are:
- dbt Changes - used for any change involving dbt. Analysts will most often use this one
- add_manifest_tables - for adding tables to pgp extract
- periscope - for getting a Periscope dashboard reviewed
- python_changes - for general changes to Python code
- All Other Changes - for work that doesn’t generally fall into these categories
Run any relevant jobs to the work being proposed
- e.g. if you’re working on dbt changes, run the job most appropriate for your changes. See the CI jobs page for an explanation of what each job does.
Document in the MR description what the purpose of the MR is, any additional changes that need to happen for the MR to be valid, and if it’s a complicated MR, how you verified that the change works. See this MR for an example of good documentation. The goal is to make it easier for reviewers to understand what the MR is doing so it’s as easy as possible to review.
Request a review by assigning the MR to a peer using the Merge Request Reviewer feature.
- Requesting a review in this manner indicates to the person that you would like their code review and approval if everything is good. This does not mean they will merge the MR if they approve it.
- The peer reviewer should use the native approve button in the MR after they have completed their review and approve of the changes in the MR.
- After approval, the reviewer can unassign themselves from the Reviewer list. The reviewer is not responsible for the final tasks. The author is responsible for finalizing the checklist, closing threads, removing Draft, and getting it in a merge-ready state.
If the MR is approved, remove the Draft: label, mark the branch for deletion, mark squash commits, and assign to the project’s maintainer. Ensure that the attached issue is appropriately labeled and pointed.
- Generally, assigning the MR to a maintainer indicates you would like for them to merge it if there are no issues. If the maintainer needs to approve the merge request before merge as part of a CODEOWNER group, then they will do a review before merging. Otherwise, they will simply merge. If you would like the maintainer’s review regardless, simply leave a comment to that effect.
- Note that assigning someone an MR means action is required from them.
- If Draft: is still in the title of the MR, then the Maintainer will assign the MR back to the author to confirm that the MR is ready for merge.

Other tips:

The Merge Request Workflow provides clear expectations; however, there is some wiggle room and freedom around certain steps as follows.
- For simple changes, it is the MR author who should be responsible for closing the threads. If there is a complex change and the concern has been addressed, either the author or reviewer could resolve the threads if the reviewer approves.
Reviewers should have 48 hours to complete a review, so plan ahead with the end of the iteration.
When possible, questions/problems should be discussed with your reviewer before submitting the MR for review. Particularly for large changes, review time is the least efficient time to have to make meaningful changes to code, because you’ve already done most of the work!
Consider bringing the latest commits from the primary branch so the MR is caught up. You can do this quickly by typing /rebase into a comment and GitLab will make this happen automatically, barring any merge conflicts.

KPI Development Workflow

The Data Team will work to add KPIs and Performance Indicators to our enterprise database models and BI reports once the following steps have been completed.

The DRI should ensure the KPI definition, business logic, and calculation steps are documented in the relevant section of the handbook and added to the GitLab KPIs with all of its parts
The handbook definition should be reviewed by the necessary Consulted & Informed cross-functional partners. In some cases, the definition may also require approval from those cross-functional partners
Once the KPI is ready to be added into our enterprise reporting, the DRI should create an issue using the standard Data Team Issue template on the GitLab Data Team Issue Tracker.
The Data team will verify the data sources and help to find a way to automate (if necessary).

Once the KPI has been added to our enterprise BI platform, the Data Team will present it to the DRI for testing and final approval.

SLO for Issues and Merge Requests

First-Response SLO for a new Issue: 2 business days from the time of issue creation
First-Response SLO for a new MR: 2 business days from the time of submission to the codeowner

When opening an issue or submitting a MR for review, it is good to add a comment bringing the issue or MR to the attention of a DRI or codeowner. Please allow the SLO time period to expire before requesting additional updates or first responses on an Issue or MR. Issues and MRs within the SLO time period are not considered blocked. If the issue or MR is urgent or a break-fix scenario, it is good to follow-up with the team member in the Issue or MR within the SLO period as determined by the required urgency and importance.

Removal and deletion process

After some time, environments will have software/code/components that are not needed any more. It feels risky to delete software and code, even when its not being used, seems not being used or asked not to being used (i.e. users access).

There are multiple reasons to perform deletions:

We see software (or parts of software) that is not used anymore
There is a request to remove user access
There is a request to delete a user account
There is a request to indefinitely stop a data pipeline

To address observations and requests, and ensure that deletion will take place in a controlled manner, open an issue with the Cleanup Old Tech template.

State down in the issue the scope of deletion

Write down what will be deleted and where possible link to existings issues.

Calculate a Risk score

The Risk score is build upon 2 variables.

Probability that it will break something
Impact if the deletion is executed by mistake or is execute wrongly

Each variables will be scored 1 to 3.

Probability	Score
Low	1
Medium	2
High	3

Impact	Score
Negligible	1
Lenient	2
Severe	3

Probability * Impact = Risk Score

	Negligible	Lenient	Severe
Probability
Low	1	2	3
Medium	2	4	6
High	3	6	9

Risk Score	Outcome
1 - 2	Create a MR and have it reviewed by 2 code owners
3 - 4	Create a MR, tag `@gitlab-data/engineers` with a deadline to object and have it reviewed by 2 code owners
6 - 9	Create a MR, to be discussed in the DE-Team meeting and have it reviewed by 2 code owners

Annual Data & Analytics Maturity Assessment

Each year beginning in FY24, the Data Team will facilitate a survey to solicit anonymous feedback about the Data program from GitLab team members. This survey will help drive our focus areas and track our maturity over time.

YouTube

We encourage everyone to record videos and post to GitLab Unfiltered. The handbook page on YouTube does an excellent job of telling why we should be doing this. If you’re uploading a video for the data team, be sure to do the following extra steps:

Add data as a video tag
Add it to the Data Team playlist
Share the video in #data channel on slack

Data Hiring and Interview Process

Review the People Interviewing Guide to ensure you are up-to-date on the current process and policies.
Contact your People BP (Business Partner) and alert them of the opening. The People BP is responsible for managing the job opening, requisition, and interview processes.
Ask your People BP to provide you a Requisition # for tracking purposes. If you have multiple openings to fill, communicate with your People BP using this Req# because juggling multiple openings gets confusing fast.
Develop an Interview Plan, which will cover Responsibilities, Tips, Reminders, and custom Interview questions for each Interview Job Role type. For examples, see the Data Scientist Interview Plan and Analytics Engineer Interview Plan.
Each Data Job Role and Job Grade has a customized Homework Assessment. Review and update the Homework Assessment as needed. If a Homework Assessment is not available for the Job Role, create one and save it in the Homework Assessment Google Drive
Send a Homework Assessment to the People BP for inclusion in the Hiring process. The Hiring Process is included in each Data Job Family to help set candidate and interviewer expectations. An example is the Data Engineer Hiring Process.
Create a new slack channel, e.g. bt-data-data-science-interview to help coordinate with your interviewers. Share the Interview Plan with your interviewers through Slack.

View page source - Edit this page - please contribute.