Product Data Insights

Product Data Insights Handbook

The Product Data Insights (formerly known as “Product Analysis”) group consists of a team of product analysts. This group reports to the Senior Director, Product Monetization and serves as a functional analytics team to support the GitLab Product division and product data-related analysis across GitLab.

In addition to supporting the Product division, the Product Data Insights team is an active contributor to the GitLab Data Program. As part of the Research & Development (R&D) Data Fusion Team, the product analysts work closely with members of the Enterprise Data team. The team is also part of the Functional Analytics Center of Excellence (FACE), along with other functional analytics groups across the GitLab Data Program.

Read more about what we do at GitLab on our Direction page.

Team members

Product Data Insights is a small (but mighty) team. In order to support the Product division, each analyst is assigned to one or more sections or teams.

Name	Title	Product Section or Team
Carolyn Braza	Senior Manager, Product Data Insights	Analytics
Nicole Hervas	Senior Product Analyst	CI, CD
Emma Neuberger	Senior Product Analyst	Growth, Core Platform, SaaS Platforms
Matthew Petersen	Senior Product Analyst	Dev, Data Science
Dave Peterson	Staff Product Analyst	Sec
Neil Raisinghani	Senior Product Analyst	Fulfillment, Pricing

Working with us

The Product Data Insights group works in two-week iterations, which dictate how and when we plan and prioritize work. Iterations start on Thursdays and end on Wednesdays.

You can see our current iteration here.

Issue intake

For all Product Data Insights requests, please create an issue in the Product Data Insights project, apply the Team::PDI and product data insights labels, and follow the guidelines below.

All data issues with the Team::PDI label will appear on the Product Data Insights board.

Issue templates

Please select the appropriate template based on your type of request and answer as many of the questions as you can. The more information and context we have up front, the faster we are able to triage and begin work on the issue.

Request Type	Template
Ad hoc / Default request	Ad Hoc Request
Create new or update existing PI chart	PI Chart Help
Experimentation analysis	Experiment Analysis Request
Iteration planning	Iteration Planning

Submission due date

In order to be considered for the upcoming iteration, please open all issues by EOD Monday before the next iteration begins. We understand that urgent matters come up, but please try to adhere to the submission due date for any planned work.
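
The iteration calendar described above can be sketched in a few lines of Python. This is only an illustration: the function name `iteration_dates` and the example start date are hypothetical, and the sketch assumes the submission due date is the Monday three days before the Thursday start.

```python
from datetime import date, timedelta

THURSDAY = 3  # date.weekday() counts Monday as 0


def iteration_dates(start: date) -> dict:
    """Derive key dates for a two-week iteration from its Thursday start date."""
    if start.weekday() != THURSDAY:
        raise ValueError("iterations start on a Thursday")
    return {
        "start": start,
        # Two weeks minus one day: the Wednesday that closes the iteration.
        "end": start + timedelta(days=13),
        # Assumed deadline: the Monday immediately before the iteration begins.
        "submission_due": start - timedelta(days=3),
    }


# Hypothetical example: an iteration starting Thursday, 2024-01-04.
dates = iteration_dates(date(2024, 1, 4))
print(dates["end"], dates["submission_due"])
```

An issue opened after `submission_due` would, under this reading, be considered for the following iteration unless it is urgent.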

Issue prioritization

The PDI team takes three different priorities into account when scheduling work, each captured by a scoped label:

section-priority::
pm-priority::
pdi-priority::

Section priority

Section priority is ultimately determined by section leaders and product leadership. It should align with overall product investments and business impact of the work. Analysts have default guidelines to assign section priority, which then becomes part of the prioritization discussion with section leadership.

Section priority labels will be added by product analysts or section leadership.

Label	Definition
`section-priority::1`	Work supporting high-impact, high-investment stages or groups; cross-stage analysis
`section-priority::2`	General work supporting the section; help with PI charts; etc
`section-priority::3`	Low-impact work; “nice-to-haves”

Most issues will fall under section-priority::1 and section-priority::2.

PM priority

PM (Product Manager) priority captures the relative priority of the issue compared to any other issues that the PM (or their group) has open in the backlog (if applicable). In general, issues that are more immediately actionable and have a greater impact on company KPIs should be higher in priority.

We ask that PMs apply a pm-priority:: label to issues to indicate relative priority of the request.

Label	Priority
`pm-priority::1`	High and/or urgent
`pm-priority::2`	Medium
`pm-priority::3`	Low, non-urgent

PDI priority

PDI priority is owned by the individual analyst. The Senior Manager, Product Data Insights will help refine priority based on importance and capacity. The team will work with the Senior Director, Product Monetization and/or Product leadership on trade-offs (if needed).

Section priority and PM priority are both inputs in determining PDI priority. However, the scope of the team’s work extends beyond section support (ex: cross-functional initiatives), so other considerations also factor into PDI priority.

PDI priority labels will be added by the Product Data Insights team as a part of triage and planning.

Label	Priority
`pdi-priority::1`	High / Urgent Priority: Any analysis request that must be completed within the current iteration. All Priority 1 requests should have a direct KPI and/or OKR that will be affected by the analysis.
`pdi-priority::2`	Medium Priority: This is where most requests fall. This can be any net-new analysis, reporting (dashboard creation), or exploratory analysis that is needed for decision making.
`pdi-priority::3`	Low Priority / Consultant: This is for any analysis that does not lead to an immediate, direct action and/or other low-level, non-urgent requests that can be placed in an analyst’s backlog.

Most issues will fall under pdi-priority::2 and pdi-priority::3.
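
As a rough illustration of how the three scoped labels could be read off an issue, the sketch below parses a plain list of label names. The function name `priorities` and the sample label list are hypothetical, and the code does not call the GitLab API; it only assumes labels arrive as strings in the `scope::value` form shown above.

```python
from typing import Dict, List, Optional

PRIORITY_SCOPES = ("section-priority", "pm-priority", "pdi-priority")


def priorities(labels: List[str]) -> Dict[str, Optional[int]]:
    """Return the numeric value of each priority scope, or None if it is missing."""
    found: Dict[str, Optional[int]] = {scope: None for scope in PRIORITY_SCOPES}
    for label in labels:
        # Split on the last "::" so the scope is everything before the value.
        scope, sep, value = label.rpartition("::")
        if sep and scope in found and value.isdigit():
            found[scope] = int(value)
    return found


# Hypothetical issue that has not yet received a pm-priority label.
print(priorities(["Team::PDI", "product data insights",
                  "section-priority::1", "pdi-priority::2"]))
```

A `None` value flags a priority that still needs to be set, for example a `pm-priority::` label the PM has not applied yet.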

Iteration planning

Final commitment and prioritization will occur during the iteration planning meeting, which takes place the day before an iteration begins (every other Wednesday). The team will consider new and existing issues, along with issues in progress. When selecting issues for the next iteration, the team considers the following:

Issue priority
Issue weight
Velocity
Working days
Target work breakdown

Issue priority

Analysts will use the different priority labels (as defined above) as inputs into planning.

Issue weight

Each issue is assigned a weight based on estimated time commitment.

If a single issue has a weight greater than the length of the iteration (2 weeks), it should be broken into smaller units of work. (This could also be an indicator that the issue should be converted to an epic.)

Weight Value	Estimated time to complete
1	< 1 hour
2	2 hours
3	4 hours
5	1 day
8	2-3 days
13	1 week
21	2+ weeks
34	1+ month
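
The weight table can be expressed as a simple lookup, together with a check for issues that are too large for one iteration. This is a sketch, not a team tool: the names are hypothetical, and treating weight 13 (1 week) as the largest weight that still fits in an iteration is an assumption drawn from the "2+ weeks" estimate for weight 21.

```python
# Weight values and time estimates copied from the table above.
WEIGHT_ESTIMATES = {
    1: "< 1 hour",
    2: "2 hours",
    3: "4 hours",
    5: "1 day",
    8: "2-3 days",
    13: "1 week",
    21: "2+ weeks",
    34: "1+ month",
}

# Assumption: weights above 13 exceed the 2-week iteration.
MAX_ITERATION_WEIGHT = 13


def needs_split(weight: int) -> bool:
    """Return True if an issue of this weight should be broken into smaller issues."""
    if weight not in WEIGHT_ESTIMATES:
        raise ValueError(f"unknown weight: {weight}")
    return weight > MAX_ITERATION_WEIGHT


for weight in (8, 21):
    print(weight, WEIGHT_ESTIMATES[weight], needs_split(weight))
```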

Velocity

Product Data Insights defines velocity as the amount of work (measured in issue weight) completed by the team within a given iteration. While we recognize that this is an imperfect measurement (partially-completed issues and undocumented work are not accounted for), it is a rough gauge of team output.

We aim to commit only to work we believe can be completed within the 2-week iteration. As such, we will commit to less than our recent velocity and leave a buffer to account for urgent issues and interruptions. To start, each analyst will leave a buffer of ~2 days’ worth of work (an estimate based on the recent volume of unplanned work). High-priority issues exceeding the allotted buffer will have a material impact on our ability to complete planned work, so please plan ahead if you know that you will need assistance from the Product Data Insights team.

Working days

As GitLab team members, we are encouraged to take PTO and observe public holidays in order to maintain a healthy work-life balance. Analyst capacity should be adjusted based on the number of days they are working in the iteration.
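
Combining the ~2-day buffer with the working-day adjustment, an analyst's plannable time for an iteration might be estimated as below. The function name and constants are hypothetical, and the sketch assumes a five-day work week (10 working days per iteration) and a buffer that stays fixed regardless of time off.

```python
ITERATION_WORKING_DAYS = 10  # assumption: two five-day work weeks
BUFFER_DAYS = 2              # approximate buffer for unplanned work


def plannable_days(days_off: int = 0) -> int:
    """Working days an analyst can commit to planned work in one iteration."""
    working_days = ITERATION_WORKING_DAYS - days_off
    # Never report negative capacity when time off exceeds the iteration.
    return max(working_days - BUFFER_DAYS, 0)


print(plannable_days())            # full iteration
print(plannable_days(days_off=3))  # iteration with 3 days of PTO or holidays
```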

Target work breakdown

Product Data Insights groups work into three different buckets. During planning, analysts will aim to maintain the following breakdown:

Section-level priorities: 50%
- Backlog DRI: Section leader and/or product leadership
- This work can be identified with the section-priority::1 label
Other section work: 25%
- Backlog DRI: Product analyst
- This work can be identified with the section-priority::2 or section-priority::3 labels
Cross-functional work and special projects: 25%
- Backlog DRI: Senior Manager, PDI (based on the recommendations from the product analysts)

The distribution of work will vary from iteration to iteration, but the 50/25/25 breakdown is the target state.
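
One way to compare a planned iteration against the 50/25/25 target is to sum issue weights per bucket. The sketch below makes two assumptions that the text above does not state: the breakdown is measured in issue weight, and any issue without a `section-priority::` label counts as cross-functional work. All names and the sample issues are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

TARGET = {"section-level": 0.50, "other section": 0.25, "cross-functional": 0.25}


def bucket(labels: List[str]) -> str:
    """Map an issue's labels to one of the three work buckets."""
    if "section-priority::1" in labels:
        return "section-level"
    if "section-priority::2" in labels or "section-priority::3" in labels:
        return "other section"
    # Assumption: no section-priority label means cross-functional or special project.
    return "cross-functional"


def breakdown(issues: List[Tuple[int, List[str]]]) -> Dict[str, float]:
    """Return each bucket's share of the total planned weight."""
    totals: Counter = Counter()
    for weight, labels in issues:
        totals[bucket(labels)] += weight
    total = sum(totals.values())
    return {name: (totals[name] / total if total else 0.0) for name in TARGET}


# Hypothetical plan: (weight, labels) per issue.
plan = [(8, ["section-priority::1"]), (5, ["section-priority::2"]), (3, [])]
for name, share in breakdown(plan).items():
    print(f"{name}: {share:.0%} (target {TARGET[name]:.0%})")
```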

Urgent issues

If an urgent matter comes up, please open an issue and tag your analyst contact (and/or @cbraza). Please include why the issue is urgent, when it is needed by, what it will inform or how it will be used, and who the intended audience is.

If you have not heard from the tagged analyst within 1 business day* (or earlier if the issue requires a faster turn-around), please send a message in #data and feel free to tag @cbraza.

*Please keep in mind that we work across different time zones

Additional considerations

Please keep the following in mind when working with the Product Data Insights team:

Scope creep

Scope creep is a problem everyone faces. Please keep in mind that team capacity is a zero-sum game, so scope creep in one issue can mean that we are unable to complete other work planned for that iteration.

Additional scope (change in requirements, additional follow-ups, etc) that adds a material amount of work* to an issue will need to be captured as net-new work in a new issue. The new issue will then go through the normal prioritization and planning process. The best way to avoid scope creep is to have thorough, complete requirements in the issue when you initially open it. The issue templates should help guide you to include all relevant information.

*The threshold for a “material amount of work” is to be determined by the analyst working on the issue.

Blocked issues

If an issue is blocked and it requires additional work* to diagnose or troubleshoot (ex: a data issue is uncovered), a new issue should be opened, assigned a weight and priority, and linked to the original blocked issue. The new issue can be added to the current iteration without going through the formal planning process at the analyst’s discretion, but this can impact our ability to complete all issues in the iteration.

*The threshold for “additional work” is to be determined by the analyst working on the issue.

Experimentation analysis issues are naturally blocked by the experiment actually running (we have to wait until we have sufficient data in order to perform the analysis). In order to enable a more accurate measure of velocity, we will divide the work into 2 separate issues*:

Experiment prep (dashboard creation and data validation)
Experiment analysis

*At this time, PMs should continue to open a single issue and analysts will separate accordingly.

Undocumented requests

Please open an issue for all Product Data Insights requests. Requests made via comments in Google Drive are extremely difficult to track, and Slack history is gone after 90 days. In addition, these requests are not planned or accounted for in team velocity. Your informal request might mean that we are unable to complete work we committed to for another stakeholder.

By keeping a formal record of requests (via issues), we are able to identify frequently asked questions (which could lead to building a self-service solution) and quickly replicate past work.

Capacity

The Product Data Insights team is tasked with supporting the entire Product organization, in addition to other product-related data needs across GitLab. As such, team capacity can be limited as we grow towards our target gearing ratio. However, limited capacity should not stop GitLab team members from opening issues for Product Data Insights. It simply means that lower-priority requests will have to wait until resources are available. As the group grows, so will our ability to turn around requests in a shorter period of time.

Engaging the team

In addition to opening an issue, there are several other methods to engage with the team, both sync and async.

Office hours

In order to support more PMs across GitLab, the Product Data Insights team offers office hours. Office hours are held every other Wednesday, alternating between 4 pm UTC (11 am ET / 8 am PT) and 9 pm UTC (4 pm ET / 1 pm PT). While our primary stakeholders are PMs across the organization, all GitLab team members are welcome to join.

The intent of office hours is to give PMs faster access to the team and get support for smaller tasks, brainstorming, and data self-service. More formal requests that answer more complex questions are captured in issues and go through a more rigorous, structured prioritization process.

The agenda is first-come, first-served. Walk-ins/drop-ins are always welcome, but if possible, please add your name and topic (or question) before office hours begin. This allows the team time to review new agenda items ahead of time. If we are unable to cover a topic on the agenda, it will be pushed to the following meeting.

Stakeholders are welcome to leverage office hours to discuss and define new issues, which can help reduce async back-and-forth communication in the issue itself.

Example topics

Office hours are intended for smaller bodies of work, brainstorming, and assistance with data self-service. Here are some examples of topics for office hours:

👍 Example Topic 1: Experiment Setup

I am interested in launching an experiment to see if we can increase adoption of Secure.

How would you go about setting the experiment up?
Can you help me calculate the sample size?
Can you help me interpret the results?

👍 Example Topic 2: Approach to Analysis

I am trying to do an analysis on the relationship between users with SSO enabled and invite acceptance rate.

Which tables should I use? Can you help me understand this data source?
What approach would you take?
Would this metric answer the question?

👍 Example Topic 3: Code Review

I wrote a query to calculate xMAU for namespaces that converted from a trial to a paid plan.

Is this JOIN correct?
Does this logic only include namespaces that had trials before converting?

Note: We are not able to accommodate all code reviews in the scope of office hours. Please limit this type of topic to specific aspects of a query, whether you are using the correct data source, etc.

👍 Example Topic 4: Dashboard Updates

I am looking to make some updates or enhancements to this existing dashboard.

Can you help me incorporate a filter into this dashboard that would allow me to limit the charts to activity within 30 days of namespace creation?
Can you update this funnel to include this additional event?

👍 Example Topic 5: Follow-Up Questions

In the last key meeting, you presented an analysis on early trial adoption.

Can you walk me through your methodology?
Can you help me understand the implications of the data/analysis?

👍 Example Topic 6: Scope and Define New Issue

I am going to open an issue for a new analysis.

Can we discuss the overall scope and details?
What kind of information should I include in the issue?

If the topic you bring is too broad to be addressed during office hours, we can discuss the work and will redirect you to open an issue.

FAQs

What is the difference between topics for office hours and formal data requests?

Office hours are intended to help PMs with smaller tasks, provide a venue for brainstorming, and help folks looking to learn more about data self-service. The benefit is that the agenda is first-come, first-served, the prioritization process is bypassed, and the wait time is minimal.

Formal data requests and larger bodies of work are captured in issues in the Product Data Insights project. They can help answer more complex questions, but go through more robust intake and planning processes. As such, there is a longer turn-around time given team size and capacity.

What if I don’t know if my topic is best suited for office hours or whether I need to open an issue?

Feel free to ask your analyst partner (if applicable) or in #data. When in doubt, come to office hours and the team can discuss there.

Slack

Channels

#data - For any type of data question, including those related to product and/or the Product Data Insights team
#data-tableau - For any questions related to Tableau
#g_product_data_insights_daily - For the Product Data Insights team’s asynchronous daily stand-up, powered by Geekbot

Aliases

@product-analysts - Notifies the entire Product Data Insights team
@randdanalyticstriage - Notifies the entire Product Data Insights team and the Data team’s R&D Fusion group, per the Enterprise Data Triage Program
@functional-data-analysts - Notifies the entire Product Data Insights team and other functional analysts across the GitLab Data Program

GitLab groups

@gitlab-data/product-analysts - Notifies the entire Product Data Insights team

YouTube playlist

Product Data Insights - Recordings from office hours, analysis/read-outs, etc

Other helpful resources & links

Crash Course for Product Stage Resources

Objectives for this page: This page is intended to provide a crash-course-style overview of the most important Product Analytics-related resources for each product Stage. As a Product Analyst or other curious GitLab team member, it can be helpful to have a quick and easy reference for each product Stage to quickly understand high-level functionality, key objectives or a distilled product roadmap, and key data resources currently used under a specific Stage or Group within GitLab before jumping into an analysis.

dbt Cheat Sheet

data build tool (dbt) Cheat Sheet for Functional Analysts

Engineering Dashboarding and Metrics

Engineering Analytics Dashboard Inventory: Several dashboards have been published to the Engineering project in the Tableau environment. Below is a brief overview of some of the dashboards created and where you can find them. Centralized Engineering Metrics: Please refer to our Centralized Engineering Metrics page here. Tableau Dashboards: You can find published dashboards in Ad-hoc/Development/General. These dashboards are safe for general use by the Tableau User population here at GitLab.

Engineering Metrics Dashboards

Welcome to our Engineering Metrics Dashboards hub, your go-to spot for checking out how things are rolling across our engineering org. The dashboards below capture data on key metrics such as past due issues, merge request types, open bugs, review time, merge request rate, and the age of bugs and issues. These metrics serve as vital indicators, offering a granular understanding of our development processes, code quality, and team efficiency.

Experimentation Design & Analysis

Overview: At GitLab, we have a unique approach to experimentation that is built in-house by our incredible development team. The reason we use this approach is to uphold our commitment to our users and customers to protect their privacy. This custom approach leads to some challenges that are not experienced with more commonly used third-party experimentation tools. Due to this reality, experimentation at GitLab must be approached with a high level of intentionality and forethought.

Guide to Engineering Analytics Data

Introduction: Product Data Insights is responsible for building and evolving analytics capabilities and creating insights for Engineering to understand how well we are building our product. In this case, “wellness” is measured in terms of efficiency, as well as cost. Data Sources: Dive into our analytics by exploring the specific data sources that underpin our metrics. GitLab.com data is used for reporting on metrics like MR Rate & Performance KPIs. Workday is GitLab’s current central HRIS and we use this data to determine which group a team member is a part of.

PDI Dashboards, Analysis, & Insights

This page aggregates dashboards, analysis, and insights generated or owned by the Product Data Insights team.

Product Data Insights Data Models Cheat Sheet

Objectives for this page: This handbook page is intended to provide a high-level overview of the most common data models used by the Product Data Insights team as well as some known nuances and/or caveats about those data models that are helpful to be aware of. To collaborate on the content in this page, please either submit an MR (preferred) or start a discussion in this Epic. Helpful places to start: DBT Docs - This resource contains comprehensive documentation on all available dbt models.

Team Processes

Issue hygiene: Must-haves

All issues must have the following:

Team::PDI label
- This is the label used to identify and track the team’s work
product data insights label
Workflow label (ex: workflow::1 - triage)
PDI priority label (ex: pdi-priority::2)
Section priority label (ex: section-priority::1)
Section label (ex: section::dev)
- Answers the question “what section is this work supporting?”
Stage label (ex: devops::create)
- Answers the question “what stage is this work supporting?”
Group label (ex: group::code creation)
- Answers the question “what group is this work supporting?

Last modified June 27, 2024: Fix various vale errors (46417d02)
