Guide to Engineering Analytics Data

Overview of key Engineering data sources and data models

Introduction

Product Data Insights builds and evolves analytics capabilities and creates insights that help Engineering understand how well we are building our product. Here, “how well” is measured in terms of efficiency and cost.

Data Sources

Dive into our analytics by exploring the specific data sources that underpin our metrics.

  • GitLab.com data is used for reporting on metrics such as MR Rate and Performance KPIs.
  • Workday is GitLab’s current central HRIS and we use this data to determine which group a team member is a part of.
  • Zendesk data is used to fuel Customer Support metrics.

How We Count MRs and Issues

For most engineering metrics, we only report on internal projects - specifically the ones that impact our product. What counts as “internal” is defined in two files:

If a project or namespace isn’t listed in one of those CSVs, we don’t pull data for it, so details such as titles, labels, and descriptions won’t show up. Only projects and namespaces specifically listed in the files are included in the data; child groups need to be added separately.

To include a project or group, open an MR and ping an analyst. Once the MR is merged, data for the new group or project starts being pulled from that point forward, with no historical backfill. We also need to do a full refresh of the source and downstream models (backfill issue example). More detail on what counts as “internal” is documented separately.

After that, we narrow our engineering metrics down even further to focus on projects that directly affect our product. Those are listed in the file below.

Right now, the following namespaces are included in the metrics:

Namespace name         | Namespace path
GitLab.org             | gitlab-org
GitLab.com             | gitlab-com
GitLab Chef Cookbooks  | gitlab-cookbooks
GitLab components      | components
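
The seed-file rule described earlier in this section (only explicitly listed projects and namespaces are ingested, and child groups are not inherited from their parent) can be sketched as an exact-match lookup. This is a minimal illustration rather than the actual pipeline: the function name `is_ingested` and the example paths are hypothetical, and the real lists live in the seed CSVs.

```python
# Hypothetical seed entries for illustration; the real lists live in the seed CSVs.
SEED_NAMESPACE_PATHS = frozenset({
    "example-group",
    "example-group/listed-child",
})


def is_ingested(namespace_path, seed_paths=SEED_NAMESPACE_PATHS):
    """Exact-match lookup: a child group is NOT inherited from its parent."""
    return namespace_path in seed_paths


print(is_ingested("example-group"))                 # listed explicitly
print(is_ingested("example-group/listed-child"))    # listed explicitly
print(is_ingested("example-group/unlisted-child"))  # parent is listed, child is not
```

The point of the exact match is that listing a parent group does nothing for its children: each child group has to appear in the seed file in its own right.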

Commonly Asked Questions

Question                                       | Solution
I don’t see any issues for my project          | The project needs to be included in the seed files mentioned above.
My issues are showing up but with no metadata  | Metadata is only ingested for internal projects, so the project needs to be listed in the seed files.
I don’t see labels flowing through             | Check which group or project the label was created in. That group or project should be listed in the seed files.

Data Models

In this section, we share commonly used data models that fuel many of our dashboards.

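workspace_engineering.engineering_merge_requests

  • Description: This table is filtered down to all merge requests that directly affect our product.
  • Granularity: One row per merge request
  • Documentation: DBT docs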

workspace_engineering.internal_merge_requests

  • Description: This table is filtered down to all internal merge requests at GitLab
  • Granularity: One row per merge request
  • Documentation: DBT docs

workspace_engineering.engineering_issues

  • Description: This table is filtered down to all issues that directly affect our product.
  • Granularity: One row per issue
  • Documentation: DBT docs

workspace_engineering.internal_issues

  • Description: This table is filtered down to all internal issues at GitLab
  • Granularity: One row per issue
  • Documentation: DBT docs

workspace_engineering.internal_notes

  • Description: Table containing GitLab.com notes from Epics, Issues, and Merge Requests. It includes the namespace ID and the ultimate parent namespace ID.
  • Granularity: One row per note
  • Documentation: DBT docs

workspace_engineering.agg_mttr_mttm

  • Description: This table calculates Mean Time to Resolve (MTTR) and Mean Time to Merge (MTTM)
  • Granularity: One row per issue
  • Documentation: DBT docs
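
As a rough illustration of what a "mean time to" metric involves, the sketch below averages the elapsed time between a start and an end timestamp. It is an assumption-laden example, not the model's actual logic: the helper name `mean_hours_between` and the sample timestamps are hypothetical, and the real start and end events used by `agg_mttr_mttm` are defined in the DBT docs.

```python
from datetime import datetime
from statistics import mean


def mean_hours_between(pairs):
    """Mean elapsed hours across (start, end) datetime pairs."""
    return mean((end - start).total_seconds() / 3600 for start, end in pairs)


# Hypothetical (opened_at, resolved_at) pairs.
issues = [
    (datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 6, 0)),   # 6 hours
    (datetime(2024, 1, 2, 0, 0), datetime(2024, 1, 2, 18, 0)),  # 18 hours
]

mttr_hours = mean_hours_between(issues)
print(mttr_hours)  # 12.0
```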

workspace_engineering.issues_history

  • Description: Table containing age metrics and related metadata for GitLab.com internal issues. Used for tracking internal work progress for things like Engineering Allocation and Corrective Actions. These metrics are available for individual issues at a daily level and can be aggregated up from there.
  • Granularity: One row per issue and day
  • Documentation: DBT docs
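
Because the table has one row per issue and day, aggregating "up from there" usually means grouping the daily rows by date (or a coarser period). A minimal sketch follows, assuming hypothetical field names (`issue_id`, `day`, `age_in_days`) that may not match the real column names.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily rows: one row per issue per day.
daily_rows = [
    {"issue_id": 1, "day": date(2024, 1, 1), "age_in_days": 10},
    {"issue_id": 2, "day": date(2024, 1, 1), "age_in_days": 4},
    {"issue_id": 1, "day": date(2024, 1, 2), "age_in_days": 11},
]


def average_age_by_day(rows):
    """Average issue age per day across all issues present on that day."""
    ages = defaultdict(list)
    for row in rows:
        ages[row["day"]].append(row["age_in_days"])
    return {day: sum(values) / len(values) for day, values in ages.items()}


print(average_age_by_day(daily_rows))
```

The same grouping pattern works for other periods: replace the `day` key with, for example, a (year, month) tuple derived from it.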

workspace_engineering.merge_request_rate

  • Description: A model containing merge request rate by department and group.
  • Granularity: One row per MR rate per month per granularity level (department, group)
  • Documentation: DBT docs
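
This page does not spell out the MR rate formula. The sketch below assumes it is the number of merged MRs divided by the number of team members, per group and month; treat that definition, the function name `merge_request_rate`, and all field names as assumptions to check against the DBT docs.

```python
from collections import Counter


def merge_request_rate(merged_mrs, headcount):
    """Merged MRs per team member, keyed by (month, group).

    Assumed definition: merged MR count / team member count.
    """
    merged = Counter((mr["month"], mr["group"]) for mr in merged_mrs)
    return {
        key: merged[key] / members
        for key, members in headcount.items()
        if members > 0  # skip groups with no team members to avoid dividing by zero
    }


# Hypothetical inputs.
merged_mrs = [
    {"month": "2024-01", "group": "group-a"},
    {"month": "2024-01", "group": "group-a"},
    {"month": "2024-01", "group": "group-a"},
    {"month": "2024-01", "group": "group-b"},
]
headcount = {("2024-01", "group-a"): 2, ("2024-01", "group-b"): 4}

rates = merge_request_rate(merged_mrs, headcount)
print(rates)
```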

workspace_engineering.open_merge_request_review_time

  • Description: A model containing review time for open merge requests.
  • Granularity: One row per day per MR
  • Documentation: DBT docs

Zendesk Data

PREP.zendesk.zendesk_ticket_audits_source

  • Description: SLA policies and priority per ticket
  • Granularity: One row per audit
  • Documentation: DBT docs

PREP.zendesk.zendesk_tickets_source

  • Description: Zendesk ticket data
  • Granularity: One row per ticket
  • Documentation: DBT docs

PREP.zendesk.zendesk_ticket_metrics_source

  • Description: Zendesk ticket metrics data
  • Granularity: One row per ticket
  • Documentation: DBT docs

PREP.zendesk.zendesk_sla_policies_source

  • Description: SLA policies
  • Granularity: One row per SLA policy
  • Documentation: DBT docs

workspace_engineering.zendesk_frt

  • Description: A model built to calculate First Reply Time (FRT) metric.
  • Granularity: One row per Zendesk ticket
  • Documentation: DBT docs
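
To make the metric concrete, here is a minimal sketch of a First Reply Time calculation, assuming FRT is the elapsed time between ticket creation and the first public agent reply. The function name `first_reply_minutes` and the input shape are hypothetical, and the real model may differ (for example, by measuring in business hours).

```python
from datetime import datetime


def first_reply_minutes(created_at, replies):
    """Minutes from ticket creation to the first agent reply, or None if none exists.

    `replies` is a list of (timestamp, is_public_agent_reply) tuples.
    """
    agent_times = [ts for ts, is_agent in replies if is_agent and ts >= created_at]
    if not agent_times:
        return None
    return (min(agent_times) - created_at).total_seconds() / 60


# Hypothetical ticket timeline.
created = datetime(2024, 1, 1, 9, 0)
replies = [
    (datetime(2024, 1, 1, 9, 10), False),  # customer follow-up, not counted
    (datetime(2024, 1, 1, 9, 45), True),   # first agent reply
    (datetime(2024, 1, 1, 10, 30), True),  # later agent reply
]

print(first_reply_minutes(created, replies))  # 45.0
```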

Additional Resources

Repo Shortcuts

If you have any questions, please feel free to drop them in #g_engineering_analytics or open a new issue for our team.