Data for Design Decisions

A guide to the tools used within GitLab to find data that supports design decisions.

Using data for design decisions at GitLab

Data is another way designers and researchers at GitLab can understand user behavior. Analytics can provide valuable input throughout the Product Development Flow. By using data, we can understand and quantify the impact of the iteration that we shipped.

We should not depend entirely on data to make decisions, but it should be an essential input to decision making. To learn more about quantitative data/research, see the Using Data to Find Insights handbook page.

Aligning hypothesis to impact

Part of the design process is to have a strong hypothesis to guide our work.

Ideally, the hypothesis will be based on information from user research.

For example:

We believe storing information about how an incident was resolved, how long the resolution took, and what the outcome was in a way that's easy for engineers responsible for incident management to access will achieve a 20% faster resolution time for incidents.

Here are possible ways to use data to check whether incidents were actually resolved 20% faster:

  • Measure time between two steps in the user journey
  • Measure the total time spent for resolution
  • Conduct an A/B test to compare the two solutions

These data points would be hard to obtain during solution validation, but once measured they help connect the dots from research, through iteration, to impact.

Observing and measuring should also spark further questions that help generate more possible iterations in the future.
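As a minimal sketch of the first two measurements listed above, the snippet below computes the time between two timestamps and the relative change in median resolution time before and after an iteration ships. The function names, the sample timestamps, and the choice of the median as the summary statistic are all assumptions made for illustration; they are not taken from GitLab's dashboards or data model.

```python
from datetime import datetime
from statistics import median


def resolution_hours(opened_at: str, resolved_at: str) -> float:
    """Hours elapsed between two ISO 8601 timestamps (hypothetical helper)."""
    delta = datetime.fromisoformat(resolved_at) - datetime.fromisoformat(opened_at)
    return delta.total_seconds() / 3600


def relative_change(before: list[float], after: list[float]) -> float:
    """Relative change in the median; -0.2 would mean 20% faster."""
    baseline = median(before)
    return (median(after) - baseline) / baseline


# Made-up incidents as (opened, resolved) pairs, for illustration only.
incidents_before = [
    ("2024-01-01T09:00:00", "2024-01-01T19:00:00"),  # 10 hours
    ("2024-01-02T09:00:00", "2024-01-02T17:00:00"),  # 8 hours
    ("2024-01-03T09:00:00", "2024-01-03T21:00:00"),  # 12 hours
]
incidents_after = [
    ("2024-03-01T09:00:00", "2024-03-01T16:00:00"),  # 7 hours
    ("2024-03-02T09:00:00", "2024-03-02T17:00:00"),  # 8 hours
    ("2024-03-03T09:00:00", "2024-03-03T15:00:00"),  # 6 hours
]

before = [resolution_hours(o, r) for o, r in incidents_before]
after = [resolution_hours(o, r) for o, r in incidents_after]

change = relative_change(before, after)
met_target = change <= -0.20  # the hypothesis targets 20% faster resolution

print(f"Median change: {change:.0%}, target met: {met_target}")
```

The median is used here because a few unusually long incidents can distort an average. Which statistic fits best depends on the metric and on how the data is distributed.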

How is data being captured

To generate reports and dashboards, we use a third-party tool called Tableau to visualize the captured data.

The data source determines the table names used in Tableau queries. We have three primary data sources that are useful from a product perspective: service ping, product database, and internal events.

Our goal is to analyze product usage, not to track individual users. This means that on the frontend we respect the browser's “do not track” setting and allow opting out of usage ping. In addition, the Analytics Instrumentation team is responsible for data pseudonymization so that no personally identifiable information is saved. This video highlights how Snowplow, usage ping, and pseudonymization work together.

Overview of the data sources

  • Service Ping (for Self-Managed and GitLab.com)
    • A custom GitLab tool to collect aggregate information from our customers who host our product on their own hardware.
    • Video: Usage Ping Workshop
    • Examples of when to use:
      • Total number of issues
      • Count of distinct users creating issues
      • Instance settings - Git version, database version
      • Count of enabled features
      • Count of created notes on Snippets
      • Count of notes on Merge Requests
  • GitLab.com Postgres Database (for GitLab.com)
  • Internal Events (for GitLab.com)
  • Snowplow (deprecated; for GitLab.com)

Key Data Sources for Product Managers at GitLab elaborates on how each data source is used and queried.

These visualizations will help you understand how the systems work together:

Examples of using data for design decisions

The issues and merge requests below are examples of how we have used data for decisions.

Frequently asked questions

  • Who can I ask for data help? If you have specific questions about data or Tableau, you can ask on Slack in #data.
  • What happens when there is no baseline metric to measure from? When there is no baseline, start tracking the metric and use the first month of collected data as the baseline.
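
As a rough illustration of the baseline answer above, the sketch below averages a metric over its first month of tracking. The function name, the 30-day window standing in for "a month", and the sample values are assumptions made for this example, not an established GitLab convention.

```python
from datetime import date, timedelta
from statistics import mean


def first_month_baseline(daily_values: dict[date, float], days: int = 30) -> float:
    """Average the metric over the first `days` days of tracking (hypothetical helper)."""
    if not daily_values:
        raise ValueError("no data has been collected yet")
    start = min(daily_values)  # first day on which the metric was tracked
    cutoff = start + timedelta(days=days)
    window = [value for day, value in daily_values.items() if day < cutoff]
    return mean(window)


# Made-up daily counts for 45 days, for illustration only.
first_day = date(2024, 1, 1)
daily_counts = {first_day + timedelta(days=i): 100 + i for i in range(45)}

baseline = first_month_baseline(daily_counts)
print(f"Baseline from the first 30 days: {baseline}")
```

Later measurements can then be compared against this value in the same way as any other baseline.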

Resources

Last modified September 23, 2024: Fix broken links (d748cf8c)