Monitor:Platform Insights Group

Who we are?

The Platform Insights group is part of the GitLab Monitor stage and builds GitLab Observability and Product Analytics products.

Team members

Name | Role
Nicholas Klick | Engineering Manager, Monitor:Platform Insights
Ankit Bhatnagar | Staff Backend Engineer, Monitor:Platform Insights
Arun Sori | Senior Backend Engineer, Monitor:Platform Insights
Daniele Rossetti | Senior Frontend Engineer, Monitor:Platform Insights
Jiaan Louw | Senior Frontend Engineer, Monitor:Platform Insights
Mat Appelman | Principal Engineer, Monitor
Max Woolf | Staff Backend Engineer, Monitor:Platform Insights
Robert Hunt | Staff Frontend Engineer, Monitor:Platform Insights

Stable counterparts

Name | Role
Principal Engineer | Principal Engineer, Monitor
Lindsy Farina | Senior Product Manager, Monitor:Platform Insights
Ottilia Westerlund | Security Engineer, Fulfillment (Fulfillment Platform, Subscription Management), Security Risk Management (Security Policies, Threat Insights), Monitor (Observability), Plan (Product Planning), AI-powered (Duo Chat, Duo Workflow, AI Framework, AI model validation, Custom models)

Technical Architecture

Architecture Blueprints

Architecture Documentation

ClickHouse Datastore

Observability and analytics features have big-data and insert-heavy requirements that are not a good fit for Postgres or Redis. ClickHouse was selected to meet these features' requirements. ClickHouse is an open-source, column-oriented database management system. It is attractive for these use cases because it can efficiently filter, aggregate, and sum across large numbers of rows. ClickHouse is not intended to replace Postgres or Redis in GitLab’s stack.
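
To illustrate why a column-oriented layout suits aggregation, here is a toy sketch in Python. It is not how ClickHouse is implemented, and the field names (`project_id`, `duration_ms`, `status`) are hypothetical. It only shows that a columnar query can read the columns it needs and skip the rest.

```python
# Toy illustration only: this is not how ClickHouse is implemented.
from array import array

# Row-oriented layout: each event is stored as one record with all its fields.
rows = [
    {"project_id": 1, "duration_ms": 120, "status": "ok"},
    {"project_id": 2, "duration_ms": 340, "status": "error"},
    {"project_id": 1, "duration_ms": 95, "status": "ok"},
    {"project_id": 1, "duration_ms": 410, "status": "error"},
]

# Column-oriented layout: each field is stored contiguously on its own.
columns = {
    "project_id": array("q", (r["project_id"] for r in rows)),
    "duration_ms": array("q", (r["duration_ms"] for r in rows)),
    "status": [r["status"] for r in rows],
}


def total_duration_row_store(records, project_id):
    """Sum durations by visiting every full record."""
    return sum(r["duration_ms"] for r in records if r["project_id"] == project_id)


def total_duration_column_store(cols, project_id):
    """Sum durations by reading only the two columns the query needs."""
    return sum(
        duration
        for pid, duration in zip(cols["project_id"], cols["duration_ms"])
        if pid == project_id
    )
```

Both functions return the same total. The columnar version never touches the `status` column, which is the kind of saving that matters when tables are wide and row counts are large.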

We initially managed our own self-hosted ClickHouse instance, but decided to migrate to ClickHouse Cloud so the team could move faster by offloading maintenance and scalability to ClickHouse.

Learn more: ClickHouse Datastore Working Group

How we work?

We base our workflow on the company’s Product Development Flow. Any modifications or clarifications on how we apply the workflow are detailed below.

Async Standups

We have Slack-based standups (using Geekbot) on Wednesdays and retrospectives on Fridays. We use these async standups to communicate what we have accomplished, any current blockers, and what we plan to work on next.

Async Updates

Every Friday, the EM provides an async update of the team’s progress, following the Ops sub-department async updates process.

These updates are published as issues in the general project.

Updates and highlights from all teams in Ops are collected automatically here, grouped by week / month / quarter.

Meetings

  • Weekly Team Sync: These are focused on organizing ongoing work or specific efforts such as rollouts or bigger initiatives.
  • Bi-monthly social hour: This meeting is non-work related and helps the team socialize and get to know each other better.
  • Team member coffee chats: Each team member should schedule a coffee chat with every other team member roughly every 4-6 weeks. Feel free to discuss work or non-work topics. If timezones are an issue, find another way to connect, such as an async Slack thread to check in. The goal is to get to know your other team members on a 1:1 basis.
  • Dev Syncs: These are developer-organized sync meetings where ICs can meet and discuss technical issues or organize technical work amongst themselves without requiring the presence of an EM.

Communication

We use several Slack channels to organize ourselves:

How we do planning?

We are following the monthly milestone cadence. Work is organized into epics and assigned to the relevant milestones.

The milestone starting date is defined in the gitlab.org group milestones. It changes every month, according to the new GitLab release calendar.

Milestone Planning timeline:

  • 10 days before milestone starting date: Planning draft issue is created by PM/EM, with high-level milestone goals.
  • 8 days before milestone starting date: Planning draft is shared with the team. Individual contributors recommend epics and issues related to these goals or carried over from previous milestones.
  • 5 days before milestone starting date: Planning is reviewed during the team sync meeting.
  • On milestone starting date: Milestone goals and related epics and issues should be finalized and prioritized. All planned work can be seen on the Milestone Board. Previous milestone issues are moved to the new milestone or backlog.
  • During the milestone, we analyze progress and reprioritize as needed.
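
The timeline above can be turned into concrete dates with a small helper. This is a minimal sketch that assumes the offsets are calendar days (the timeline does not say whether they are working days). The function name and the sample start date are hypothetical.

```python
from datetime import date, timedelta

# Offsets (in days before the milestone start) taken from the timeline above.
# Assumption: calendar days, not working days.
PLANNING_STEPS = [
    (10, "Planning draft issue created by PM/EM"),
    (8, "Planning draft shared with the team"),
    (5, "Planning reviewed during team sync"),
    (0, "Milestone goals, epics, and issues finalized"),
]


def planning_schedule(milestone_start):
    """Return (date, step) pairs for a given milestone start date."""
    return [
        (milestone_start - timedelta(days=offset), step)
        for offset, step in PLANNING_STEPS
    ]


if __name__ == "__main__":
    # Hypothetical start date, used only to show the arithmetic.
    for when, step in planning_schedule(date(2025, 2, 20)):
        print(when.isoformat(), step)
```

The real start date should always be taken from the gitlab.org group milestones, since it changes every month.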

Issue prioritization

Our priorities should follow overall guidance for Product. This should be reflected in the priority label for scheduled issues:

Priority | Description | Probability of shipping in milestone
priority::1 | Urgent: top priority for achieving in the given milestone. These issues are the most important goals for a release and should be worked on first; some may be time-critical or unblock dependencies. | ~100%
priority::2 | High: important issues that have significant positive impact to the business or technical debt. Important, but not time-critical or blocking others. | ~75%
priority::3 | Normal: incremental improvements to existing features. These are important iterations, but deemed non-critical. | ~50%
priority::4 | Low: stretch issues that are acceptable to postpone into a future release. | ~25%
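
As a rough worked example, the probabilities in the table can be used to estimate how many scheduled issues are likely to ship. This sketch treats each percentage as an independent probability, which is an assumption made here for illustration and not something the table states. The sample list of issues is hypothetical.

```python
# Approximate shipping probabilities from the table above.
SHIP_PROBABILITY = {
    "priority::1": 1.00,
    "priority::2": 0.75,
    "priority::3": 0.50,
    "priority::4": 0.25,
}


def expected_shipped(issue_priorities):
    """Estimate how many of the scheduled issues are likely to ship."""
    return sum(SHIP_PROBABILITY[label] for label in issue_priorities)


# Hypothetical milestone: two P1, two P2, one P3, and one P4 issue.
planned = [
    "priority::1",
    "priority::1",
    "priority::2",
    "priority::2",
    "priority::3",
    "priority::4",
]
```

For this hypothetical milestone, `expected_shipped(planned)` gives 4.25 out of 6 scheduled issues, which can be a useful sanity check on how much work is being planned.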

How to find something to work on?

Normally, at the beginning of the milestone, the EM will give an overview of the work and discuss which areas you will focus on. Sometimes issues will already be assigned to you before the milestone begins.

If you are ever looking for additional issues to work on:

  1. Look at the Platform Insights Milestone board.
  2. Identify an issue that is unassigned.
  3. Assign yourself to the issue.
  4. Add the ~workflow::in dev label to the issue.
  5. If the scope or description is unclear, connect with the EM and/or PM for clarification or (if feeling confident) refine the issue yourself and proceed.
  6. Begin working on the issue.
  7. Once all relevant MRs are merged, set the ~workflow::verification label.
    • Ensure any MRs do not auto-close issues. (Use Relates to #11111 rather than Closes #11111 in MR descriptions.)
  8. Verify the changes and comment on the issue which environment you used for verification, for example Verified on production.
  9. Close the issue! 🎉
  10. Repeat.
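
The advice in step 7 about avoiding auto-closing references can be checked mechanically. The sketch below uses a deliberately simplified regular expression. GitLab's actual closing pattern covers more forms (additional verbs, issue URLs, cross-project references), so treat this as an assumption-laden approximation. The function name is hypothetical.

```python
import re

# Simplified assumption: GitLab's real closing pattern covers more forms
# (other verbs, URLs, cross-project references). This only catches the
# common "Closes #123" style.
AUTO_CLOSE = re.compile(
    r"\b(?:close[sd]?|closing|fix(?:e[sd])?|fixing|resolve[sd]?|resolving)\s+#\d+",
    re.IGNORECASE,
)


def auto_closing_references(description):
    """Return phrases in an MR description that would likely auto-close an issue."""
    return [match.group(0) for match in AUTO_CLOSE.finditer(description)]
```

An empty result for a description such as `Relates to #11111` suggests the issue will stay open after merge, so it can still move through verification before being closed by hand.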

How to enable Observability Beta for a customer?

To enable access to Logs, Tracing, and Metrics Beta for a certain customer, follow this process:

For SaaS:

  • Beforehand, make sure you have the right access and permissions to run ChatOps commands, as detailed on this page.
  • Ask the customer for their top-level group name (example: gitlab-org for https://gitlab.com/gitlab-org/).
  • In #production, run the following command to enable the feature flag for this group (replace gitlab-org with the customer’s group name):
/chatops run feature set --group=gitlab-org observability_features true
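
To avoid typos when substituting the group name, the command above can be generated from the customer's group URL. This is a minimal sketch. The helper names are hypothetical, and it only builds the text to paste into Slack without calling any GitLab or Slack API.

```python
from urllib.parse import urlparse


def top_level_group(group_url):
    """Extract the top-level group path from a GitLab group URL."""
    path = urlparse(group_url).path.strip("/")
    if not path:
        raise ValueError(f"No group path found in {group_url!r}")
    return path.split("/")[0]


def enable_observability_command(group_url):
    """Build the ChatOps command that enables the flag for the customer's group."""
    group = top_level_group(group_url)
    return f"/chatops run feature set --group={group} observability_features true"
```

For example, passing `https://gitlab.com/gitlab-org/` produces the same command shown above. A subgroup URL is reduced to its top-level group, since the flag is set at that level.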

To see the list of groups that have already been enabled, run the following command:

/chatops run feature get observability_features

Note that the list returns group IDs, not group names. To find a group’s ID, browse to the group’s page (example), open the “…” menu at the top right of the page, and select “Copy group ID”. If you don’t have access to the group, ask the customer to do it.

Learn more: see related feature flag issue.

For Self-Managed:

  • Not available for now.

Dashboards

Last modified January 4, 2025: Fix incorrect or broken external links (55741fb9)