Monitor:Observability Group
Who we are?
The Observability group is part of the GitLab Monitor stage and builds the GitLab Observability product.
Team members
Name | Role |
---|---|
Nicholas Klick | Engineering Manager, Monitor:Observability |
Ankit Bhatnagar | Staff Backend Engineer, Monitor:Observability |
Arun Sori | Senior Backend Engineer, Monitor:Observability |
Daniele Rossetti | Senior Frontend Engineer, Monitor:Observability |
Jiaan Louw | Senior Frontend Engineer, Monitor:Product Analytics |
Mat Appelman | Principal Engineer, Monitor |
Max Woolf | Staff Backend Engineer, Monitor:Platform Insights |
Robert Hunt | Staff Frontend Engineer, Monitor:Product Analytics |
Stable counterparts
Name | Role |
---|---|
Principal Engineer | Principal Engineer, Monitor |
Ottilia Westerlund | Security Engineer, Fulfillment (Fulfillment Platform, Subscription Management), Security Risk Management (Security Policies, Threat Insights), Monitor (Observability), Plan (Product Planning), AI-powered (Duo Chat, Duo Workflow, AI Framework, AI model validation, Custom models) |
Technical Architecture
Architecture Blueprints
Architecture Documentation
- See this page
Project Links
ClickHouse Datastore
Observability and analytics features have big-data, insert-heavy requirements that are not a good fit for Postgres or Redis. ClickHouse was selected to meet these features' requirements. ClickHouse is an open-source, column-oriented database management system. It is attractive for these use cases because it can efficiently filter, aggregate, and sum across large numbers of rows. ClickHouse is not intended to replace Postgres or Redis in GitLab’s stack.
We initially managed our own self-hosted ClickHouse instance, but decided to migrate to ClickHouse Cloud so the team can move faster by offloading maintenance and scalability to ClickHouse.
Learn more: Clickhouse Datastore Working Group
How we work?
Async Standups
We have Slack-based standups (using Geekbot) on Wednesdays and retrospectives on Fridays. We use these async standups to communicate what we have accomplished, any current blockers, and what we plan to work on next.
Async Updates
Every Friday, the EM provides an async update of the team’s progress, following the Ops sub-department async updates process.
These updates are published as issues in the general project.
Updates and highlights from all teams in Ops are collected automatically here, grouped by week / month / quarter.
Meetings
- Weekly Team Sync: These are focused on organizing ongoing work or specific efforts such as rollouts or bigger initiatives.
- Bi-monthly social hour: This meeting is non-work related and helps the team socialize and get to know each other better.
- Team member coffee chats: Each team member should schedule a coffee chat with every other team member roughly every 4-6 weeks. Feel free to discuss work or non-work topics. If timezones are an issue, find another way to connect, such as an async Slack thread to check in. The goal is to get to know your other team members on a 1:1 basis.
- Dev Syncs: These are developer-organized sync meetings where ICs can meet and discuss technical issues or organize technical work amongst themselves without requiring the presence of an EM.
Communication
We use several Slack channels to organize ourselves:
- Primary channel: #g_monitor_platform_insights
- Standup channel: #g_monitor_platform_insights_standup
- Social channel: #g_monitor_platform_insights_internal
How we do planning?
We are following the monthly milestone cadence. Work is organized into epics and assigned to the relevant milestones.
The milestone starting date is defined in the gitlab.org group milestones. It changes every month, according to the new GitLab release calendar.
Milestone Planning timeline:
- 10 days before milestone starting date: Planning draft issue is created by PM/EM, with high-level milestone goals.
- 8 days before milestone starting date: Planning draft is shared with the team. Individual contributors recommend epics and issues related to these goals or carried over from previous milestones.
- 5 days before milestone starting date: Planning is reviewed during the team sync meeting.
- On milestone starting date: Milestone goals and related epics and issues should be finalized and prioritized. All planned work can be seen on the Milestone Board. Previous milestone issues are moved to the new milestone or backlog.
- During the milestone, we analyze progress and reprioritize as needed.
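The offsets in the timeline above can be turned into calendar dates with a small helper. This is a minimal sketch: the function name `planning_timeline` and the example start date are hypothetical, and the real start date should come from the gitlab.org group milestones.

```python
from datetime import date, timedelta

# Days before the milestone starting date, as listed in the planning timeline.
PLANNING_OFFSETS = {
    "Planning draft issue created by PM/EM": 10,
    "Planning draft shared with team": 8,
    "Planning reviewed during team sync": 5,
    "Milestone goals finalized and prioritized": 0,
}


def planning_timeline(milestone_start: date) -> dict[str, date]:
    """Return the calendar date of each planning step for a given start date."""
    return {
        step: milestone_start - timedelta(days=offset)
        for step, offset in PLANNING_OFFSETS.items()
    }


if __name__ == "__main__":
    # Hypothetical start date, for illustration only.
    for step, day in planning_timeline(date(2024, 6, 15)).items():
        print(f"{day.isoformat()}  {step}")
```

The offsets are calendar days here; if the team counts working days instead, the subtraction would need to skip weekends.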
How to find something to work on?
Normally at the beginning of the Milestone the EM will discuss an overview of the work and what relevant areas you will focus on. Sometimes issues will already be assigned to you before the Milestone begins.
If you are ever looking for additional issues to work on:
- Look at the Platform Insight Milestone board
- Identify an issue that is unassigned.
- Assign yourself to the issue.
- Add a `~workflow::in dev` label to the issue.
  - If the scope or description is unclear, connect with the EM and/or PM for clarification or (if feeling confident) groom the issue yourself and proceed.
- Begin working on the issue.
- Once all relevant MRs are merged, set the `~workflow::verification` label.
  - Ensure any MRs do not auto-close issues. (Use `Relates to #11111` rather than `Closes #11111` in MR descriptions.)
- Verify the changes and comment on the issue which environment you used for verification, for example `Verified on production`.
- Close the issue! 🎉
- Repeat.
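To catch MR descriptions that would auto-close an issue before verification, a quick local check like the one below may help. It is only a sketch: the keyword list is an assumed, simplified subset of the closing keywords GitLab recognizes, the pattern only handles plain `#123` references, and `find_auto_close_references` is a hypothetical helper, not part of any GitLab tooling.

```python
import re

# Assumption: a simplified subset of the keywords GitLab treats as issue-closing.
CLOSING_KEYWORDS = (
    "close", "closes", "closed", "closing",
    "fix", "fixes", "fixed", "fixing",
    "resolve", "resolves", "resolved", "resolving",
)

# Matches a closing keyword followed by a same-project issue reference like "#11111".
CLOSING_PATTERN = re.compile(
    r"\b(?:" + "|".join(CLOSING_KEYWORDS) + r")\b:?\s+#\d+",
    re.IGNORECASE,
)


def find_auto_close_references(description: str) -> list[str]:
    """Return every phrase in an MR description that looks like an auto-close reference."""
    return [match.group(0) for match in CLOSING_PATTERN.finditer(description)]


if __name__ == "__main__":
    print(find_auto_close_references("Closes #11111"))      # flagged
    print(find_auto_close_references("Relates to #11111"))  # not flagged
```

An empty result means the description uses only non-closing references such as `Relates to #11111`.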
How to enable Observability Beta for a customer?
To enable access to Logs, Tracing, and Metrics Beta for a certain customer, follow this process:
For SaaS:
- Beforehand, make sure you have the right access and permissions to run ChatOps commands, as detailed in this page.
- Ask the customer for their top-level group name (example: `gitlab-org` for https://gitlab.com/gitlab-org/).
- In #production, run the following command to enable the feature flag for this group (replace `gitlab-org` with the customer’s group name):
/chatops run feature set --group=gitlab-org observability_features true
To see the list of groups that have been already enabled, you can run the following command:
/chatops run feature get observability_features
The list returns group IDs and not group names though. To know a group’s ID, browse to the group’s page (example), open the “…” menu on the top-right of the page and select “Copy group ID”. If you don’t have access to the group, ask the customer to do it.
Learn more: see related feature flag issue.
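When a customer sends a group URL rather than a name, the ChatOps commands above can be assembled with a small helper. This is a sketch under stated assumptions: `top_level_group`, `enable_command`, and `list_command` are hypothetical names, and the helper only builds the command text to paste into #production; it does not call Slack or GitLab.

```python
from urllib.parse import urlparse

FEATURE_FLAG = "observability_features"  # flag name taken from the commands above


def top_level_group(url_or_name: str) -> str:
    """Extract the top-level group path from a group URL or a plain group path."""
    path = urlparse(url_or_name).path if "://" in url_or_name else url_or_name
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no group name found in {url_or_name!r}")
    return segments[0]


def enable_command(url_or_name: str) -> str:
    """Build the ChatOps command that enables the flag for the customer's group."""
    group = top_level_group(url_or_name)
    return f"/chatops run feature set --group={group} {FEATURE_FLAG} true"


def list_command() -> str:
    """Build the ChatOps command that lists where the flag is already enabled."""
    return f"/chatops run feature get {FEATURE_FLAG}"


if __name__ == "__main__":
    print(enable_command("https://gitlab.com/gitlab-org/"))
    print(list_command())
```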
For Self-Managed:
- Not available for now.
Dashboards