Environment Automation Team

Summary

Environment Automation is a team within the Dedicated Group. Our mission is to develop and operate the automated plumbing of the GitLab Dedicated solution.

We follow the processes listed on the Dedicated Group page, unless this page explicitly notes a difference.

Team Members

Name                 Role
Oriol Lluch          Manager, Infrastructure
Andy Knight          Staff Site Reliability Engineer, Environment Automation
Ermia Qasemi         Site Reliability Engineer
Muhamed Huseinbašić  Site Reliability Engineer, Environment Automation
Samir Hafez          Senior Site Reliability Engineer
Nick Skoretz         Site Reliability Engineer
Tania Roblot         Senior Site Reliability Engineer, Environment Automation

Name                 Role
Stephen Denham       Manager, Dedicated:Environment Automation
Brendan McKitrick    Senior Site Reliability Engineer
Corey Cross          Site Reliability Engineer, Environment Automation
Harpratap Singh      Site Reliability Engineer
Konst Tchernov       Senior Site Reliability Engineer
Riccardo Trivellato  Site Reliability Engineer, Environment Automation
Stephan Breitrainer  Senior Site Reliability Engineer, Environment Automation
Veronica Mondo       Senior Site Reliability Engineer, Environment Automation

Working with us

To engage with the Environment Automation team:

How We Work

Our preference is to work asynchronously, within our project issue tracker as described in the project management section.

The team also has a set of regular synchronous calls:

  1. Environment Automation Team Sync (alternate weeks):
    1. EMEA/AMER: Tue 15:00 UTC (Good for EMEA and US East)
    2. PST/APAC: Wed 00:00 UTC (Good for APAC and US West)
  2. Dedicated on GCP - Weekly Demo: Wed 07:30 UTC
  3. Dedicated Group Demo

Reviewer roulette

Reviewer roulette is an internal tool for GitLab.com projects that randomly picks a maintainer and a reviewer. Environment Automation uses it to spread the MR review workload. To use it:

  1. Go to the reviewer roulette page.
  2. Click on Spin the wheel.

See the full MR process.
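
To illustrate the idea of a random maintainer-and-reviewer pick, here is a minimal Python sketch. It is not the reviewer roulette implementation: the `TeamMember` type, the `spin_the_wheel` function, the rule that the MR author is excluded, and the rule that the reviewer may be any other eligible member are all assumptions made for illustration, and the names used are placeholders.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamMember:
    # Hypothetical record; the real tool's data model is not described here.
    name: str
    is_maintainer: bool


def spin_the_wheel(members, author, rng=None):
    """Pick one maintainer and one distinct reviewer at random.

    Assumption: the MR author is never picked for either role.
    """
    if rng is None:
        rng = random.Random()

    # Everyone except the MR author is a candidate.
    candidates = [m for m in members if m.name != author]

    maintainers = [m for m in candidates if m.is_maintainer]
    if not maintainers:
        raise ValueError("no eligible maintainer")
    maintainer = rng.choice(maintainers)

    # The reviewer is any remaining candidate other than the chosen maintainer.
    reviewers = [m for m in candidates if m != maintainer]
    if not reviewers:
        raise ValueError("no eligible reviewer")
    reviewer = rng.choice(reviewers)

    return maintainer, reviewer


if __name__ == "__main__":
    team = [
        TeamMember("alice", True),
        TeamMember("bob", False),
        TeamMember("carol", True),
        TeamMember("dave", False),
    ]
    maintainer, reviewer = spin_the_wheel(team, author="bob")
    print(f"maintainer: {maintainer.name}, reviewer: {reviewer.name}")
```

Because each pick is uniform over the eligible members, review load evens out over many MRs without anyone having to track whose turn it is.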

Example responses

Here are some concrete examples of responses to capacity planning alerts.

  • Remove a metric from capacity planning - Advanced search memory pressure does not follow long-term trends, so its forecasts were not useful. It remains a metric that is alerted on if it exceeds practical limits.
  • Remove a saturation metric entirely - kube_pool_cpu was incorrect in many cases and difficult to get right. It needed to be replaced with a different saturation metric (node-based CPU).
  • Add saturation metrics - Kubernetes PVCs were not being monitored at all, leading to near-miss incidents.
  • Fix a saturation metric - Advanced search disk was inaccurate and needed to be replaced with better PromQL expressions.