Environment Automation Team
Summary
Environment Automation is a team within the Dedicated Group. Our mission is to develop and operate the automated plumbing of the GitLab Dedicated solution.
We follow the same processes as the Dedicated Group, unless a difference is explicitly noted on this page.
Team Members
Working with us
To engage with the Environment Automation team:
- Create an issue in the GitLab Dedicated team issue tracker
- Label the issue with workflow-infra::Triage and team::Environment Automation
- When creating an issue, it is not necessary to @-mention anyone. If you want to get the team's attention, use the specific team handle (for example, @gitlab-dedicated/environment-automation) as defined in the Dedicated group hierarchy
- Slack channels
  - For Environment Automation specific questions, you can find us in #g_dedicated-environment-automation-team
  - Our Slack group handle is @dedicated-envauto-team
  - The Dedicated Group as a whole leverages #g_dedicated-team
  - Other teams in the Dedicated group have their own work channels for team work discussions
How We Work
Our preference is to work asynchronously, within our project issue tracker as described in the project management section.
The team also has a set of regular synchronous calls:
- Environment Automation Team Sync (alternate weeks):
- EMEA/AMER: Tue 15:00 UTC (Good for EMEA and US East)
- PST/APAC: Wed 00:00 UTC (Good for APAC and US West)
- Dedicated on GCP - Weekly Demo: Wed 07:30 UTC
- Dedicated Group Demo
Reviewer roulette
Reviewer roulette is an internal tool used on GitLab.com projects that randomly picks a reviewer and a maintainer. Environment Automation uses it to spread the MR review workload. To do so:
- Go to the reviewer roulette page.
- Click on Spin the wheel.
See the full MR process.
Example responses
Here are some concrete examples of responses to capacity planning alerts.
- Remove a metric from capacity planning - Advanced search memory pressure does not follow long-term trends and was not a useful prediction. It remains a metric that is alerted on if it exceeds practical limits.
- Remove a saturation metric entirely - kube_pool_cpu was incorrect in many cases and difficult to get right. It needed to be replaced with a different saturation metric (node-based CPU).
- Add a saturation metric - Kubernetes PVCs were not being monitored at all, leading to near-miss incidents.
- Fix a saturation metric - Advanced search disk was inaccurate and needed to be replaced with better PromQL expressions (d6748148).