Engineering Productivity team
The Engineering Productivity team maximizes the value and throughput of Product Development teams and wider community contributors by improving the developer experience, streamlining the product development processes, and keeping projects secure, compliant, and easy to work on for everyone.
Mission
- Continuously improve efficiency across our engineering and product teams to increase customer value.
- Measure what matters: quality of life, efficiency, and toil reduction improvements with quantitative and qualitative measures.
- Build partnerships across organizational boundaries to deliver maintainability and efficiency improvements for all stakeholders.
Vision
The Engineering Productivity team’s vision is to focus on the satisfaction of the Product Development teams and wider community contributors while keeping GitLab projects
secure, compliant, and easy to work on.
Integral parts of this vision:
- Developer experience: Provide stable development environments and tools, as well as a consistent and streamlined contributing experience.
- Product development processes: Help product and engineering managers see the whole picture of their group’s bugs, feature proposals, and planned and started work,
and automate issue and merge request hygiene (labels, milestones, staleness, etc.).
- Maintainability and security of GitLab’s projects: Enforce configuration consistency (project settings, CI/CD pipelines) for all GitLab projects,
including JiHu, to ensure they’re maintainable, compliant, and secure in the long term.
Our principles
- See it and find it: Build automated measurements and dashboards to gain insights into the productivity of the Engineering organization to identify opportunities for improvement.
  - Implement new measurements to provide visibility into improvement opportunities.
  - Collaborate with other Engineering teams to provide visualizations for measurement objectives.
  - Improve existing performance indicators.
- Do it for any contributor: Increase contributor productivity by making measurement-driven improvements to the development tools / workflow / processes, then monitor the results, and iterate.
- Dogfood use: Dogfood GitLab product features to improve developer workflow and provide feedback to product teams.
  - Use new features from related product groups (Analytics, Monitor, Testing).
  - Improve usage of Review apps for GitLab development and testing.
- Engineering support:
  - Engineering workflow: Develop automated processes for improving label classification hygiene in support of product and Engineering workflows.
- Dogfood build: Enhance and add new features to the GitLab product to ultimately improve productivity and efficiency of GitLab customers.
Areas of responsibilities
graph LR
A[Engineering Productivity Team]
A --> C[Developer Experience]
C --> C1[GitLab Development Kit<br>Providing a reliable development environment]
click C1 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/31"
C --> C2[GitLab Remote Development<br>Providing a remote reliable development environment]
click C2 "https://gitlab.com/groups/gitlab-org/-/epics/11799"
C --> C3[Merge Request Review Process<br>Ensuring a smooth, fast and reliable review process]
click C3 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/34"
C --> C4[Merge Request Pipelines<br>Providing fast and reliable pipelines]
click C4 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/28"
C --> C5[Automated main branch failing pipelines management<br>Providing a stable `master` branch]
click C5 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/30"
C --> C6[Review apps<br>Providing review apps to explore a merge request's changes]
click C6 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/33"
A --> B[Product development processes]
B --> B1[Weekly team reports<br>Providing teams with an overview of their current, planned & unplanned work]
click B1 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/32"
B --> B2[Issues & MRs hygiene automation<br>Ensuring healthy issue/MR trackers]
click B2 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/32"
B --> B3[Metrics and dashboards<br>Making data-driven decisions]
click B3 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/36"
A --> D[Maintainability and security of GitLab's projects]
D --> D1[Automated dependency updates<br>Ensuring dependencies are up-to-date]
click D1 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/40"
D --> D2[Automated management of CI/CD secrets<br>Providing a secure CI/CD environment]
click D2 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/46"
D --> D3[Static analysis<br>Ensuring the codebase style and quality is consistent and reducing bikeshedding]
click D3 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/38"
D --> D4[Shared CI/CD components<br>Providing CI/CD components to ensure consistency in all GitLab projects]
click D4 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/41"
D --> D5[Support of the JiHu development team]
click D5 "https://gitlab.com/groups/gitlab-org/quality/engineering-productivity/-/epics/35"
Team structure
Members
Stable counterpart
Metrics
KPIs
Infrastructure Performance Indicators are our single source of truth
PIs
SPACE
Shared
Dashboards
The Engineering Productivity team creates metrics in the following sources to aid in operational reporting.
OKRs
Objectives and Key Results (OKRs) help align our sub-department towards what really matters. These happen quarterly and are based on company OKRs. We follow the OKR process defined here.
Here is an overview of our current OKRs.
Communication
Office hours
Engineering Productivity holds office hours on the 3rd Wednesday of even months (e.g., February, April) at 3:00 UTC (20:00 PST). Anyone can add topics or questions to the agenda. Office hours can be found in the GitLab Team Meetings calendar.
Meetings
Engineering Productivity has a weekly team meeting on Wednesdays at 15:00 UTC (08:00 PST).
Communication guidelines
The Engineering Productivity team will make changes which can create notification spikes or new behavior for
GitLab contributors. The team will follow these guidelines in the spirit of GitLab’s Internal Communication Guidelines.
Pipeline changes
Critical pipeline changes
Pipeline changes that have the potential to have an impact on the GitLab.com infrastructure should follow the Change Management process.
Non-critical pipeline changes
The team will communicate significant pipeline changes in the #development Slack channel and in the Engineering Week in Review.
Pipeline changes that meet the following criteria will be communicated:
- addition, removal, renaming, parallelization of jobs
- changes to the conditions to run jobs
- changes to pipeline DAG structure
Other pipeline changes will be communicated based on the team’s discretion.
Automated triage policies
Give a heads-up in the #development, #eng-managers, #product, and #ux Slack channels and in the Engineering Week in Review when an automation is expected to trigger more than 50 notifications or to change policies that a large stakeholder group uses (e.g., the team-triage report).
Experiments
This is a list of Engineering Productivity experiments where we identify an opportunity, form a hypothesis and experiment to test the hypothesis.
Last reviewed: 2021-01-16
- GDK Project
- Issue List
- Epic List
- Please comment, thumbs-up (or down!), and contribute to the linked issues and epics on this category page. Sharing your feedback directly on GitLab.com is the best way to contribute to our vision.
- Please share feedback directly via email or Twitter. There’s also a Discord #contribute channel where you can give us feedback and ask questions.
- If you’re a GDK user, we’d always love to hear from you!
Guidelines for project management for the Engineering Productivity team at GitLab
Introduction
A flaky test is an unreliable test that occasionally fails but passes eventually if you retry it enough times.
In a test suite, flaky tests are inevitable, so our goal should be to limit their negative impact as soon as possible.
Of all the factors that affect master pipeline stability, flaky tests account for at least 30% of master pipeline failures each month.
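The definition above can be illustrated with a minimal sketch: a test that fails and then passes on retry, with no code change in between, is treated as flaky. The `classify_test` function and the `run_test` callable are hypothetical names used only for illustration. They are not part of any GitLab tooling.

```python
from typing import Callable


def classify_test(run_test: Callable[[], bool], max_retries: int = 2) -> str:
    """Classify one test as 'passed', 'flaky', or 'failed'.

    run_test is a hypothetical callable that returns True when the test passes.
    """
    if run_test():
        return "passed"
    # The first attempt failed: retry against the same code.
    for _ in range(max_retries):
        if run_test():
            # Failed, then passed without any code change: the test is flaky.
            return "flaky"
    return "failed"


# Simulated test that fails on its first attempt and passes on the retry.
attempts = iter([False, True])
result = classify_test(lambda: next(attempts))
print(result)  # flaky
```

A test that keeps failing after every retry is reported as `failed`, which separates a real regression from flakiness in this simplified model.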
Current state and assumptions
| Current state | Assumptions |
| --- | --- |
| master success rate was at 89% for March 2024 | We don’t know exactly what the success rate would be without any flaky tests, but we assume we could attain 99% |
| 5200+ ~"failure::flaky-test" issues out of a total of 260,040 tests as of 2024-03-01 | This means we identified about 2% of tests as flaky. GitHub identified that 25% of their tests were flaky at some point; our reality is probably somewhere in between. |
| Coverage is currently at 98.42% | Even if we removed the 5200 flaky tests, we don’t expect the coverage to go down meaningfully. |
| “Average Retry Count” per pipeline is currently at 0.015. Given RSpec jobs’ current average duration of 23 minutes, this results in an additional 0.015 * 23 = 0.345 minutes on average per pipeline, not including the idle time between the job failing and the time it is retried. Explanation provided by Albert. | Given we have approximately 91k pipelines per month, flaky tests are wasting 31,395 CI minutes per month. Given our private runners cost us $0.0845 / minute, this means flaky tests are wasting at minimum $2,653 per month of CI minutes. This doesn’t take into account the engineers’ time wasted. |
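The cost estimate above can be reproduced step by step. The short script below is only a worked version of that arithmetic, using the figures quoted above. The variable names are illustrative.

```python
# Figures quoted in the current-state and assumptions text above.
average_retry_count = 0.015    # retries per pipeline
rspec_job_minutes = 23         # average RSpec job duration, in minutes
pipelines_per_month = 91_000   # approximate number of pipelines per month
cost_per_minute_usd = 0.0845   # private runner cost per CI minute

# Extra CI time caused by retries, ignoring idle time before a retry starts.
extra_minutes_per_pipeline = average_retry_count * rspec_job_minutes
wasted_minutes_per_month = extra_minutes_per_pipeline * pipelines_per_month
wasted_usd_per_month = wasted_minutes_per_month * cost_per_minute_usd

# Share of tests with a flaky-test issue (5200+ issues out of 260,040 tests).
flaky_share = 5200 / 260_040

print(f"Extra minutes per pipeline: {extra_minutes_per_pipeline:.3f}")
print(f"Wasted CI minutes per month: {wasted_minutes_per_month:,.0f}")
print(f"Wasted cost per month: ${wasted_usd_per_month:,.0f}")
print(f"Share of tests identified as flaky: {flaky_share:.2%}")
```

Because the idle time between a failure and its retry and the engineers' time are excluded, the resulting cost is a lower bound.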
Manual flow to detect flaky tests
When a flaky test fails in an MR, the author might follow the following flow:
Guidelines for triaging new issues opened on GitLab.com projects
Satisfaction
Activity
Collaboration
Efficiency
Metrics for satisfaction
Quarterly Engineering satisfaction survey NPS score
Score taken from the Quarterly Engineering satisfaction survey
Introduction
As the owner of pipeline configuration for the GitLab project, the Engineering Productivity team has adopted several test intelligence strategies aimed at improving pipeline efficiency, with the following benefits:
- Shortened feedback loop by prioritizing tests that are most likely to fail
- Faster pipelines to scale better when Merge Train is enabled
These strategies include:
- Predictive test jobs via test mapping
- Fail-fast job
- Re-run previously failed tests early
- Selective jobs via pipeline rules
- Selective jobs via labels
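As an illustration of the first strategy in the list above, predictive test selection can be sketched as a lookup from changed files to the tests that cover them. This is a simplified sketch under stated assumptions, not the team's actual implementation: the mapping contents, the file paths, and the `predictive_tests` function are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from source files to the spec files that cover them.
TEST_MAPPING: Dict[str, List[str]] = {
    "app/models/user.rb": [
        "spec/models/user_spec.rb",
        "spec/features/login_spec.rb",
    ],
    "app/models/project.rb": ["spec/models/project_spec.rb"],
}


def predictive_tests(
    changed_files: Iterable[str], mapping: Dict[str, List[str]]
) -> List[str]:
    """Return the sorted, de-duplicated tests mapped to the changed files."""
    selected: Set[str] = set()
    for path in changed_files:
        if path.endswith("_spec.rb"):
            # Assumption: a changed spec file is always selected itself.
            selected.add(path)
        # Files missing from the mapping contribute no tests in this sketch.
        selected.update(mapping.get(path, []))
    return sorted(selected)


selected = predictive_tests(
    ["app/models/user.rb", "spec/models/project_spec.rb"], TEST_MAPPING
)
print(selected)
```

Running only the selected tests shortens the feedback loop. A real setup would also need a fallback, such as the full suite, for changes that the mapping does not cover.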
Predictive test jobs via test mapping
Tests that provide coverage to the code changes in each merge request are most likely to fail. As a result, merge request pipelines for the GitLab project run only the predictive set of tests by default. These include:
Automation and tooling for processing un-triaged issues at GitLab
Guidelines for triaging new merge requests from the wider community opened on GitLab.com projects
The Engineering Productivity team owns the tooling and processes for GitLab’s internal workflow automation. Triage-ops is one of the main projects the EP team maintains, which empowers GitLab team members to triage issues, MRs and epics automatically.
One-off label migrations
When team structures change, we often need to run a one-off label migration to update labels on existing issues, MRs and epics. We encourage every team member to perform the migrations themselves for maximum efficiency. For the fastest result, follow the instructions below to get started on a label migration merge request. The EP team can then help review and run the migrations if needed.