Root Cause Analysis

At GitLab, transparency is one of our core values, as it helps create an open and honest working environment and service, which in turn accelerates growth and innovation. We treat a root cause analysis (RCA) as an opportunity to be transparent within our organization and community by investigating what went well and what didn’t after working on a project, incident, or issue. This page defines an RCA, describes the benefits of completing one, and explains how to complete a successful RCA here at GitLab.

Any GitLab team-member can perform an RCA on issues they’re responsible for, as there is no wrong time to learn from our mistakes and successes.

What is a Root Cause Analysis?

A Root Cause Analysis (RCA) is the process of finding the source of failures and accomplishments after completing a project. While RCAs are common after incidents, they are not limited to incident management. An RCA can be done after any project, whether technical or non-technical. While an RCA can be done in any manner, there is an issue template that consolidates input from multiple teams. Beyond Engineering, Customer Success also provides a great overview of their RCA process.

Your process may be different - but the RCA template is a great starting point.

How are RCAs beneficial?

RCAs are an opportunity to learn from what went well and what didn’t during our workflows; they are not, however, used to assign blame or point fingers. These are blameless reviews of the workflows and processes followed during a project, intended to enhance learning from our experiences and enable more dynamic iteration on our processes. Issues can be worked on individually or in a team, and often involve cross-team collaboration. RCAs allow everyone across the entire organization to learn from the mistakes and successes, no matter their involvement (or lack thereof) in the issue. GitLab uses RCAs to consolidate project information, which further improves their utility as a single source of truth for a project after its completion.

While each team is unique in its function, the ability to learn from our past performance across the GitLab organization allows us to transform our approach to solving problems, iterate on our processes based on data, and avoid making the same mistakes over and over again.

How to perform an RCA

While RCAs can take any shape or form, we have tried to consolidate the process here at GitLab so that anyone reviewing an RCA can easily access and benefit from the information in it. Different teams may require different sections in an RCA - but overall, please try to keep the RCA format similar enough that cross-team review of the RCA is not hindered.

To perform an RCA, use the template provided below and open an issue in your team’s issue tracker (you can also use the template to create your own issue template in your team’s tracker). By going step-by-step through the issue, answering all of the questions, and collecting all of the information, we aim to maximize the effectiveness of RCAs and make the process as repeatable as possible.

Please reference the RCA template for a step-by-step overview of questions to answer in an RCA.

An unplanned upgrade stop has its own template for conducting a Root Cause Analysis. Please refer to the Unplanned upgrade stop page to learn more.

For a more in-depth overview of an established RCA process, please review this handbook page.

Communicate your findings

Root cause analysis findings are useful beyond those participating in the project and analysis. Sharing these findings helps others learn from the experience. Utilize multimodal communication to share your RCA findings including:

Notes

The following links expand upon RCAs, what they are, how to perform them, and their value:

Last modified July 9, 2024: Fix links and spelling (e30f31b6)