Triage Operations

Automation and tooling for processing un-triaged issues at GitLab

Any GitLab team-member can triage issues. Keeping the number of un-triaged issues low is essential for maintainability, and is our collective responsibility.

We have implemented automation and tooling to handle this at scale and distribute the load to each team or group.

Video introduction to triage operations, triage report, priority and severity labels.

Accountability

The Quality Engineering Department ensures that every Product and Engineering group is held accountable to deliver on the SLA set forth.

Our defect SLA can be viewed at:

The Quality Engineering department employs a number of tools and automation in addition to manual intervention to help us achieve this goal. The work in this area can be seen in our department roadmap under the Triage and Measure tracks of work.

Label renaming

There is a large amount of automation that uses stage, group, and category labels. We ask that Product Managers create an issue in triage-ops when any of the following changes occur. This issue helps ensure limited to no impact to automation and reports.

Auto-labelling of issues and merge requests

Our triage bot will automatically infer section, stage, and group labels based on the category/feature already set on an issue or MR. This is available for open issues/MRs within the gitlab-org group.

The most important rules are:

  • The bot doesn’t change a stage or group label if the stage or group is listed in stages.yml and the label is already set.
  • A group label is chosen only if the highest group match from its category labels is > 50%.
  • A group label is chosen only if it matches the already set stage label (if applicable).
  • A stage label is set based on the chosen or already set group label.
  • A section label is set based on the chosen or already set group or stage label.
  • The bot leaves a message that explains its inference logic.

The following logic was initially implemented in this merge request:

graph TB;
  A{Stage label<br>is present?} -- Yes --> B;
  B{Group label<br>is present?} -- Yes --> D;
  B -- No --> E;
  D{Group has<br>one category?} -- Yes --> X9[Set category label.];
  D -- No --> X1[Nothing to do.];
  E{Group is detected based on category labels<br>with a match rate > 50% among<br>all category labels?} -- Yes --> H;
  E -- No --> K;
  H{Does detected group label<br>match stage label?} -- Yes --> X2[Set detected<br>group label.];
  H -- No --> K;
  K{Several potential groups in<br>current stage detected<br>from category labels?} -- Yes --> X3[Manual triage<br>required.];
  K -- No --> L;
  L{Does the stage have<br>a single group?} -- Yes --> X4[Set this<br>group label.];
  L -- No --> X5[Manual triage<br>required.];
  A -- No --> C;
  C{Group label<br>is present?} -- Yes --> F;
  F{Group has<br>one category?} -- Yes --> X10[Set stage and category labels<br>based on group label,<br>we're done!];
  F -- No --> X6[Set stage label<br>based on group label,<br>we're done!];
  C -- No --> G;
  G{Group is detected based on category labels<br>with a match rate > 50% among<br>all category labels?} -- Yes --> X7[Set group and<br>stage labels.];
  G -- No --> X8[Manual triage<br>required.];

After the above inference is done, a section label will be added based on the stage or group label. An explanation will not be added in this step if the inferred labels contain only a section label.
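
The flow above can be sketched in a few lines of Python. This is a minimal illustration and not the triage-ops implementation: the two mapping tables are hypothetical stand-ins for the stage definitions, `infer_labels` and its helpers are made-up names, and the final section-label step is left out.

```python
from collections import Counter

# Hypothetical miniature of the stage definitions: not the real mapping.
CATEGORY_TO_GROUP = {
    "Category:Code Review": "group::code review",
    "Category:Source Code": "group::source code",
    "Category:Wiki": "group::knowledge",
    "Category:Pages": "group::knowledge",
}
GROUP_TO_STAGE = {
    "group::code review": "devops::create",
    "group::source code": "devops::create",
    "group::knowledge": "devops::plan",
}
MANUAL_TRIAGE = None  # sentinel: the bot cannot decide on its own


def detect_group(category_labels):
    """Return the group matched by more than 50% of the category labels, else None."""
    groups = [CATEGORY_TO_GROUP[c] for c in category_labels if c in CATEGORY_TO_GROUP]
    if not groups:
        return None
    group, hits = Counter(groups).most_common(1)[0]
    return group if hits / len(category_labels) > 0.5 else None


def single_category(group):
    """Return the group's only category label, or None if it has several."""
    categories = [c for c, g in CATEGORY_TO_GROUP.items() if g == group]
    return categories[0] if len(categories) == 1 else None


def infer_labels(stage, group, category_labels):
    """Return the labels to add ([] means nothing to do), or MANUAL_TRIAGE."""
    if group:
        # Group already set: derive a missing stage and a lone category from it.
        labels = [] if stage else [GROUP_TO_STAGE[group]]
        category = single_category(group)
        if category and category not in category_labels:
            labels.append(category)
        return labels
    detected = detect_group(category_labels)
    if not stage:
        return [detected, GROUP_TO_STAGE[detected]] if detected else MANUAL_TRIAGE
    if detected and GROUP_TO_STAGE[detected] == stage:
        return [detected]
    # No usable majority: look at every group the categories hint at in this stage.
    candidates = {
        CATEGORY_TO_GROUP[c] for c in category_labels if c in CATEGORY_TO_GROUP
    }
    if len([g for g in candidates if GROUP_TO_STAGE[g] == stage]) > 1:
        return MANUAL_TRIAGE
    stage_groups = [g for g, s in GROUP_TO_STAGE.items() if s == stage]
    return stage_groups if len(stage_groups) == 1 else MANUAL_TRIAGE
```

With these sample tables, two category labels that point to two different groups give a 50% match rate, which is not above the threshold, so the sketch falls back to manual triage.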

Check out the list of actual use-cases to better understand what this flow means in practice.

If your issue/MR doesn’t belong to a particular stage, you can remove the stage label and add the ~"automation:devops-mapping-disable" label to prevent this automation from happening in the future.

Triage reports

A triage report is an issue containing a checklist of issues or merge requests requiring attention. Usually, each task corresponds to an issue or a merge request that needs labels, prioritization, scheduling, or other attention. Some reports also include heatmaps or other information.

Triage reports are automatically assigned to specific team members, listed in the stages definition file.

To change who an issue gets assigned to, open a merge request for the above files. If the group definition file is changed, we’ll need to run some scripts to update the generated files as well.

These reports are owned by the Contributor Success team.

Newly created community merge requests

This report contains community merge requests requiring partial triage. The goal is for coaches to add type, stage, and group labels, so that the relevant people can be pinged later on based on these labels.

Community merge requests requiring attention

This report contains community merge requests that may require some attention from GitLab team members.

Team reports

Group level bugs, features, and Deferred UX

This report contains the relevant bugs, feature requests, and Deferred UX issues that belong to a group in our DevOps stages. The goal is to achieve complete triage by the Product Manager, Engineering Manager, and UX team member in that area.

The report itself is divided into the following main parts.

  • Feature proposals
  • Deferred UX issues
  • Frontend bugs
  • Bugs (likely backend)
  • ~priority::1 and ~priority::2 bugs past the target SLO.

The bug sections also contain a heatmap.

An example: https://gitlab.com/gitlab-org/quality/triage-ops/issues/118

Video overview of the triage report.

There is also an optional stage policy for missing categories. If your team has enabled this, you will receive a list of up to 100 items that have the stage label but have zero appropriate category labels for that stage.

Feature proposals

This section contains issues with the ~"type::feature" label without a milestone. It is divided further into issues with and without the ~"customer" label.

  • Triage owner: Product Manager(s) for that group.
  • Triage actions:
    1. If the issue is a duplicate or irrelevant, close the issue out.
    2. Assign a milestone: a versioned milestone, Backlog, or Awaiting further demand.

Frontend bugs

This section contains issues with the ~"type::bug" and ~"frontend" labels without priority and severity. It is divided further into issues with and without the ~"customer" label.

  • Triage owner: Frontend Engineering Manager(s) for that group.
  • Triage actions:
    1. Close the issue if it is no longer relevant or a duplicate.
    2. Assign a Priority Label.
    3. Assign a Severity Label.
    4. Assign either a versioned milestone or the Backlog.

Non-frontend bugs (likely backend)

This section contains issues with the ~"type::bug" label without priority and severity. It is divided further into issues with and without the ~"customer" label.

  • Triage owner: Backend Engineering Manager(s) for that group.
  • Triage actions:
    1. Close the issue if it is no longer relevant or a duplicate.
    2. Assign a Priority Label.
    3. Assign a Severity Label.
    4. Assign either a versioned milestone or the Backlog.

severity::1 & severity::2 bugs past SLO

This section contains bugs that have passed our targeted SLO based on the severity label set. This is based on our missed SLO detection triage policy.

Heatmap for ~customer bugs

This section contains a table displaying the open issues for a group labelled with ~"customer" and ~"bug". There is a breakdown by the assigned severity and priority labels.

Group level merge requests that may need attention

This report contains idle group merge requests authored by GitLab team members.

Merge requests are considered idle when they have had no human activity for 28 days. This report collects them to prompt actions that move the MR forward, such as nudging the author, reviewer, or maintainer.

  • Triage owner: Engineering Manager(s) for that group.
  • Triage frequency: On 8th and 23rd every month.
  • Triage actions:
    1. Review these merge requests to identify if there are any steps that can shorten the time to merge. Steps can be:
      1. Reminding the author about it.
      2. Changing the DRI.
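
As a rough illustration of the idle rule and the triage schedule above, here is a minimal Python sketch. The function names are hypothetical, and it assumes the date of the last human activity has already been extracted from the MR; it is not how the report is actually generated.

```python
from datetime import date, timedelta

IDLE_AFTER = timedelta(days=28)  # "no human activity for 28 days"
TRIAGE_DAYS = (8, 23)            # "On 8th and 23rd every month"


def is_idle(last_human_activity, today):
    """True when the MR has had no human activity for 28 days or more."""
    return today - last_human_activity >= IDLE_AFTER


def is_triage_day(today):
    """True on the 8th and 23rd of each month."""
    return today.day in TRIAGE_DAYS
```

For example, an MR last touched by a human on 2020-10-01 counts as idle on 2020-11-08, while one touched on 2020-10-20 does not.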

An example report: Merge requests requiring attention for group::access - 2020-11-08. Current reports can be found in the triage-reports project.

Group level feature flags that may need attention

This report contains feature flags that have been enabled in the codebase for 2 or more releases for groups within our DevOps stages.

The DRI is responsible for reviewing these feature flags to determine whether they can be removed entirely, or for creating separate issues to ensure the overdue feature flags are removed accordingly.

  • Triage owner: Engineering Manager(s) for that group.
  • Triage frequency: On 1st of every month.
  • Triage actions:
    1. Review the feature flags to identify whether they can be:
      1. Removed by the Engineering DRI.
      2. Tracked with a separate issue for removal to be scheduled by the PM for the group.

An example report: Feature Flags requiring attention for group::continuous integration - 2021-03-01. Current reports can be found in the triage-reports project.

The feature flag triage reports are generated in a quality toolbox scheduled pipeline with the gitlab-feature-flag-alert project.

Group level Bug Prioritization report

This report contains, for each group, the top 10 open ~"type::bug" issues that need to be prioritized for the upcoming milestone. It is divided further into issues with ~"severity::, ~"bug::vulnerability" and ~"customer" labels, listed by age with the oldest first.

An example report: 2023-11-01 - Bugs Prioritization for “group::source code” for upcoming milestone - 16.7. Current reports can be found in the triage-reports project.

Auto closure of triage reports

Reports open for more than 2 weeks with the ~"triage report" label will be closed automatically with the close old triage reports automation.

Reactive workflow automation

Reactive triage automation is complementary to scheduled triage automation where realtime feedback provides an improved developer experience. This is handled by triage-ops.

Note: reactive command arguments between brackets ([]) are optional.

The following diagram shows how all the automations fit together:

graph LR
    classDef triageOpsClass fill:#FC6D26,stroke:#333,stroke-width:3px;

    MR_INITIAL(["Wider Community Merge request<br />(author is not a member of `gitlab-org`)"])
    MR_COMMUNITY(["Merge request with the `Community contribution` label"])
    MR_OPENED[MR is opened]
    MR_UPDATED[MR is updated]
    MR_MERGED[MR is merged]
    MR_CLOSED[MR is closed]
    MR_AUTHOR_NOTE[MR author posts a note]
    ANYONE_NOTE[Anyone posts a note]
    AUTOMATED_THANK(["1. Post a 'Thank you' note<br/>2. Add the `Community contribution` label<br />3. Add the `workflow::in dev` label<br />4. Assign MR to its author"])
    WORKFLOW_READY_FOR_REVIEW_LABEL{"Was the<br />`workflow::ready for review`<br />label added?"}
    AUTOMATED_REVIEWER_REQUEST_GENERIC(["If reviewers are present, ask them to review.<br />Otherwise, ask (and assign) an MR coach<br />(selected based on group label) to review"])
    AUTOMATED_REVIEW_DOC{"Does the MR touch<br/>documentation files?"}
    AUTOMATED_REVIEWER_REQUEST_DOC(["Post a note asking a<br />technical writer to review"])
    AUTOMATED_REVIEW_UX{"Does the MR have<br />the `UX` label?"}
    AUTOMATED_REVIEWER_REQUEST_UX(["Post a message in the<br />`#ux-community-contributions`<br />Slack channel, and on the MR"])
    AUTOMATED_FEEDBACK_REQUEST(["Post a note asking<br />for feedback"])
    AUTOMATED_HACKATHON_LABEL{Is a Hackathon<br />currently running?}
    AUTOMATED_HACKATHON_LABEL_ADDITION(["Add the `Hackathon` label"])
    WHAT_AUTHOR_NOTE{What note is it?}
    WHAT_ANYONE_NOTE{What note is it?}

    AUTOMATED_LABEL_COMMAND_REPLY(["Add the requested label"])
    AUTOMATED_HELP_COMMAND_REPLY(["Ask (and assign as reviewer)<br />an MR coach for help"])
    AUTOMATED_REVIEW_COMMAND_REPLY(["Add the `workflow::ready for review` label"])
    AUTOMATED_FEEDBACK_COMMAND_REPLY(["Post the feedback in the<br />`#mr-feedback` Slack channel"])

    MR_INITIAL -.-> MR_OPENED
    MR_COMMUNITY -.-> MR_UPDATED & MR_MERGED & MR_CLOSED & MR_AUTHOR_NOTE & ANYONE_NOTE

    MR_OPENED ----> AUTOMATED_THANK
    MR_UPDATED -.-> WORKFLOW_READY_FOR_REVIEW_LABEL
    MR_UPDATED -.-> AUTOMATED_HACKATHON_LABEL
    MR_MERGED & MR_CLOSED ----> AUTOMATED_FEEDBACK_REQUEST
    MR_AUTHOR_NOTE -.-> WHAT_AUTHOR_NOTE
    ANYONE_NOTE -.-> WHAT_ANYONE_NOTE

    WORKFLOW_READY_FOR_REVIEW_LABEL ---> |Yes| AUTOMATED_REVIEWER_REQUEST_GENERIC
    WORKFLOW_READY_FOR_REVIEW_LABEL -.-> |Yes| AUTOMATED_REVIEW_DOC & AUTOMATED_REVIEW_UX
    AUTOMATED_REVIEW_DOC -->|Yes| AUTOMATED_REVIEWER_REQUEST_DOC
    AUTOMATED_REVIEW_UX -->|Yes| AUTOMATED_REVIEWER_REQUEST_UX
    AUTOMATED_HACKATHON_LABEL --->|Yes| AUTOMATED_HACKATHON_LABEL_ADDITION

    WHAT_AUTHOR_NOTE --->|"@gitlab-bot label ..."| AUTOMATED_LABEL_COMMAND_REPLY
    WHAT_AUTHOR_NOTE --->|"@gitlab-bot feedback"| AUTOMATED_FEEDBACK_COMMAND_REPLY

    WHAT_ANYONE_NOTE --->|"@gitlab-bot help"| AUTOMATED_HELP_COMMAND_REPLY
    WHAT_ANYONE_NOTE --->|"@gitlab-bot ready"| AUTOMATED_REVIEW_COMMAND_REPLY

    class AUTOMATED_THANK,AUTOMATED_LABEL_COMMAND_REPLY,AUTOMATED_HELP_COMMAND_REPLY triageOpsClass;
    class AUTOMATED_REVIEW_COMMAND_REPLY,AUTOMATED_FEEDBACK_REQUEST,AUTOMATED_REVIEW_DOC triageOpsClass;
    class AUTOMATED_REVIEW_UX,AUTOMATED_REVIEWER_REQUEST_DOC,AUTOMATED_REVIEWER_REQUEST_UX triageOpsClass;
    class AUTOMATED_FEEDBACK_COMMAND_REPLY,AUTOMATED_HACKATHON_LABEL triageOpsClass;
    class AUTOMATED_HACKATHON_LABEL_ADDITION,WHAT_AUTHOR_NOTE,WHAT_ANYONE_NOTE triageOpsClass;
    class WORKFLOW_READY_FOR_REVIEW_LABEL,AUTOMATED_REVIEWER_REQUEST_GENERIC triageOpsClass;

Community contribution thank you note

Automated review request

Automated review request for doc contributions

Automated review request for UX contributions

Reactive help command

Reactive ready command

  • Automation conditions:
    • A new MR note that starts with @gitlab-bot ready [@user1 @user2 ...], @gitlab-bot review [@user1 @user2 ...], or @gitlab-bot request_review [@user1 @user2 ...]
    • The note is posted by the MR author or a team member
  • Automation actions:
    • Adds the workflow::ready for review label to the MR
    • Assigns the provided users (any GitLab community member) as reviewers, otherwise picks a random MR coach as reviewer
  • Rate limiting: once per author/MR per hour, or 100 times per team member/MR per hour
  • Processor: https://gitlab.com/gitlab-org/quality/triage-ops/-/blob/master/triage/processor/community/command_mr_request_review.rb
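
To make the command shape concrete, here is a minimal Python sketch of how such a note could be recognized. It is an illustration only, not the logic of the linked processor: `parse_ready_command` and both regular expressions are assumptions, and author checks and rate limiting are left out.

```python
import re

# The three documented aliases, which must open the note.
READY_COMMAND = re.compile(
    r"\A@gitlab-bot (?:ready|review|request_review)\b(?P<args>[^\n]*)"
)
# Simplified username pattern (an assumption, not GitLab's exact rules).
USERNAME = re.compile(r"@([\w.\-]+)")


def parse_ready_command(note):
    """Return the requested reviewers, or None if the note is not a ready command.

    An empty list means no reviewer was named, in which case the automation
    described above would pick a random MR coach.
    """
    match = READY_COMMAND.match(note)
    if match is None:
        return None
    return USERNAME.findall(match.group("args"))
```

Returning `None` for non-commands and an empty list for a bare command keeps the two cases distinguishable for the caller.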

Reactive unassign_review command

Reactive label and unlabel commands

  • Automation conditions:
    • A new note that starts with @gitlab-bot label ~"label-name" or @gitlab-bot unlabel ~"label-name" where label-name matches:
      • group::*, type::*, feature::*, bug::*, maintenance::*, category:*
      • backend, database, documentation, frontend, handbook, UX
      • security (label only for community members)
      • workflow::in dev, workflow::ready for review, workflow::blocked
    • The note is posted by the author, an assignee, or a team member
  • Note: to add or remove multiple labels, list all labels after the command, for example: @gitlab-bot label ~"group::project management" ~"type::bug"
  • Automation actions:
    • Adds or removes the requested label
  • Rate limiting: 60 times per requester/item per hour
  • Processor: https://gitlab.com/gitlab-org/quality/triage-ops/-/blob/master/triage/processor/community/command_mr_label.rb
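
The allow-list above can be sketched as follows. This is a hedged illustration rather than the linked processor: `parse_label_command` and `is_allowed` are hypothetical names, only the quoted ~"label" form is parsed, and the requester is assumed to be a community member, so security can be added but not removed.

```python
import re
from fnmatch import fnmatchcase

# Patterns copied from the list above; "*" matches any suffix.
ALLOWED_PATTERNS = [
    "group::*", "type::*", "feature::*", "bug::*", "maintenance::*", "category:*",
    "backend", "database", "documentation", "frontend", "handbook", "UX",
    "workflow::in dev", "workflow::ready for review", "workflow::blocked",
]
COMMAND = re.compile(r"\A@gitlab-bot (label|unlabel)\b([^\n]*)")
QUOTED_LABEL = re.compile(r'~"([^"]+)"')


def is_allowed(label, action):
    """Check one label against the allow-list for the given action."""
    if label == "security":
        return action == "label"  # assumption: requester is a community member
    return any(fnmatchcase(label, pattern) for pattern in ALLOWED_PATTERNS)


def parse_label_command(note):
    """Return (action, allowed_labels), or None if the note is not a label command."""
    match = COMMAND.match(note)
    if match is None:
        return None
    action, args = match.groups()
    requested = QUOTED_LABEL.findall(args)
    return action, [label for label in requested if is_allowed(label, action)]
```

Labels outside the allow-list are silently dropped in this sketch; a real implementation might instead reply with an explanation.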

Idle/Stale label remover

Code Review Experience Feedback

Reactive feedback command

Leading Organizations labeler

Hackathon labeler

Spam detector

Engineering workflow automation

Ensure priorities for availability issues

For issues labelled ~"availability", the minimal priorities are enforced according to the guidelines at https://about.gitlab.com/handbook/engineering/infrastructure/engineering-productivity/issue-triage/#availability-prioritization

Ensure no deprecated backstage labels are added

Whenever ~"backstage [DEPRECATED]" is added, the bot removes it and posts a hint explaining why it should not be added, along with alternatives.

The ~"customer" label is applied when a customer-associated link is added.

The following URLs are considered customer-associated links:

  • gitlab.zendesk.com
  • gitlab.my.salesforce.com
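
One way to detect such links is to compare the hostname of every URL in the text against the two hosts listed above. The following Python sketch is an assumption about how this could be done, not the bot's actual code; `has_customer_link` and the URL pattern are hypothetical.

```python
import re
from urllib.parse import urlparse

CUSTOMER_HOSTS = {"gitlab.zendesk.com", "gitlab.my.salesforce.com"}
# Simplified URL matcher: stops at whitespace and common closing brackets.
URL = re.compile(r"https?://[^\s)>\]]+")


def has_customer_link(text):
    """True when the text contains a URL whose host is a customer system."""
    return any(urlparse(url).hostname in CUSTOMER_HOSTS for url in URL.findall(text))
```

Matching on the parsed hostname, rather than searching for the domain as a substring, avoids false positives when the domain only appears in a path or query string.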

Add type label from subtype

Whenever a subtype label is added, the corresponding type label is added. Current type labels with subtype labels are:

  • ~"type::feature"
  • ~"type::tooling"

Reactive retry_job command

Reactive retry_pipeline command

Database Review Experience Feedback

Scheduled workflow automation

Scheduled triage automation runs to label and update issues, which helps with reporting and milestone transitions. This is handled by triage-ops.

Remove Seeking community contributions from issues with an assignee

When an issue is assigned, it shouldn’t accept any new contribution to prevent duplicated work.

Remove Seeking community contributions from issues with an invalid workflow label

When an issue has the Seeking community contributions label set, but also an incompatible workflow label, the issue isn’t actually ready to accept a contribution.

Remove Seeking community contributions from all merge requests

It doesn’t make sense to have Seeking community contributions set on merge requests.

Label community contributions

Merge requests which have an author that is not a member of gitlab-org will have the Community contribution label applied. This scheduled automation is a backup for the reactive automation that applies Community contribution in the welcome message.

Add milestone to community merge requests

Merged merge requests with the Community contribution label and no milestone will automatically get the relevant milestone set. This helps keep the community contributions numbers accurate.

Label idle community merge requests

Label stale community merge requests

Nudge EMs on community merge requests that are stale

Nudge relevant coach on community merge requests that are waiting for a review

Nudge assigned reviewers on community merge requests that are waiting for a review

Engineering workflow automation

Milestone reschedule

Open issues and merge requests that have missed the current release will be rescheduled to the next active milestone. This identifies pending work that was not completed within the planned milestone.

Note: Confidential issues will be skipped as part of the missed label application. Please see this issue for more information.

Missed deliverable

Open issues and merge requests that are planned as ~Deliverable and have a ~missed:x.y label will have the ~missed-deliverable label applied.

Note: Confidential issues will be skipped as part of the missed label application. Please see this issue for more information.

Deliverable with no milestone

Issues which have a label of ~Deliverable without a milestone will have the milestone set to %Backlog.

Missed SLO

Issues which have a severity label and missed the SLO target will be labeled with ~missed-SLO. The calculation for elapsed time starts from the date the severity label was applied. This enables reporting on SLO target adherence.
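
The elapsed-time check can be sketched as below. The SLO targets are not stated here, so the sketch takes them as a parameter; `missed_slo` is a hypothetical name and any day counts passed in are placeholders, not the real targets.

```python
from datetime import date, timedelta


def missed_slo(severity, severity_applied_on, today, slo_days):
    """True when more days have elapsed since the severity label was applied
    than the target for that severity allows.

    slo_days maps a severity label to its target in days (caller-supplied).
    """
    target = slo_days.get(severity)
    if target is None:
        return False  # no target defined for this severity
    return today - severity_applied_on > timedelta(days=target)
```

Note that the clock starts at the date the severity label was applied, not at the issue creation date.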

Bug priority label inference

Bugs which have a severity 1 or severity 2 label without a priority label will be labeled with the equivalent priority label. For example, a ~severity::1 ~"type::bug" without a priority label will have ~priority::1 applied.
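
A minimal Python sketch of this rule, assuming labels are available as plain strings (`infer_priority` is a hypothetical helper, not part of triage-ops):

```python
def infer_priority(labels):
    """Return the priority label to add, or None when the rule does not apply."""
    if "type::bug" not in labels:
        return None
    if any(label.startswith("priority::") for label in labels):
        return None  # never override an existing priority
    for level in ("1", "2"):
        if f"severity::{level}" in labels:
            return f"priority::{level}"
    return None  # severity 3 and 4 are not covered by this rule
```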

Master broken categorization

Issues or merge requests that have a label of ~"master:broken" will have labels of ~"priority::1" and ~"severity::1" applied. This ensures that requests which break master are sufficiently categorized for reporting.

Identify interesting feature proposals

This automation identifies potential and popular proposals using upvotes. This helps identify feature proposals that people have indicated they would like.

Auto-close inactive bugs

GitLab values the time spent by contributors on reporting bugs. However, if a bug remains inactive for a very long period, it will qualify for auto-closure. The following is the policy for identification and auto-closure of inactive bugs.

  • If a ~"severity::3" or ~"severity::4" ~"type::bug" issue is inactive for at least 12 months, it will be identified as eligible for auto-closure. At this point, the following actions occur:
    • Application of ~"vintage" to indicate the issue has been inactive for a year.
    • Application of ~"stale" to indicate that it is currently being identified for auto-closure.
    • Comment by GitLab Bot to the author to check whether the reported bug still persists and to comment accordingly within the next 7 days.
  • After 7 days, one of the following actions happens:
    • Issues which have not received a comment will be closed and the ~"auto-closed" label is applied.
    • Issues with a comment from anyone other than the gitlab-bot in the last 7 days are considered active and ~"stale" is removed.
  • Policy: https://gitlab.com/gitlab-org/quality/triage-ops/-/blob/master/policies/stages/hygiene/close-stale-bugs.yml
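
The two-step policy above can be read as a small state machine. The Python sketch below is a simplified illustration, not the linked policy file: `next_action` is a hypothetical function, 12 months is approximated as 365 days, and the timestamps are assumed to be supplied by the caller.

```python
from datetime import datetime, timedelta

INACTIVE_AFTER = timedelta(days=365)  # approximation of "12 months"
GRACE_PERIOD = timedelta(days=7)


def next_action(labels, last_activity, now, stale_since=None, last_human_comment=None):
    """Decide what the policy would do with one issue at time `now`."""
    if "stale" in labels and stale_since is not None:
        # Step 2: the issue was already flagged; a human comment rescues it.
        if last_human_comment is not None and last_human_comment > stale_since:
            return "remove stale"
        if now - stale_since >= GRACE_PERIOD:
            return "close and add auto-closed"
        return "wait"
    # Step 1: flag low-severity bugs that have been inactive for a year.
    low_severity = "severity::3" in labels or "severity::4" in labels
    if "type::bug" in labels and low_severity and now - last_activity >= INACTIVE_AFTER:
        return "add vintage and stale, ask author"
    return "nothing"
```

Comments by the bot itself are assumed to be excluded from `last_human_comment`, matching the rule that only comments from someone other than gitlab-bot keep an issue open.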

Prompt for Tier labels on issues

Tier labels should be applied to issues to specify the license tier of the feature. This policy prompts the Product Manager for the applied group label to add the license tier label to issues that are scheduled for the current milestone and labelled with ~direction.

The possible tier labels to be applied are:

  • ~“GitLab Free”
  • ~“GitLab Premium”
  • ~“GitLab Ultimate”

Prompt for Type labels on issues

Type labels are applied to issues to increase their visibility and discoverability during team issue refinement. This policy applies to issues created by gitlab-org team members and prompts the author to apply a type label to the issue within the first week.

Type labels ensure that issues are present in the group triage report and added to the correct section.

Data

Bug SLO Warning

Bugs have a severity label that indicates the SLO for a fix. This automated policy aims to prompt managers about bugs in their group that are approaching the SLO threshold.

Reminder on ~infradev issues to set severity label, priority label, and milestone

Issues with the ~infradev label should have a severity label, a priority label, and a milestone set. This automated policy aims to prompt managers about such issues missing one of these attributes.

Note:

  1. The ~"automation:infradev-missing-labels" is automatically removed when a severity label, a priority label, and a milestone are set on the issue.
  2. The ~"automation:infradev-missing-labels" is automatically removed after two weeks, leading to a new message being posted if the Automation Conditions above are still met. This effectively ensures that a reminder is posted on the issue every two weeks.
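
The reminder cycle described in the notes can be sketched as follows. This is an interpretation, not the actual policy: `missing_attributes` and `should_remind` are hypothetical names, and it assumes that the presence of the automation label is what suppresses a repeat reminder until the label is removed.

```python
REMINDER_LABEL = "automation:infradev-missing-labels"


def missing_attributes(labels, milestone):
    """List which of severity, priority, and milestone are not set."""
    missing = []
    if not any(label.startswith("severity::") for label in labels):
        missing.append("severity label")
    if not any(label.startswith("priority::") for label in labels):
        missing.append("priority label")
    if milestone is None:
        missing.append("milestone")
    return missing


def should_remind(labels, milestone):
    """Remind only infradev issues that miss something and carry no reminder label."""
    return (
        "infradev" in labels
        and REMINDER_LABEL not in labels
        and bool(missing_attributes(labels, milestone))
    )
```

Because the label is removed after two weeks, an issue that is still incomplete becomes eligible for `should_remind` again, which gives the fortnightly cadence.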

Resources


Issue Triage Onboarding
Onboarding guidelines for Issue Triaging team
Last modified March 19, 2024: Rename UX Debt label to Deferred UX (9fb8f52d)