Software Supply Chain Security, Threat Insights
Customer outcomes we are driving for at GitLab
As a developer, it is imperative to know if you are introducing vulnerabilities as you merge into protected branches in addition to the default branch. In FY25, we will allow users to track vulnerabilities across multiple branches. If a developer wants to remediate something but isn’t sure where to get started, they can use our AI features to learn more and get a suggestion for a fix.
As a security engineer, you want to know what vulnerabilities to work on first. Over the next year we will be adding key risk metrics so you can quickly triage and mitigate vulnerabilities that have the potential to be exploited.
Leadership wants to make sure their organization is mitigating risks and their security programs are effective. With enhancements to the Security Dashboards, leaders will have a place to get an overview and answer key questions about metrics, trends and vulnerabilities that need to be addressed quickly.
Top Priorities for FY25
Enable users to identify risk and visualize trends - We will be making enhancements to Security Dashboards at the project and group level.
Estimate potential impact and likelihood of vulnerability exploitation - Give users the ability to assess risk directly in the vulnerability report through industry-standard risk scores like CVSS (Common Vulnerability Scoring System) and exploitability probability.
Enable users to track vulnerabilities across multiple branches - Allow users to track vulnerabilities outside the default branch.
Offer guidance for users to get started with vulnerability remediation - Leverage the power of AI and security training to help developers understand and remediate vulnerabilities.
Threat Insights features are reliable and perform at scale - As we add more group and organization level features, we will optimize query performance so we can move forward with confidence that our database will scale and perform as we grow.
Threat Insights Team Structure
The Threat Insights group is structured into three focused swimlanes that each approach work in vertical slices: Performance and Optimization, Projects, and AI. This subdivision provides bounded focus to each area, enabling us to progress on multiple fronts and reduce planning overhead.
Stable Counterparts
The following members of other functional teams are our stable counterparts, and work across all swimlanes:
Name | Role |
---|---|
Becka Lippert | Senior Product Designer, Govern: Threat Insights |
Ottilia Westerlund | Security Engineer, Fulfillment (Fulfillment Platform, Subscription Management), Security Risk Management (Security Policies, Threat Insights), Monitor (Observability), Plan (Product Planning), AI-powered (Duo Chat, Duo Workflow, AI Framework, AI model validation, Custom models) |
Performance and Optimization
DRI: Neil McCorrison
Projects
DRI: Ryan Wells
AI
DRI: Neil McCorrison
Reporting Structure
Threat Insights was previously sub-divided into Navy and Tangerine, following the reporting lines below. Navy engineers report to Neil McCorrison and Tangerine engineers report to Ryan Wells.
Name | Role |
---|---|
Neil McCorrison | Engineering Manager, Security Risk Management:Security Insights |
Name | Role |
---|---|
Ryan Wells | Engineering Manager, Security Risk Management:Security Infrastructure |
Common Links
- Slack channels:
  - Main channel: #g_govern_threat_insights
  - Stand-up updates: #g_govern_threat-insights_standup
  - Engineering - All: #s_srm_security_eng
  - Engineering - Team AI: #g_govern_threat_insights_eng_ai
  - Engineering - Team Navy: #g_govern_threat_insights_performance
  - Engineering - Team Tangerine: #g_govern_threat_insights_projects
- Slack aliases: @govern_threat_insights_be, @govern_threat_insights_fe
- Google groups: eng-dev-secure-threat-insights-members@gitlab.com
- Threat Insights calendar (internal link)
Prioritization
We use our Threat Insights Priorities page for 17.x to track what we are doing, and what order to do it in.
Metrics
Workflow
The Threat Insights group largely follows GitLab’s Product Development Flow.
Additional information can be found on the Planning page.
Milestone Planning
- On the second Tuesday of the month the Product Manager kicks off the planning issue. They identify priorities for the milestone and tag engineering managers and stable counterparts (UX, QA) to review.
- By the third Tuesday of the month the Engineering Managers have reviewed the planning issue and agreed on the scope for the milestone.
  - All epics scheduled for this milestone should have the ~auto-report label and one of these labels:
    - ~Threat Insights::Performance
    - ~Threat Insights::Projects
  - All issues scheduled for the milestone should have the ~Deliverable label as well as Health Status: On Track at the beginning of the milestone. The milestone field should also be set correctly.
- The planning issue is created in this epic for 17.0-17.11.
Tracking Deliverables
- Issues that are marked as Deliverables for a milestone serve as the single source of truth for what we aimed to deliver for a given milestone. Throughout the milestone, things may change, become blocked, etc. Ideally, we’d like to keep the Planning Issue unchanged after the milestone starts.
- Something is considered delivered if any of the following is true: (a) it is merged into production in time for the release date, (b) it is completed before the next milestone starts, or (c) the feature flag enabling the feature is turned on. It is important to keep track of the milestone of the deliverable; we encourage self-managed customers to turn on feature flags so they can try different features. Ensuring the milestone is correct allows someone to tell if that change is available in a specific release.
Weekly async issue updates
At the end of every week, each engineer is expected to provide a quick async issue update by commenting on their assigned issues using the following template:
### Async issue update
* Current status:
<!--- Please provide a quick summary of the current status (one sentence) -->
* Shipping this milestone: <!-- Not confident | Slightly confident | Very confident -->
* Scope reduction opportunities: <!-- No | Yes, ... -->
/health_status <!-- on_track | needs_attention | at_risk -->
/label <!-- ~"workflow::blocked" | ~"workflow::refinement" | ~"workflow::ready for development" | ~"workflow::In dev" | ~"workflow::In review" | ~"workflow::verification" -->
<!-- Please apply a :triangular_flag_on_post: emoji to this comment. For more information see https://gitlab.com/jayswain/automated-reporting -->
We do this to encourage our team to be more async in collaboration and to allow the community and other team members to know the progress of issues that we are actively working on. This also enables us to automatically collate updates across swimlanes, removing some manual process.
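As a rough illustration, the template above could be filled in from the command line before pasting it into an issue comment. The helper name `async_update` and its argument order are hypothetical and not part of any GitLab tooling; only the template text and the quick-action values come from this page.

```shell
# Hypothetical helper: prints a filled-in async issue update.
# Arguments (all chosen for illustration only):
#   $1 current status (one sentence)
#   $2 confidence: "Not confident" | "Slightly confident" | "Very confident"
#   $3 scope reduction opportunities: "No" or "Yes, ..."
#   $4 health status: on_track | needs_attention | at_risk
#   $5 optional workflow label, e.g. "workflow::In review"
async_update() {
  # Reject health statuses the template does not list.
  case "$4" in
    on_track|needs_attention|at_risk) ;;
    *) echo "unknown health status: $4" >&2; return 1 ;;
  esac
  printf '### Async issue update\n\n'
  printf '* Current status: %s\n' "$1"
  printf '* Shipping this milestone: %s\n' "$2"
  printf '* Scope reduction opportunities: %s\n\n' "$3"
  printf '/health_status %s\n' "$4"
  # Only emit the /label quick action when a label was supplied.
  if [ -n "${5:-}" ]; then
    printf '/label ~"%s"\n' "$5"
  fi
}
```

For example, `async_update "Backend MR is in review" "Very confident" "No" on_track "workflow::In review"` prints a comment body ready to paste. The :triangular_flag_on_post: emoji still has to be applied to the comment by hand.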
Support rotation
On top of our development roadmap, engineering teams need to perform tasks related to support and triage. Our team nominates one person to reserve capacity for these tasks. The rota is here (internal link). This is to avoid excessive context-switching and better distribute the workload. It is important we defend our focus within the team to support the delivery of our commitments.
If you are not the nominated person in a given week then:
- You are not expected to triage and investigate by default. Use your best judgement here (e.g. critical issues still take priority, no change in expectations here).
- You should redirect the question to the nominated person (e.g. if it comes in a DM in Slack, redirect it to our public channel).
Please keep track of the actions you take during your rotation and add notes in the corresponding issue (e.g. copying tool commands executed locally, sharing relevant changes to projects and processes, etc.).
Triage expectations
Triage does not immediately guarantee a change to currently-planned work in a milestone. Triage is the process of determining impact and priority so we can justify changes to scope and milestone commitments.
- Refine the request for help tickets: do we have reproduction steps, does this relate to other scoped or planned work, is this a bug or feature request or an acceptable limitation of the system.
- Outcomes could be: updates to our documentation or Handbook pages, validated reproduction of bugs and then creating issues from this.
- Directly answering support questions.
- Engaging with Product to agree on priority and scheduling of any work required. Work with Product to define severity and whether to interrupt the rest of the development team.
When dealing with Slack interactions you are expected to use the following reactions:
- 👀 - I am actively looking at this
- ✅ (or a variant) - This is resolved
Responsibilities - Support
- Monitor Slack channels for questions, support requests, and alerts. The person assigned to the reaction rotation is expected to handle them primarily. If a support engineer requests assistance via Slack and it requires investigation or debugging, they should be directed to raise an issue in a dedicated project.
  - #g_govern_threat_insights
  - #s_software-supply-chain-security
  - #sec-section
  - #s_secure-alerts
  - #sec-eng-requests-for-help
- Monitor Section Sec Request For Help project for support requests.
Our preference is to utilise the Section Sec Request For Help as much as possible. This helps with visibility, tracking and review.
These items must be triaged continuously which means they must be checked multiple times a week.
MR Reviews
We follow these guidelines when submitting MRs for review when the change is within the Threat Insights domain:
- Aim to request at least one of the reviews from someone outside our group. This helps avoid a code knowledge silo.
- For time-critical reviews, consider using internal reviewers and maintainers.
- The initial review should be performed by a member of the team. This helps the team by:
- Faster reviews, as the reviewer is already familiar with the domain.
- Additional awareness of changes taking place within the domain.
- Identifying changes that don’t align with what is happening with the domain.
- Providing additional confidence from a domain expert to the external maintainer reviewer that the change behaves as expected.
- GraphQL merge requests should be reviewed by a frontend engineer as soon as possible. This helps to validate the interface, and allows changes to be made before tests are written.
Issue Boards
- Threat Insights Delivery Board
  - Primary board from which engineers pull work. It’s stripped down to only include the workflow labels we use when delivering software.
- Threat Insights Planning Board
  - Milestone-centric board primarily used by product management to gauge work in current and upcoming milestones.
- Threat Insights “Ready to Pull” Board
  - Secondary board for unassigned issues that are separate from a larger effort. Ideal candidates are small features, bugs, and follow-up items.
These boards show current status of issues.
Indicating Status and Raising Risk
Our teams use the Health Status feature within issues to indicate the likelihood of completion within the milestone. We assign On Track at the beginning of a milestone to a small number of issues where we have high confidence in delivery during that milestone. If there is concern with marking something as initially on track, then we should discuss why.
Raising risk early is important. The more time we have, the more options we have. For example, issues that have not gone into review by the 10th of the month may not have enough time to get merged. These should be considered Needs Attention or At Risk depending on their complexity and other factors.
Follow these steps when raising or downgrading risk:
- Update the Health Status in the issue:
  - On Track - high confidence - there is no indication the work won’t get merged by the 15th.
  - Needs Attention - medium confidence - the issue is blocked or has other factors that need to be discussed.
  - At Risk - low confidence - the issue is in jeopardy of missing the merge cutoff of the 15th.
- Add a comment about why the risk has increased or decreased. Copy the Engineering Manager and Project Manager for awareness.
Note that an issue probably shouldn’t go directly from On Track to At Risk. That pattern indicates we have missed an opportunity to discuss earlier. Consider the progression: On Track -> Needs Attention -> At Risk.
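The progression rule can be expressed as a small check. This is only a sketch: the function name `check_health_transition` is hypothetical, and the status names reuse the `/health_status` quick-action values from the async update template. It flags the one pattern called out above, a direct jump from On Track to At Risk, and treats everything else as acceptable.

```shell
# Hypothetical check: flags a health-status change that skips Needs Attention.
#   $1 previous status, $2 new status (on_track | needs_attention | at_risk)
check_health_transition() {
  if [ "$1" = "on_track" ] && [ "$2" = "at_risk" ]; then
    echo "warning: On Track -> At Risk skips Needs Attention; discuss what was missed"
    return 1
  fi
  echo "ok: $1 -> $2"
}
```

A non-zero return is a prompt for a conversation, not a hard failure: the guidance says such a jump "probably shouldn’t" happen, so real cases may still be legitimate.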
Quality
Running E2E specs in the MR pipeline
We encourage running the e2e:test-on-omnibus downstream E2E job in merge requests at least once, and reviewing the results, when there are changes in:
- GraphQL (API response, query parameters, schema etc)
- Gemfile (version changes, adding/removing gems)
- Database schema/query changes
- Any frontend changes which directly impact the vulnerability report page, MR security widget, pipeline security tab, security policies, configuration, or license compliance page
Running Govern E2E specs locally against GDK
Standalone E2E specs can be run against your local GDK instance.
E2E tests with feature flags
E2E tests should pass with a feature flag enabled before it is enabled on Staging or on GitLab.com.
Therefore, it’s important to confirm this when introducing a new feature flag. Adding or editing a feature flag definition file starts two e2e:test-on-omnibus jobs (one with the feature flag turned on and another where it’s turned off).
Monitoring
- Stage Group dashboard on Grafana
- Largest Contentful Paint (LCP) for our web pages.
Contributing
Local testing of licensed features
When a feature needs to check the current license tier, it’s important to make sure this also works on GitLab.com.
To emulate this locally, follow these steps:
- Export an environment variable: export GITLAB_SIMULATE_SAAS=1 (see footnote 1)
- Within the same shell session run gdk restart
- In Admin > Settings > General > “Account and limit”, enable “Allow use of licensed EE features”
See the related handbook entry for more details.
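Put together, the first two steps might look like this in a single shell session. The `command -v` guard is an addition for illustration, so the snippet does nothing on a machine without GDK installed; the Admin setting still has to be enabled by hand afterwards.

```shell
# Simulate GitLab.com (SaaS) behaviour for the local GDK instance.
export GITLAB_SIMULATE_SAAS=1

# Restart GDK from the same shell session so its services pick up the variable.
# The guard is an assumption added here; it skips the restart if gdk is not on PATH.
if command -v gdk >/dev/null 2>&1; then
  gdk restart
fi
```

Running both commands in the same session matters because the exported variable is only visible to processes started from that shell.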
Cross-stack collaboration
We encourage frontend engineers to contribute to the backend and vice versa. In such cases we should work closely with a domain expert from within our group and also keep the initial review internal.
This will help ensure that the changes follow best practice, are well tested, and have no unintended side effects, and it keeps the team aware of any changes that go into the Threat Insights codebase.
Community Contributions
The Threat Insights group welcomes community contributions. Any community contribution should get prompt feedback from one of the Threat Insights engineers. All engineers on the team are responsible for working with community contributions. If a team member does not have time to review a community contribution, please tag the Engineering Manager, so that they can assign the community contribution to another team member.
If a team member creates an issue or finds an issue where we would be open to a community contribution, it should be labeled with ~“Seeking community contributions”. If the contributor needs an EE license, we can point towards the Contributing to the GitLab Enterprise Edition (EE) section on the Community contributors workflows page.
Group discussion
We hold weekly group discussions, alternating between APAC/AMER and EMEA/AMER time zones. Everyone is invited to attend, and it’s a great forum to ask questions about Vulnerability Management, customer queries, our roadmap, and what the Threat Insights team might be thinking about. You can find the meetings on the Threat Insights calendar; take a look at the agenda (internal link). We hope to see you there!
Footnotes
1. There are many ways to pass an environment variable to your local GitLab instance. For example, you can create an env.runit file in the root of your GDK with the above snippet.