EPSS Support

This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.

Status	Authors	Coach	DRIs	Owning Stage	Created
implemented	`YashaRise`	`theoretick`	`johncrowley` `tkopel` `nilieskou`	devops secure	2024-06-19

For important terms, see glossary.

Summary

EPSS scores specify the likelihood a CVE will be exploited in the next 30 days. This data may be used to improve and simplify prioritization efforts when remediating vulnerabilities in a project. EPSS support requirements are outlined in the EPSS epic along with an overview of EPSS. This document focuses on the technical implementation of EPSS support.

EPSS scores may be populated from the EPSS Data page or through their provided API. Ultimately, EPSS scores should be reachable through the GitLab GraphQL API, as seen on the vulnerability report and details pages, and be filterable and usable when setting policies.

Package metadata database (PMDB, also known as license-db), an existing advisory pull-and-enrichment method, is used for this purpose. The flow is as follows:

flowchart LR
    A[EPSS Source] -->|Pull| B[PMDB]
    B -->|Process and export| C[Bucket]
    C -->|Pull| D[GitLab Instance]

Motivation

The classic approach to vulnerability prioritization is using severity based on CVSS. This approach provides some guidance, but is too unrefined: more than half of all published CVEs have a high or critical score. Other metrics need to be employed to reduce remediation fatigue and help developers prioritize their work better. EPSS provides a metric to identify which vulnerabilities are most likely to be exploited in the near future. Combined with existing prioritization methods, EPSS helps to focus remediation efforts better and reduce remediation workload. By adding EPSS to the information presented to users, we deliver these benefits to the GitLab platform.

Goals

Enable users to use EPSS scores on GitLab as another metric for their vulnerability prioritization efforts.
Provide scalable means of efficiently repopulating recurring EPSS scores to minimize system load.

Phase 1 (MVC)

Enable access to EPSS scores through GraphQL API.

Phase 2

Show EPSS scores in vulnerability report and details pages.

Phase 3

Allow filtering vulnerabilities based on EPSS scores.
Allow creating policies based on EPSS scores.

Non-Goals

Dictate priority to users based on EPSS (or any other metric).

Proposal

Support EPSS on the GitLab platform.

Following the discussions in the EPSS epic, the proposed flow is:

PMDB database is extended with a new table to store EPSS scores.
PMDB infrastructure runs the feeder daily in order to pull and process EPSS data.
The advisory-processor receives the EPSS data and stores it in the PMDB database.
PMDB exports EPSS data to a new PMDB EPSS bucket.
- Create a new bucket to store EPSS data.
- Delete former EPSS data once new data is uploaded, as the old data is no longer needed.
- Truncate EPSS scores to two digits after the dot.
GitLab instances pull data from the PMDB EPSS bucket.
- Create a new table in the Rails database to store EPSS data.
GitLab instances expose EPSS data through GraphQL API and present data in vulnerability report and details pages.

flowchart LR
    AF[Feeder] -->|pulls| A[EPSS Source]
    AF -->|publishes| AP[Advisory Processor]
    AP -->|stores| DD[PMDB database]
    E[Exporter] -->|loads|DD
    E --> |exports| B[Public Bucket]
    GitLab[GitLab instance] --> |syncs| B
    GitLab --> |stores| GitLabDB

Design and implementation details

Decisions

Important notes

All EPSS scores get updated on a daily basis. This is pivotal to this feature’s design.
The fields retrieved from the EPSS source are cve, score, and percentile. 9 digits after the dot are maintained.
- To reduce the number of upserts, we will truncate EPSS scores to two digits after the dot. This decision is based on a spike that checked the magnitude of score changes.
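The truncation described above can be sketched as follows. This is a minimal illustration, not the PMDB implementation: the function name `truncate_score` and the sample values are hypothetical, and Python's `decimal` module is used only to avoid binary floating-point surprises.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_score(raw: str) -> Decimal:
    """Truncate (not round) a score to two digits after the dot."""
    return Decimal(raw).quantize(Decimal("0.01"), rounding=ROUND_DOWN)


# Two made-up daily values for the same CVE that differ only in the
# low-order digits collapse to the same truncated value, so the second
# day would not require an upsert.
day_1 = truncate_score("0.042370000")
day_2 = truncate_score("0.044910000")
needs_upsert = day_1 != day_2
```

Because scores are truncated rather than rounded, a stored value changes only when the score crosses a 0.01 boundary.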

PMDB

Create a new EPSS table in PMDB with an advisory identifier and the EPSS score. This includes changing the schema and any necessary migrations.
Ingest EPSS data into the new PMDB table. We want to keep the EPSS data structure as close as possible to the origin, so that all of the data is available to the exporter and the exporter can choose how to process it. Therefore we will save scores and percentiles with their complete values.
Export EPSS scores to a separate bucket.
- Delete the previous day’s export as it is no longer needed after the new one is added.
Add new pubsub topics to deployment to be used by PMDB components, using existing terraform modules.
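The export steps above (full-precision values in PMDB, truncated scores in the bucket, previous export deleted once the new one is in place) could look roughly like the sketch below. Everything here is an assumption made for illustration: the bucket is modelled as a plain dictionary, the object names and the newline-delimited JSON layout are hypothetical, and `build_export` and `publish` are not real PMDB functions.

```python
import json
from decimal import Decimal, ROUND_DOWN


def build_export(rows):
    """Serialize full-precision rows into one payload with truncated scores."""
    lines = []
    for row in rows:
        score = Decimal(row["score"]).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
        lines.append(json.dumps({"cve": row["cve"], "score": str(score)}))
    return "\n".join(lines)


def publish(bucket, object_name, payload):
    """Upload the new export first, then delete every older object."""
    bucket[object_name] = payload
    for name in list(bucket):
        if name != object_name:
            del bucket[name]


# Hypothetical bucket state and a made-up PMDB row.
bucket = {"2024-06-18.ndjson": "previous export"}
pmdb_rows = [{"cve": "CVE-0000-0001", "score": "0.042370000", "percentile": "0.500000000"}]
publish(bucket, "2024-06-19.ndjson", build_export(pmdb_rows))
```

Uploading before deleting means consumers always find at least one complete export in the bucket.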

GitLab Rails backend

Create a table in the Rails backend to hold EPSS scores.
Configure Rails sync to ingest EPSS exports and save them to the new table.
Include EPSS data attributes in GraphQL API Occurrence objects.
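Since all EPSS scores are refreshed daily, the sync step benefits from writing only rows whose value actually changed. The sketch below shows that idea in isolation. It is not the Rails sync code: `rows_to_upsert` is a hypothetical helper, and the identifiers and scores are made up.

```python
from decimal import Decimal


def rows_to_upsert(stored, incoming):
    """Return incoming scores that are new or differ from the stored value."""
    return {
        cve: score
        for cve, score in incoming.items()
        if stored.get(cve) != score
    }


stored = {"CVE-0000-0001": Decimal("0.04"), "CVE-0000-0002": Decimal("0.91")}
incoming = {
    "CVE-0000-0001": Decimal("0.04"),  # unchanged, skipped
    "CVE-0000-0002": Decimal("0.93"),  # changed, upserted
    "CVE-0000-0003": Decimal("0.10"),  # new, upserted
}
changed = rows_to_upsert(stored, incoming)
```

With scores truncated to two digits, most rows fall into the "unchanged" case on a typical day, which is the motivation for truncating in the first place.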

GitLab UI

Add EPSS data to vulnerability report page.
Add EPSS data to vulnerability details page.
Allow filtering by EPSS score.
Allow creating policies based on EPSS score.

Alternative Solutions

Glossary

PMDB (Package metadata database, also known as License DB): PMDB is a standalone service (and not solely a database), outside of the Rails application, that gathers, stores, and exports package metadata for GitLab instances to consume. See complete documentation. PMDB components include:
- Feeder: a scheduled job called by the PMDB deployment to publish data from the relevant sources to pub/sub messages consumed by PMDB processors.
- Advisory processor: runs as a Cloud Run instance, consumes messages published by the advisory feeder that contain advisory-related data, and stores the data in the PMDB database.
- PMDB database: a PostgreSQL instance storing license and advisory data.
- Exporter: exports license/advisory data from the PMDB database to public GCP buckets.
GitLab database: the database used by GitLab instances.
CVE (Common Vulnerabilities and Exposures): a list of publicly known information-security vulnerabilities. “A CVE” usually refers to a specific vulnerability and its CVE ID.
EPSS (Exploit Prediction Scoring System) score: a score ranging from 0 to 1 representing the probability that a given vulnerability is exploited in the wild in the next 30 days.
EPSS score percentile: for a given EPSS score (of some vulnerability), the proportion of all scored vulnerabilities with the same or a lower EPSS score.
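As a worked example of the percentile definition above, the sketch below computes the proportion directly from a small, made-up list of scores. The `percentile` function is illustrative only. In practice the percentile is retrieved from the EPSS source alongside the score, not recomputed.

```python
def percentile(score, all_scores):
    """Proportion of scored vulnerabilities whose score is <= the given score."""
    return sum(1 for s in all_scores if s <= score) / len(all_scores)


# Five made-up scores: three of them are at or below 0.02.
scores = [0.01, 0.02, 0.02, 0.50, 0.97]
p = percentile(0.02, scores)  # 3 of 5 scores qualify
```

A vulnerability can have a low absolute score and still sit at a high percentile when most scored vulnerabilities have even lower scores.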
