EPSS Support
Status | Authors | Coach | DRIs | Owning Stage | Created |
---|---|---|---|---|---|
implemented | YashaRise | theoretick | johncrowley, tkopel, nilieskou | devops secure | 2024-06-19 |
For important terms, see glossary.
Summary
EPSS scores specify the likelihood a CVE will be exploited in the next 30 days. This data may be used to improve and simplify prioritization efforts when remediating vulnerabilities in a project. EPSS support requirements are outlined in the EPSS epic along with an overview of EPSS. This document focuses on the technical implementation of EPSS support.
EPSS scores may be populated from the EPSS Data page or through their provided API. Ultimately, EPSS scores should be reachable through the GitLab GraphQL API, as seen on the vulnerability report and details pages, and be filterable and usable when setting policies.
The package metadata database (PMDB, also known as license-db), an existing advisory pull-and-enrichment mechanism, is used for this purpose. The flow is as follows:
flowchart LR
    A[EPSS Source] -->|Pull| B[PMDB]
    B -->|Process and export| C[Bucket]
    C -->|Pull| D[GitLab Instance]
Motivation
The classic approach to vulnerability prioritization is using severity based on CVSS. This approach provides some guidance, but is too unrefined—more than half of all published CVEs have a high or critical score. Other metrics need to be employed to reduce remediation fatigue and help developers prioritize their work better. EPSS provides a metric to identify which vulnerabilities are most likely to be exploited in the near future. Combined with existing prioritization methods, EPSS helps to focus remediation efforts better and reduce remediation workload. By adding EPSS to the information presented to users, we deliver these benefits to the GitLab platform.
Goals
- Enable users to use EPSS scores on GitLab as another metric for their vulnerability prioritization efforts.
- Provide scalable means of efficiently repopulating recurring EPSS scores to minimize system load.
Phase 1 (MVC)
- Enable access to EPSS scores through GraphQL API.
Phase 2
- Show EPSS scores in vulnerability report and details pages.
Phase 3
- Allow filtering vulnerabilities based on EPSS scores.
- Allow creating policies based on EPSS scores.
Non-Goals
- Dictate priority to users based on EPSS (or any other metric).
Proposal
Support EPSS on the GitLab platform.
Following the discussions in the EPSS epic, the proposed flow is:
- The PMDB database is extended with a new table to store EPSS scores.
- PMDB infrastructure runs the feeder daily to pull and process EPSS data.
- The advisory-processor receives the EPSS data and stores it in the PMDB database.
- PMDB exports EPSS data to a new PMDB EPSS bucket.
  - Create a new bucket to store EPSS data.
  - Delete former EPSS data once new data is uploaded, as the old data is no longer needed.
  - Truncate EPSS scores to two digits after the dot.
- GitLab instances pull data from the PMDB EPSS bucket.
  - Create a new table in the Rails database to store EPSS data.
- GitLab instances expose EPSS data through the GraphQL API and present it in the vulnerability report and details pages.
flowchart LR
    AF[Feeder] -->|pulls| A[EPSS Source]
    AF -->|publishes| AP[Advisory Processor]
    AP -->|stores| DD[PMDB database]
    E[Exporter] -->|loads| DD
    E -->|exports| B[Public Bucket]
    GitLab[GitLab instance] -->|syncs| B
    GitLab -->|stores| GitLabDB
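The export step in the proposed flow uploads the new EPSS data first and only then deletes the former data, so the bucket is never left empty. The following is a minimal sketch of that ordering, not the exporter's actual code: a plain dictionary stands in for the bucket, and the function name `publish_export` and the object names are hypothetical.

```python
def publish_export(bucket: dict[str, bytes], name: str, payload: bytes) -> list[str]:
    """Store the new export, then remove every older object.

    `bucket` is an in-memory stand-in for the EPSS bucket (assumption);
    a real exporter would call its object-storage client instead.
    Returns the names of the objects that were deleted.
    """
    # Upload first: if this step fails, the former export is still available.
    bucket[name] = payload

    # Only after a successful upload is the former data removed.
    stale = [key for key in bucket if key != name]
    for key in stale:
        del bucket[key]
    return stale
```

With this ordering, a consumer that syncs during an export sees either the previous day's data or the new data.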
Design and implementation details
Decisions
Important notes
- All EPSS scores get updated on a daily basis. This is pivotal to this feature’s design.
- The fields retrieved from the EPSS source are `cve`, `score`, and `percentile`. Nine digits after the dot are maintained.
- To reduce the number of upserts, and based on a spike that checked the magnitude of day-to-day change, we will truncate EPSS scores to two digits after the dot.
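As an illustration of why truncation reduces upserts, the sketch below truncates each incoming score to two digits after the dot and keeps only the CVEs whose truncated score differs from the stored one. It is a minimal example under stated assumptions: the function names `truncate_score` and `rows_to_upsert` are hypothetical, and scores are assumed to arrive as decimal strings.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_score(raw: str, places: int = 2) -> Decimal:
    """Truncate (not round) a decimal string to `places` digits after the dot."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    return Decimal(raw).quantize(quantum, rounding=ROUND_DOWN)


def rows_to_upsert(stored: dict[str, Decimal], incoming: dict[str, str]) -> dict[str, Decimal]:
    """Return only the CVEs whose truncated score is new or has changed."""
    changed = {}
    for cve, raw in incoming.items():
        score = truncate_score(raw)
        if stored.get(cve) != score:
            changed[cve] = score
    return changed
```

For example, a stored score of `0.97` and an incoming score of `0.974321000` produce no upsert, because the change is lost in truncation.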
PMDB
- Create a new EPSS table in PMDB with an advisory identifier and the EPSS score. This includes changing the schema and any necessary migrations.
- Ingest EPSS data into new PMDB table. We want to keep the EPSS data structure as close as possible to the origin so all of the data may be available to the exporter, and the exporter may choose how to process it. Therefore we will save scores and percentiles with their complete values.
- Export EPSS scores to a separate bucket.
- Delete the previous day’s export, as it is no longer needed after the new one is added.
- Add new Pub/Sub topics to the deployment for use by PMDB components, using existing Terraform modules.
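The ingestion step above keeps scores and percentiles at their complete values. One way to sketch this is to parse the daily CSV into `Decimal` fields, which preserve every digit. This is a hedged example, not the feeder's actual code: the record type `EpssRecord`, the function `parse_epss_csv`, and the file layout (a `#` metadata line followed by a `cve,epss,percentile` header) are assumptions, and the score column name is a parameter because it may differ.

```python
import csv
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class EpssRecord:
    cve: str
    score: Decimal       # kept at full precision, no truncation here
    percentile: Decimal  # kept at full precision


def parse_epss_csv(text: str, score_column: str = "epss") -> list[EpssRecord]:
    """Parse EPSS CSV text into records, skipping '#' metadata and blank lines."""
    lines = [line for line in text.splitlines() if line and not line.startswith("#")]
    reader = csv.DictReader(lines)  # the first remaining line is the header row
    return [
        EpssRecord(
            cve=row["cve"],
            score=Decimal(row[score_column]),
            percentile=Decimal(row["percentile"]),
        )
        for row in reader
    ]
```

Keeping the values as `Decimal` rather than `float` leaves the decision of how to truncate or round to the exporter.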
GitLab Rails backend
- Create a table in the Rails backend to hold EPSS scores.
- Configure the Rails sync to ingest EPSS exports and save them to the new table.
- Include EPSS data attributes in GraphQL API Occurrence objects.
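The sync step can be pictured as an idempotent upsert keyed by CVE, so that re-running the same export changes nothing. The sketch below uses Python's built-in `sqlite3` as a stand-in for the Rails database. The table name `epss_scores`, its columns, and the function `sync_epss` are hypothetical and do not describe the actual schema.

```python
import sqlite3
from collections.abc import Iterable


def sync_epss(conn: sqlite3.Connection, records: Iterable[tuple[str, str]]) -> None:
    """Upsert (cve, score) pairs; rows with an unchanged score are left untouched."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS epss_scores ("
        "cve TEXT PRIMARY KEY, score TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO epss_scores (cve, score) VALUES (?, ?) "
        "ON CONFLICT(cve) DO UPDATE SET score = excluded.score "
        "WHERE epss_scores.score <> excluded.score",
        records,
    )
    conn.commit()
```

Because the upsert is keyed on `cve`, each daily export replaces the previous score instead of accumulating history.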
GitLab UI
- Add EPSS data to vulnerability report page.
- Add EPSS data to vulnerability details page.
- Allow filtering by EPSS score.
- Allow creating policies based on EPSS score.
Alternative Solutions
Glossary
- PMDB (Package metadata database, also known as License DB): a standalone service (not solely a database), outside of the Rails application, that gathers, stores, and exports package metadata for GitLab instances to consume. See complete documentation. PMDB components include:
  - Feeder: a scheduled job called by the PMDB deployment to publish data from the relevant sources to pub/sub messages consumed by PMDB processors.
  - Advisory processor: runs as a Cloud Run instance, consumes messages published by the advisory feeder containing advisory-related data, and stores them in the PMDB database.
  - PMDB database: a PostgreSQL instance storing license and advisory data.
  - Exporter: exports license/advisory data from the PMDB database to public GCP buckets.
- GitLab database: the database used by GitLab instances.
- CVE (Common Vulnerabilities and Exposures): a list of publicly known information-security vulnerabilities. “A CVE” usually refers to a specific vulnerability and its CVE ID.
- EPSS (Exploit Prediction Scoring System) score: a score ranging from 0 to 1 representing the probability that a given vulnerability will be exploited in the wild in the next 30 days.
- EPSS score percentile: for a given EPSS score (of some vulnerability), the proportion of all scored vulnerabilities with the same or a lower EPSS score.
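The percentile definition above can be made concrete with a short sketch. It follows the glossary wording only and does not claim to be how the EPSS publisher computes the value; the function name `epss_percentile` is hypothetical.

```python
from bisect import bisect_right


def epss_percentile(score: float, all_scores: list[float]) -> float:
    """Proportion of scored vulnerabilities with the same or a lower score."""
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    ordered = sorted(all_scores)
    # bisect_right counts the entries that are <= score.
    return bisect_right(ordered, score) / len(ordered)
```

For the scores `[0.01, 0.02, 0.02, 0.5, 0.9]`, a score of `0.02` has three entries at or below it out of five, so its percentile is `0.6`.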
EPSS Support ADR 002: Use a new bucket for EPSS data
EPSS Support ADR 003: Switched from EPSS API to CSV File