GitLab Data Classification Standard
This is a Controlled Document
In line with GitLab’s regulatory obligations, changes to controlled documents must be approved or merged by a code owner. All contributions are welcome and encouraged.
Purpose
The Data Classification Standard defines data types and categories and assigns each a Data Classification, which determines the level of protection to be applied to GitLab and Customer data throughout its lifecycle.
Scope
The Data Classification Standard applies to all GitLab team members, contractors, consultants, vendors and other service providers that handle, manage, store or transmit GitLab data.
Roles & Responsibilities
Role | Responsibility |
---|---|
GitLab Team Members | Responsible for adhering to the requirements outlined in this standard |
Data Owners | Responsible for approving exceptions to this standard for their owned data types. These are generally the Business Owners of a system. |
Security and Legal (Code Owners) | Responsible for approving significant changes and exceptions to this standard |
GitLab Responsibilities
- GitLab team members, contractors, consultants, vendors and all other service providers acting on behalf of GitLab are required to review and understand this data classification standard, and how to handle data according to the classification levels below unless otherwise noted.
- Data Owners shall determine the classification of data in accordance with this standard. The Data Classification Index (internal only) provides a list of various types of data and their classification level. If you cannot identify the data element or are uncertain of the risk associated with the data and how it should be classified and handled, please contact the Security Risk team in Slack via @security-risk.
- To maintain our culture of security and transparency, and to minimize the risk to our sensitive data and our customers, GitLab team members are required to complete Data Classification Training as part of GitLab’s Security Awareness Training. The training helps team members understand the different types of data at GitLab and how to keep it SAFE. Training is available via Level Up, GitLab’s internal learning platform.
Customer Responsibilities
- GitLab customers are responsible for managing their own data, including identifying and classifying it according to their own internal requirements. GitLab handles Customer Data internally according to the non-disclosure obligations set out in our Mutual Non Disclosure Agreement and the classifications identified in this standard.
Standard
Data Classification Definitions
- Personal Data: Any data, individually or when combined with other data, that identifies, relates to, describes or is reasonably capable of being associated with or linked to an identifiable natural person (a ‘data subject’), whether directly or indirectly.
- Customer Data: Refers to the electronic data, originating from the GitLab platform and supporting infrastructure, that was uploaded/created/generated by GitLab customers and processed in the GitLab application with a label of Private, Confidential, or Internal by the customer and subject to legal or contractual obligations.
Sharing of Customer-provided data within GitLab
Regardless of whether customer uploaded/created/generated data exists within private (e.g., projects, groups, sub-groups, and profiles), confidential (e.g., issues and epics), or internal (e.g., comments/notes) GitLab objects, it should not be shared with third parties unless they are an approved sub-processor, the data is being shared for required legal compliance purposes, or separate legal approval has been obtained.
Data Classification Levels
Examples of each data type: See Data Classification Index (internal only)
RED
Restricted and must remain confidential. This is GitLab’s most sensitive data and access to it should be considered privileged and must be explicitly approved. Exposure of this data to unauthorized parties could cause extreme loss to GitLab and/or its customers. In the gravest scenario, exposure of this data could trigger or cause a business extinction event.
Examples include:
- Customer Data (see definition above in the Data Classification Definitions section)
Red Data may not be transmitted from an approved Red data source to any other systems or solutions without first obtaining approval from the Privacy and Security teams. Any Vendors that process Red Data must first undergo a factual and legal analysis that justifies their processing in accordance with our Customer agreements, as well as global privacy and data security laws. For any questions or concerns related to the transmission of Red data between systems, please reach out to @Security-Risk within the #Sec-Assurance channel.
ORANGE
Data subject to laws and regulations that should not be made generally available. Unauthorized access or disclosure could cause significant financial or material loss, create a risk of harm to GitLab, breach contractual obligations, and/or adversely impact GitLab, its partners, employees, contractors, and customers.
Examples include:
- Personal Data
- Any vendor who is in possession of any form of Personal Data must have appropriate contractual terms that address GitLab data protection requirements (e.g. a Data Processing Agreement).
- If Personal Data comprises a part of the data set to be processed, then the data classification for that data set should be Orange and the classification cannot be Yellow or Green, even if the majority of the data set is Yellow or Green data.
- The source of the Personal Data should not change its classification to a level below Orange since Personal data gathered from public sources is not exempt from protection under certain data protection laws.
- If you have doubts as to whether something is Personal Data, see the exhaustive list of Personal Data elements in the Data Classification Index (internal only).
- GitLab Intellectual property
- Customer metadata
- Audit logs
- Open security incidents, vulnerabilities and risks
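The Personal Data rules in the list above can be sketched in code. This is a minimal illustration, not a GitLab tool: the `Level` ordering, the `DataElement` type, and the `classify_data_set` function are all hypothetical names introduced here. It assumes that a data set takes the highest level of any element it contains, that any Personal Data raises the floor to Orange, and that the source of the data is ignored. It does not model the Yellow exception for Team Member names, work email addresses, and GitLab usernames described in the next section.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Classification levels, ordered from least to most sensitive."""
    GREEN = 1
    YELLOW = 2
    ORANGE = 3
    RED = 4


@dataclass(frozen=True)
class DataElement:
    """Hypothetical description of one element in a data set."""
    name: str
    level: Level                      # level the element would carry on its own
    personal_data: bool = False
    from_public_source: bool = False  # recorded, but deliberately ignored below


def classify_data_set(elements):
    """Return the classification of a data set (illustrative only)."""
    elements = list(elements)
    if not elements:
        raise ValueError("cannot classify an empty data set")
    # Start from the most sensitive element in the set.
    level = max(e.level for e in elements)
    # Any Personal Data sets a floor of Orange, even if most of the set is
    # Yellow or Green and even if the data came from a public source.
    if any(e.personal_data for e in elements):
        level = max(level, Level.ORANGE)
    return level


mostly_green = [
    DataElement("page_views", Level.GREEN),
    DataElement("home_address", Level.GREEN, personal_data=True,
                from_public_source=True),
]
print(classify_data_set(mostly_green).name)
```

In this sketch, the mostly Green data set is classified Orange because one element is Personal Data. A data set that also contains Red data stays Red, since the floor never lowers a level.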
Personal Data Exception
While Personal Data is classified as Orange, there is an exception for GitLab Team Member names, their work email addresses, and their GitLab usernames, which are classified as Yellow. These three Personal Data elements are not considered high-risk or sensitive types of Personal Data. Given GitLab’s value of transparency and because GitLab is public by default, most Team Member names are available publicly. As they are often processed in support of everyday corporate operations, the application of Orange-level controls for these lower risk data elements would disproportionately inhibit GitLab’s business functions.
Personal Data and Team Member Safety
Please be aware of how combining data elements could affect a Team Member’s safety. For example, Team Member names are classified as Yellow, per the classification description below. But if you combine a Team Member’s name with their dates of travel and the site of a work event, you are possibly revealing that Team Member’s exact location, which is Orange level Personal Data. Any document or Issue containing a Team Member’s specific location should be set as “Confidential” in accordance with our Orange Data and Confidentiality Levels guidelines.
YELLOW
Data and information that should not be made publicly available that is created and used in the normal course of business. Unauthorized access or disclosure could cause minimal risk or harm and/or adversely impact GitLab, its partners, employees, contractors, and customers.
Examples include:
- Asset registers
- General internal company communications
- Vendor contracts
- GitLab runbooks/work instructions/manuals/policies/procedures containing data NOT appropriate for public consumption
- GitLab Team Member names, work email addresses, and GitLab usernames (see the Personal Data Exception above)
GREEN
Data that is publicly shareable, and does not expose GitLab or its customers to any harm or material impact.
Examples include:
- GitLab handbook
- Including most GitLab runbooks/work instructions/manuals/policies/procedures
- Public announcements
- Public product information
Data Classification Standards
Credentials and access tokens are classified at the same level as the data they protect
Credentials such as passwords, personal access tokens, encryption keys, and session cookies derive their classification from the highest classification of the data they protect.
Combinations of data types may result in a higher system classification level
If there is more than one data type residing in a system, the system should be classified at the highest data classification level of the data being stored, transmitted or processed on that system.
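Both rules in this section reduce to taking the highest level present. The sketch below shows that idea under stated assumptions: the `Level` enum and the function names `highest_level`, `classify_credential`, and `classify_system` are hypothetical and are not part of any GitLab system. The ordering Green < Yellow < Orange < Red is inferred from the level descriptions above.

```python
from enum import IntEnum


class Level(IntEnum):
    """Classification levels, ordered from least to most sensitive."""
    GREEN = 1
    YELLOW = 2
    ORANGE = 3
    RED = 4


def highest_level(levels):
    """Return the most sensitive level in a non-empty collection."""
    levels = list(levels)
    if not levels:
        raise ValueError("at least one classification level is required")
    return max(levels)


def classify_credential(protected_data_levels):
    """A credential takes the highest level of the data it protects."""
    return highest_level(protected_data_levels)


def classify_system(data_levels_on_system):
    """A system takes the highest level of the data it stores, transmits, or processes."""
    return highest_level(data_levels_on_system)


# A token that can reach both Yellow and Red data is treated as Red.
token_level = classify_credential([Level.YELLOW, Level.RED])
# A system holding Green and Orange data is treated as Orange.
system_level = classify_system([Level.GREEN, Level.ORANGE])
print(token_level.name, system_level.name)  # prints "RED ORANGE"
```

Raising an error on an empty input is a design choice in this sketch: it avoids silently defaulting an unclassified credential or system to Green.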
Labeling
There is currently no internal requirement to label data according to this standard; however, labels are encouraged. Labeling data by classification level lets individuals quickly refer to this standard for proper handling.
Exceptions
Exceptions to this standard will be tracked as per the Information Security Policy Exception Management Process.
References
- Controlled Document Procedure
- Data Classification Index (internal only)