Security Architects are the trusted security advisors of GitLab Engineering. Security Architecture is a natural extension of the greater Architecture initiative at GitLab. It is the preliminary and necessary work to build software with security considerations.
Security Architecture protects the organization from cyber harm, and supports present and future business needs by:
- Preventing Security from being an afterthought
- Conducting Security Architecture reviews
- Defining Security Architecture Principles
- Aligning with our security sub-departments' requirements and expectations
- Assisting other departments in the design and architecture of new features, services, and products
- Identifying the right SMEs and DRIs on the security side
- Driving security initiatives and features
The process is designed with these constraints in mind:
- aligned with our values
- self-service as much as possible
- avoid being a bottleneck in the software development life cycle
- deliberately simple and concise
- automated as much as possible
- DRY with strong notes
Scope of Security Architecture
Security Architecture covers any change in our product offering (whether it is a feature, a service, or an acquisition) that would impact our security posture. Our security posture is defined by:
- the components we build upon
- the components we embed
- everything infrastructure
- 3rd party services
- software architecture
- reference architectures
Security Architecture Requirements
The Application Security team provides guidelines and requirements to follow throughout the life cycle of source code:
- Do not roll your own crypto (also one of our Security Architecture Principles)
- Reference our GitLab Cryptography Standard
Security Architecture Principles
The Security Architecture Principles are neither requirements nor decisions, but something in between.
Our principles are based on two simple pillars:
- Least privilege
- Network isolation
They are detailed below with the principles taken from the book Software Systems Architecture (see references) and this ACCU 2019 related video. These are very close to the OWASP Security Design Principles but are easier to understand and apply.
Assign the least privilege possible
Broad privileges allow malicious or accidental access to protected resources.
- Give only the minimum level of access rights (privileges) that is necessary to a user or service to complete an assigned operation. This right must be given only for a minimum amount of time that is necessary to complete the operation.
- Do not use administrative accounts for application access
- Use separate accounts for sensitive data
- Run service processes as their own users with exactly the set of privileges they require
- Grant read-only permissions when no updates are required
- When updates are required, limit the scope to the target resource only
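The bullets above can be sketched as an access check that only succeeds for an exact, unexpired grant. This is a minimal illustration, not how GitLab implements permissions: the `Grant` type, the `is_allowed` helper, and the resource names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical grant: one subject, one resource, one action, with an expiry."""
    subject: str
    resource: str
    action: str  # "read" or "write"
    expires_at: datetime


def is_allowed(grants, subject, resource, action, now=None):
    """Return True only if an unexpired grant matches subject, resource, and action exactly."""
    now = now or datetime.now(timezone.utc)
    return any(
        g.subject == subject
        and g.resource == resource
        and g.action == action
        and now < g.expires_at
        for g in grants
    )


# A service that only needs to read commits gets a read-only grant on that
# resource, valid for 15 minutes (all values are illustrative).
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
grants = [Grant("report-service", "project/42/commits", "read", now + timedelta(minutes=15))]
```

With this shape, a write request, a request on another resource, or a request after the expiry is denied without any extra rule.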
Limit the blast radius of successful attacks: When one part of the system is compromised, the whole system is not.
Make attacks less attractive.
- Compartmentalize responsibilities and privileges
- Separation of duties: the successful completion of a single task is dependent upon two or more conditions
- Don’t store secrets along with other non-sensitive data (like settings), even if secrets are filtered out
- A system/service that only needs to read git commits should not be able to access user data
- GitLab team members don’t have access to billing data, nor anything else classified red data
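Separation of duties can be illustrated with a check that requires approval from people other than the author. This is a sketch under assumptions: `can_merge` and the threshold of two approvers are hypothetical and do not describe an actual GitLab rule.

```python
def can_merge(author, approvers, required=2):
    """Allow the action only when enough distinct people other than the author approved."""
    independent = set(approvers) - {author}
    return len(independent) >= required
```

Using a set means a duplicate approval or a self-approval does not count toward the threshold.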
- Many security problems are caused by inserting malicious intermediaries in the communication path
- Assume unknown entities are untrusted
- Have a clear process to establish trust
- Validate who or what is connecting
- Always use a kind of authentication (certificate, password, …)
- Network controls
- Do not dynamically load 3rd party code
- Services can’t be considered secure just because they are not exposed to the Internet. SSRF can let attackers freely access them.
- The best way to authenticate users is to apply this general security principle: Provide something you know (ex: password), and something you own (ex: certificate). This is what we apply with MFA, for example by providing a password you know, along with a TOTP that is generated by an application.
- Downloading 3rd party libraries or scripts at runtime can lead to many security issues, including cache poisoning and XSS. Without checking the integrity of the external asset, malicious actors can tamper with the files, like this example of BGP Hijacking
- Zero Trust at GitLab
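Checking the integrity of an external asset, as mentioned above, can be sketched by comparing its SHA-256 digest with a digest pinned when the asset was reviewed. The `verify_asset` helper and the sample asset are hypothetical; in this demo the pinned digest is computed inline, whereas in practice it would be recorded ahead of time and stored with the code.

```python
import hashlib
import hmac


def verify_asset(content: bytes, pinned_sha256_hex: str) -> bool:
    """Return True only if the asset's SHA-256 digest matches the pinned value."""
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())


# Illustrative asset and digest (assumption: the digest was pinned at review time).
asset = b"console.log('hello');"
pinned = hashlib.sha256(asset).hexdigest()
```

An asset that was modified in transit no longer matches the pinned digest and should be rejected rather than loaded.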
Simplest solution possible
- Simple solutions are easier to deploy, maintain, and secure
- Aligned with our Iteration and Efficiency values
- Security requires understanding of the design
- Complexity increases exponentially
- Attack-ability or attack surface of the software is reduced
- Avoid complex failure modes, implicit behaviours, unnecessary features
- Use well-known, tested, and proven components
- Avoid over-engineering and strive for MVCs instead
- Introducing a new server in GitLab means updating Omnibus builds, Helm charts, our reference architectures, our docs, and so on. This is something to balance carefully against the benefits of adding a component which seems to be a perfect fit.
Audit sensitive events
- Provide record of activity
- Deter wrongdoing
- Provide a log to reconstruct the past
- Provide a monitoring point
- Record all security significant events in a tamper-resistant store
- Provide notifications for all sensitive events
- Enable GuardDuty in AWS or Cloud Audit Logs in GCP to record activity and detect malicious intent.
- Leverage Panther (for gitlab.com only) to collect, normalize, and analyze logs.
- Provide notifications to users when:
  - Changes are made to their accounts
  - New keys are generated or added to their accounts
- Generate security events (could be Slack notifications) for unusual activity:
  - Signal passing a threshold (rate limiting in action)
  - Component signature not matching
  - Unauthorized access to sensitive resources
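One ingredient of a tamper-resistant store is making tampering detectable. The sketch below chains each audit entry to the previous one with a SHA-256 hash, so editing an earlier entry breaks verification. The function names and event fields are hypothetical, and a real store would also restrict who can write to the log.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash used before the first entry


def _entry_hash(prev_hash, event):
    """Hash the previous entry's hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event whose hash covers both its content and the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify_chain(log):
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_event(audit_log, {"actor": "alice", "action": "ssh_key_added"})
append_event(audit_log, {"actor": "bob", "action": "password_changed"})
```

An attacker able to rewrite the whole log could recompute the chain, so the latest hash should be kept somewhere the writer cannot modify.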
Fail securely & use secure defaults
- Default passwords, ports and rules are “open doors”
- Failure and restart states often default to “insecure”
- Force changes to security sensitive parameters
- Think through failures - to be secure but recoverable
- Unless a subject is given explicit access to an object, it should be denied access to that object, aka Fail Safe Defaults.
- Do not trust invalid/expired TLS certificates
- Some components like Grafana come with a default
- Related to above, some components might fail over to a plain user/password authentication (with default credentials) under certain conditions, like a service not reachable.
- Some frameworks tend to render error pages with details that should not be shared, like hostnames and paths, when they cannot connect to some resources.
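Fail-safe defaults can be sketched as an authorization check that denies unless an explicit rule allows the action, and that also denies when the lookup itself fails. The `ACL` table, subjects, and helper names are hypothetical.

```python
# Hypothetical explicit rules: (subject, object) -> allowed actions.
ACL = {
    ("alice", "project/42"): {"read", "write"},
    ("bob", "project/42"): {"read"},
}


def is_authorized(subject, obj, action, acl=ACL):
    """Default deny: access is granted only when an explicit rule allows it."""
    return action in acl.get((subject, obj), set())


def check_access(subject, obj, action, lookup=is_authorized):
    """Fail closed: if the authorization lookup raises, deny instead of allowing."""
    try:
        return bool(lookup(subject, obj, action))
    except Exception:
        return False
```

The important property is that both the missing-rule path and the error path end in a denial, so an unreachable backend never opens access.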
Never rely upon obscurity
- Hiding things is difficult: someone is going to find them, accidentally or on purpose
- We’re a very transparent company and are more likely to share implementation details, sometimes leaking something sensitive.
- Offboarded employees leave with sensitive knowledge. While tokens can be rotated, we can’t ensure this knowledge won’t leak.
- Assume an attacker with perfect knowledge
- Recon can help attackers find servers that are not publicly documented. These servers could expose vulnerable components, and lead to east-west movement.
- Changing the path to an admin section won’t prevent attackers from finding it eventually.
Implement defense in depth
- Systems do get attacked, breaches do happen, mistakes are made
- Minimize blast radius: One component compromised should not compromise the whole system
- Prevent SSRF
- Don’t rely on a single point/layer of security:
  - Secure every level
  - Stop failures at one level from propagating
- Encrypt data at rest and in transit
- Use vulnerability scanners
- Close unnecessary ports and disable unused features
- A resource is well protected when accessed via the UI, but could be more exposed via the API.
- Accounts are locked after too many failed attempts, in order to prevent brute-force attacks.
- OS command execution can bypass all application security layers, because the execution occurs outside of the application.
- Unnecessary open ports and enabled features may lead to authentication bypass and other weaknesses. They increase the exposure of an application.
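The account lockout mentioned above is one layer among several, and can be sketched as a small counter with a time-limited lock. The `LoginThrottle` class, its thresholds, and the injectable clock are assumptions made for illustration, not a description of GitLab's implementation.

```python
import time


class LoginThrottle:
    """Lock an account for a while after too many consecutive failed logins."""

    def __init__(self, max_attempts=5, lock_seconds=300, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.lock_seconds = lock_seconds
        self.clock = clock          # injectable so the behaviour can be tested
        self.failures = {}          # account -> consecutive failed attempts
        self.locked_until = {}      # account -> time at which the lock expires

    def is_locked(self, account):
        return self.clock() < self.locked_until.get(account, float("-inf"))

    def record_failure(self, account):
        count = self.failures.get(account, 0) + 1
        self.failures[account] = count
        if count >= self.max_attempts:
            self.locked_until[account] = self.clock() + self.lock_seconds
            self.failures[account] = 0

    def record_success(self, account):
        # A successful login resets the failure counter.
        self.failures.pop(account, None)
```

A caller would check `is_locked` before verifying credentials, so that a locked account is rejected even when the password is correct.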
Never invent security technology
- Security technology is difficult to create, and avoiding vulnerabilities is difficult
- It takes years to secure and mature new security technologies
- They are expected to be perfect (sort of)
- Do not roll your own crypto
- Use well-known and proven components
- When in doubt, always involve the right SMEs
- Do not implement SSO from scratch
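As an illustration of relying on proven components, the sketch below hashes and verifies a password using only primitives from the Python standard library (`hashlib.pbkdf2_hmac`, `secrets.token_bytes`, `hmac.compare_digest`) instead of a homegrown scheme. The helper names and the iteration count are illustrative assumptions; the actual algorithm and parameters should come from the applicable cryptography standard.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # illustrative work factor (assumption), not a mandated value


def hash_password(password, salt=None):
    """Derive a key with PBKDF2-HMAC-SHA256 using a random 16-byte salt."""
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Nothing here is new cryptography: the salt generation, key derivation, and comparison are all delegated to vetted library functions.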
Find the weakest link
- A system is only as secure as its weakest link
- Over time, new vulnerabilities are discovered, and a component might suddenly become the new weak link
- Threat model the system, repeat, iterate.
- Identify central components that
  - share more privileges than the others
  - have more connections to other components
  - are entrypoints (login modules, APIs, …)
- Run Dependency Scanning
- Avoid weak ciphers and algorithms
- Sometimes consider the humans (users) as the weakest link. Phishing is still widely used for a good reason.
- Some resources are very well protected in the UI, and never exposed to unauthorized users. Yet, if the API is not correctly implementing security controls, these resources could be passed as raw models without filtering sensitive data.
- Data encrypted in transit but not at rest.
- The weakest link could also be a user. Not enforcing strong passwords and MFA could lead to sensitive data exposure, but users can also take harmful actions without being aware of it.
- OS (system) commands often lead to bypassing most, if not all, of the security controls of an application. They are a common vector for RCEs and should be avoided as much as possible.
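The API example above, where raw models are passed without filtering, can be guarded against with an explicit allow-list of fields at serialization time. The field names and the `serialize_user` helper are hypothetical.

```python
# Only these fields may ever leave the API (illustrative allow-list).
PUBLIC_USER_FIELDS = ("id", "username", "name")


def serialize_user(user: dict) -> dict:
    """Build the API response from an allow-list instead of dumping the raw model."""
    return {field: user[field] for field in PUBLIC_USER_FIELDS if field in user}


user_record = {
    "id": 7,
    "username": "alice",
    "name": "Alice",
    "email": "alice@example.com",
    "encrypted_password": "not-for-clients",
}
```

With an allow-list, a sensitive attribute added to the model later stays private until someone deliberately exposes it, which is the safer failure mode compared to a deny-list.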
Security Architecture reviews
As part of the Production Readiness Process, it is highly recommended to include a Security Architecture review.
The Security Architecture review process is detailed on this page.
Security Architecture, by nature, doesn’t generate measurable data, apart from the number of architecture diagrams and reviews. While this could be used as a metric, it only reflects workload, not achievements. Instead, we measure success in terms of maturity.
We are targeting Maturity Level 1 for FY23-Q1, and our roadmap is discussed in this issue.
- Slack: #security-architecture
- GitLab namespace: https://gitlab.com/gitlab-com/gl-security/security-architecture
- Software Systems Architecture (ISBN-13: 978-0321718334)
- NIST CSF
- CIS Critical Security Controls
- OWASP Cyber Defense Matrix
- AWS Well Architected Framework
- OWASP Developer Guide Reboot
- Google Cloud: Optimize your system design using Architecture Framework Principles