Cloud Connector architecture evolution
Status | Authors | Coach | DRIs | Owning Stage | Created |
---|---|---|---|---|---|
implemented | mkaeppler | ayufan | rogerwoo, pjphillips | devops data stores | 2023-09-28 |
Summary
The Cloud Connector team is now disbanded. These pages are kept for now to give historical context.
This design doc covers architectural decisions and proposed changes aligned with the team’s technical vision. Refer to the official architecture documentation for an accurate description of the current status.
Motivation
Our “big problem to solve” is to bring feature parity to our SaaS and self-managed offerings.
Until now, SaaS and self-managed (SM) GitLab instances have consumed features only from the AI gateway, which also implements an Access Layer to verify that a given request is allowed to access the respective AI feature endpoint.
This approach has served us well because it:
- Required minimal changes from an architectural standpoint to allow SM users to consume AI features hosted by us.
- Caused minimal friction with ongoing development on GitLab.com.
- Reduced time to market.
However, the AI gateway alone does not sufficiently abstract over a wider variety of features, as by definition it is designed to serve AI features only.
Goals
We will use this blueprint to make incremental changes to Cloud Connector’s technical framework so that other backend services can serve self-managed and GitLab Dedicated customers in the same way the AI gateway does today. This will directly support our mission of bringing feature parity to all GitLab customers.
The major areas we are focused on are:
- Provide a single access point for customers.
We found that customers are not keen on configuring their web proxies and firewalls to allow outbound traffic to an ever-growing list of GitLab-hosted services. We therefore decided to install a global, load-balanced entry point at cloud.gitlab.com. This entry point can make simple routing decisions based on the requested path, which allows us to target different backend services as we broaden the feature scope covered by Cloud Connector.
- Status: done. The decision was documented as ADR-001.
- Remove OIDC key discovery.
The original architecture for Cloud Connector relied heavily on OIDC discovery to fetch JWT validation keys.
OIDC discovery is prone to networking and caching problems and adds complexity to solve a problem we don’t have.
Our proposed alternative to OIDC discovery is to package the public keys used for token validation from our well-known token issuers with Cloud Connector backends directly instead of fetching them over the network.
- Status: parked. We may publish a follow-up ADR for an alternative approach. The decision was documented as ADR-002.
- Rate-limiting features.
During periods of elevated traffic, backends integrated with Cloud Connector such as
AI gateway or TanuKey may experience resource constraints. GitLab should apply a consistent strategy when deciding which instance
should be prioritized over others. This strategy should be uniform across all Cloud Connector services.
- Status: In Progress.
- Extract CloudConnector unit_primitive configuration and logic.
We will implement a new unit primitive-based configuration system by extracting it to an external library (gitlab-cloud-connector) that will serve as the Single Source of Truth (SSoT).
This library will be available as both a Ruby gem and a Python package. The decision was documented as ADR-003.
- Status: In Progress.
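To make the "simple routing decisions based on the requested path" from the single access point item more concrete, here is a minimal sketch in Python. It is not the load balancer configuration described in ADR-001. The path prefixes, backend names, and the `resolve_backend` helper are all hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical routing table: path prefix -> backend service name.
# Neither the prefixes nor the names are taken from the real entry point.
ROUTES = {
    "/ai": "ai-gateway",
    "/tanukey": "tanukey",
}


def resolve_backend(path: str, routes: dict[str, str] = ROUTES) -> Optional[str]:
    """Return the backend for the longest matching path prefix, or None."""
    matches = [
        prefix
        for prefix in routes
        # Match the prefix itself or anything below it, but not "/aix" for "/ai".
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        return None
    return routes[max(matches, key=len)]
```

With a table like this, supporting a new backend means adding one entry, while customers keep a single host on their proxy and firewall allow-lists.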
Decisions
- ADR-001: Use load balancer as single entry point
- ADR-002: Remove OIDC key discovery
- ADR-003: Centralize Unit Primitives configuration
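As an illustration of the idea behind ADR-002, the sketch below selects a validation key from keys packaged with the backend instead of fetching them over the network. It is a sketch under stated assumptions, not the approach any Cloud Connector backend actually uses: the issuer URLs, key IDs, placeholder key values, and the `select_validation_key` helper are hypothetical. It only performs key lookup; verifying the signature with the selected key would still require a JWT library.

```python
import base64
import json

# Hypothetical bundled key set: issuer -> {key ID -> PEM-encoded public key}.
# The issuers, key IDs, and key values are placeholders for illustration.
BUNDLED_KEYS = {
    "https://issuer-a.example": {"key-1": "<PEM public key for key-1>"},
    "https://issuer-b.example": {"key-2": "<PEM public key for key-2>"},
}


def _decode_segment(segment: str) -> dict:
    """Decode one base64url-encoded JSON segment of a JWT."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def select_validation_key(token: str, bundled_keys: dict = BUNDLED_KEYS) -> str:
    """Pick the bundled public key for a token without any network call.

    The claims are read before verification, so they are used only to
    choose a key and must not be trusted until the signature is checked.
    """
    header_b64, payload_b64, _signature = token.split(".")
    header = _decode_segment(header_b64)
    claims = _decode_segment(payload_b64)

    issuer_keys = bundled_keys.get(claims.get("iss"))
    if issuer_keys is None:
        raise ValueError("token issuer is not a known bundled issuer")

    key = issuer_keys.get(header.get("kid"))
    if key is None:
        raise ValueError("no bundled key matches the token's key ID")
    return key
```

The trade-off in a design like this is that rotating a key means shipping a new backend release, in exchange for removing the networking and caching failure modes of discovery.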