Cells

This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.
Status Authors Coach DRIs Owning Stage Created
ongoing ayufan fzimmer DylanGriffith lohrc tkuah ayufan lohrc devops data stores 2022-09-07

This document is a work-in-progress and represents a very early state of the Cells design. Significant aspects are not documented, though we expect to add them in the future.

Cells is a new architecture for our software as a service platform. This architecture is horizontally scalable, resilient, and provides a more consistent user experience. It may also provide additional features in the future, such as data residency control (regions) and federated features.

For more information about Cells, see also:

Cells Iterations

  • The Cells 1.0 target is to deliver a solution for internal customers using the SaaS GitLab.com offering, and foundational work for Cells.
  • The Cells 1.5 target is to deliver a migration solution for existing and new enterprise customers using the SaaS GitLab.com offering, built on top of the Cells 1.0 architecture.
  • The Cells 2.0 target is to support a public and open source contribution model in a cellular architecture.

Goals

See Goals, Glossary and Requirements.

Technical proposals

The Cells architecture has long-lasting implications for data processing, data location, scalability, and the GitLab architecture. This section links to the technical proposals that are being evaluated.

Impacted features

The Cells architecture will impact many features, and some of them will need to be rewritten or changed significantly. Below is a list of known affected features with preliminary proposed solutions.

Impacted features: Placeholders

The following list of impacted features only represents placeholders that still require work to estimate the impact of Cells and develop solution proposals.

Frequently Asked Questions

What’s the difference between Cells architecture and GitLab Dedicated?

We’ve captured individual thoughts and differences between Cells and Dedicated over here

The new Cells architecture is meant to scale GitLab.com by moving Organizations into Cells. Different Organizations can still share server resources, even though the application isolates each Organization from the others, and all of them still operate under the existing GitLab SaaS domain name, gitlab.com. Cells also share some common data, such as users and the routing information of Groups and Projects. For example, no two users can have the same username, even if they belong to different Organizations that exist on different Cells.

GitLab Dedicated, on the other hand, is meant to provide a completely isolated GitLab instance for any Organization. This instance runs on its own custom domain name and is totally isolated from any other GitLab instance, including GitLab SaaS. For example, a user on GitLab Dedicated can use a username that is already taken on GitLab.com.

Given these differences, GitLab Dedicated is offered at a higher cost because it is provisioned with dedicated server resources for each customer, while Cells use shared resources. This makes GitLab Dedicated better suited to bigger customers, and Cells better suited to small and mid-size companies that are starting on GitLab.com.
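The username example can be illustrated with a minimal sketch. It assumes a hypothetical `UsernameRegistry` that is shared by every Cell in one cluster, while a GitLab Dedicated instance keeps its own separate registry. The class, method, and Cell names are illustrative only and do not describe how GitLab actually stores users.

```python
class UsernameTaken(Exception):
    """Raised when a username is already claimed in the same registry."""


class UsernameRegistry:
    """Hypothetical username store. One instance is shared by all Cells
    of a cluster; an isolated instance would have its own registry."""

    def __init__(self):
        self._owners = {}  # username -> (cell, organization)

    def claim(self, username, cell, organization):
        # Uniqueness is enforced per registry, not per Cell or Organization.
        if username in self._owners:
            raise UsernameTaken(username)
        self._owners[username] = (cell, organization)

    def owner(self, username):
        return self._owners.get(username)


# GitLab.com: two Organizations on different Cells share one registry.
gitlab_com = UsernameRegistry()
gitlab_com.claim("alice", cell="cell-1", organization="org-a")
try:
    gitlab_com.claim("alice", cell="cell-2", organization="org-b")
    conflict = False
except UsernameTaken:
    conflict = True

# An isolated instance has its own registry, so the same name is free.
dedicated = UsernameRegistry()
dedicated.claim("alice", cell="dedicated", organization="customer")
```

In this sketch the second claim on `gitlab_com` is rejected even though it comes from a different Cell and Organization, while the claim on `dedicated` succeeds because nothing is shared with GitLab.com.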

Can different Cells communicate with each other?

Not directly. Our goal is to keep Cells isolated so that they communicate only through global services.

How are Cells provisioned?

The GitLab.com cluster of Cells will use GitLab Dedicated tooling to create instances. Once an instance is provisioned, it can join the GitLab.com cluster and become a Cell. One requirement is that the instance does not contain any prior data.
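The "no prior data" requirement can be sketched as a guard on the join step. This is a minimal illustration under stated assumptions: the `Instance` and `Cluster` types are hypothetical, and treating "prior data" as "any existing Organizations or Projects" is an assumption made for the example, not a definition from the design.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical freshly provisioned GitLab instance."""
    name: str
    organizations: list = field(default_factory=list)
    projects: list = field(default_factory=list)

    def has_prior_data(self):
        # Assumption: any Organization or Project counts as prior data.
        return bool(self.organizations or self.projects)


class Cluster:
    """Hypothetical cluster that tracks which instances have become Cells."""

    def __init__(self):
        self.cells = []

    def join(self, instance):
        # Refuse instances that already hold data; only empty ones may join.
        if instance.has_prior_data():
            raise ValueError(f"{instance.name} contains prior data")
        self.cells.append(instance.name)
        return instance.name


cluster = Cluster()
cluster.join(Instance(name="cell-1"))

try:
    cluster.join(Instance(name="cell-2", projects=["legacy/project"]))
    rejected = False
except ValueError:
    rejected = True
```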

To reach shared resources, Cells will use Private Service Connect.

See also the design discussion.

What is a Cells topology?

See the design discussion.

How are users of an Organization routed to the correct Cell?

TBD

How do users authenticate with Cells and Organizations?

See the design discussion.

How are Cells rebalanced?

TBD

How can Cells implement disaster recovery capabilities?

TBD

How do I decide whether to move my feature to the cluster, Cell or Organization level?

By default, features are required to be scoped to the Organization level. Any deviation from that rule should be validated and approved by Tenant Scale.

The design goals of the Cells architecture state that all Cells are under a single domain, so Cells are invisible to the user:

  • Cell-local features should be limited to those related to managing the Cell, but never be a feature where the Cell semantic is exposed to the customer.
  • The Cells architecture wants to freely control the distribution of Organization and customer data across Cells without impacting users when data is migrated.

Cluster-wide features are strongly discouraged because:

  • They might require storing a substantial amount of data cluster-wide which decreases scalability headroom.
  • They might require implementation of non-trivial data aggregation that reduces resilience to single node failure.
  • They are harder to build because they must be able to run in mixed deployments. Cluster-wide features need to take this into account.
  • They might affect our ability to provide an on-premises-like experience on GitLab.com.
  • Some features that are expected to be cluster-wide might in fact be better implemented using aggregation techniques that rely on trusted intra-cluster communication using the same user identity. For example, the user Profile is shared across the cluster.
  • The Cells architecture limits what services can be considered cluster-wide. Services that might initially be cluster-wide are still expected to be split in the future to achieve full service isolation. No feature should be built to depend on such a service (like Elasticsearch).
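The scoping rules above can be summarized as a small decision helper. This is only a sketch of the policy as described in this section: the `Scope` enum, the `review_scope` function, and its parameters are hypothetical names, and the real approval process is a human review by Tenant Scale rather than a function call.

```python
from enum import Enum


class Scope(Enum):
    ORGANIZATION = "organization"
    CELL = "cell"
    CLUSTER = "cluster"


def review_scope(scope, approved_by_tenant_scale=False,
                 exposes_cell_to_customer=False):
    """Return (allowed, reason) for a proposed feature scope."""
    # Organization scope is the default and needs no extra approval.
    if scope is Scope.ORGANIZATION:
        return True, "default scope"
    # Cell-local features must never expose Cell semantics to customers.
    if scope is Scope.CELL and exposes_cell_to_customer:
        return False, "Cell semantics must not be exposed to customers"
    # Any other deviation from the default needs Tenant Scale approval.
    if not approved_by_tenant_scale:
        return False, "deviation requires Tenant Scale approval"
    return True, "approved deviation"


default_ok = review_scope(Scope.ORGANIZATION)
cluster_unapproved = review_scope(Scope.CLUSTER)
cell_exposed = review_scope(Scope.CELL, approved_by_tenant_scale=True,
                            exposes_cell_to_customer=True)
cell_admin = review_scope(Scope.CELL, approved_by_tenant_scale=True)
```

The helper deliberately does not encode the reasons cluster-wide features are discouraged. It only reflects that Organization scope is the default and that anything else needs explicit approval.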

Will Cells use the reference architecture for up to 1000 RPS or 50,000 users?

See reference architecture for up to 1000 RPS or 50,000 users.

The infrastructure team will properly size Cells depending on the load. The Tenant Scale team sees an opportunity to use GitLab Dedicated as a base for Cells deployment.

Decision log

Last modified November 13, 2024: Move gitaly pages over to data access (c16c2006)