Cells
Intro
Cells is a new architecture for our software-as-a-service platform. The architecture is horizontally scalable and resilient, and it provides a more consistent user experience. It may also enable additional features in the future, such as data residency control (regions) and federated features.
For more information about the goals of Cells, see goals.
Requirements and Architecture
Cells overall architecture blueprint.
Roadmap, Workstreams, and DRIs
Roadmap
DRIs and Stakeholders
Role | Responsibility
Executive Sponsor |
Senior Director of Engineering |
Director of Engineering |
Senior Engineering Manager |
Tenant Scale Engineering Manager |
Director of Product Management |
Tenant Scale Product Manager |
Staff Fullstack Engineer, Expansion | DRI of Expansion Software Development
Workstreams
Work stream | Engineering DRI | PM DRI | TPM DRI
Application's Cell readiness | | |
Organization for Cells | | |
Architecture | | |
Cells Services (includes Router and Topology services) | | |
Cell lifecycle automation and management | | |
Observability | | |
Application Deployment | | |
Production readiness | | |
Operations | | |
Performance validation of Cells | | |
Cells 1.0
All Cells 1.0 work is tracked under the Cells 1.0 Epic. The Epic is split into multiple phases, each of which represents an iteration toward Cells 1.0. Some of these phases depend on one another, and some can run in parallel.
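As a rough illustration of how the phase dependencies determine what can run in parallel, the following Python sketch groups phases into batches using the standard-library `graphlib` module. It only encodes the dependencies listed on this page for Phases 1 to 3 (Phase 4 lists none here, so it is left out); the `DEPENDENCIES` mapping and the `parallel_batches` helper are illustrative names, not part of any Cells tooling.

```python
from graphlib import TopologicalSorter

# Dependencies as listed under each phase on this page.
# Phase 4 is omitted because no dependencies are listed for it.
DEPENDENCIES = {
    "Phase 1": set(),
    "Phase 2": set(),
    "Phase 3": {"Phase 1", "Phase 2"},
}


def parallel_batches(deps):
    """Group phases into batches; phases in the same batch can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned by get_ready() has all of its dependencies done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(DEPENDENCIES), start=1):
        print(f"Batch {number}: {', '.join(batch)}")
```

Under these assumptions, Phase 1 and Phase 2 land in the first batch and Phase 3 in the second.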
Phase 1: PreQA Cell
Exit Criteria:
- New GCP organizations created.
- Break glass procedure.
- Ring definition exists.
- Cell provisioned using dedicated stack.
- Able to do configuration changes to Cell.
- Cell available at xxx.cells.gitlab.com.
- Cell doesn’t handle data uniqueness.
Unblocks:
- Phase 3: To provision runway deployment for Topology Service
- Delivery team: Start testing deploys on rings
Dependencies:
- None
Epic:
Phase 2: GitLab.com HTTPS Passthrough Proxy
Exit Criteria:
- 100% of API traffic goes through router using passthrough proxy rule.
- 100% of Web traffic goes through router using passthrough proxy rule.
- 100% of Git HTTPS traffic goes through router using passthrough proxy rule.
- Requests meet latency target
- registry.gitlab.com not proxied.
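One way to read the exit criteria above is as a single host-based decision: API, Web, and Git HTTPS requests all go through the router under a passthrough rule, while registry.gitlab.com does not. The Python sketch below illustrates that reading only. The `Decision` type, the `decide` function, and the choice to decide by host name are assumptions made for illustration, not the router's actual implementation.

```python
from dataclasses import dataclass

# Assumption: hosts that are not sent through the router in this phase.
UNPROXIED_HOSTS = frozenset({"registry.gitlab.com"})


@dataclass(frozen=True)
class Decision:
    via_router: bool  # whether the request goes through the router at all
    rule: str         # hypothetical rule name: "passthrough" or "bypass"


def decide(host: str) -> Decision:
    """Return how a request for `host` is handled in a passthrough-only setup."""
    if host.lower() in UNPROXIED_HOSTS:
        return Decision(via_router=False, rule="bypass")
    # Passthrough: the router forwards the request unchanged to the existing backend.
    return Decision(via_router=True, rule="passthrough")
```

With only a passthrough rule in place, the router changes no routing outcome, which is what lets later phases add rules on top of it.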
Unblocks:
- Phase 3: Router can be configured with additional rules.
Dependencies:
- None
Epic:
Phase 3: GitLab.com HTTPS Session Routing
Exit Criteria:
- PreQA Cell configured to generate _gitlab_session with prefix using Rails config.
- Route _gitlab_session with matching prefix to PreQA Cell using TopologyService::Classify (REST only) with static config file.
- Continuous Delivery on Ring 0 with no rollback capabilities and doesn’t block production deployments.
- Topology Service Readiness Review for Experiment
- Topology Service gRPC endpoint not implemented.
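The session routing described in the exit criteria above can be sketched as a prefix lookup on the _gitlab_session cookie. This Python sketch is an illustration under stated assumptions: the prefix value `cell1-`, the cell names, and the in-memory dictionary standing in for the static config file are all hypothetical, and the real classification is done by TopologyService::Classify over REST rather than by a local function.

```python
from http.cookies import SimpleCookie
from typing import Optional

# Hypothetical static config: session-cookie prefix -> cell name.
STATIC_PREFIX_TO_CELL = {"cell1-": "preqa-cell"}
# Hypothetical name for the existing GitLab.com deployment.
DEFAULT_CELL = "legacy"


def session_value(cookie_header: str) -> Optional[str]:
    """Extract the _gitlab_session value from a Cookie header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("_gitlab_session")
    return morsel.value if morsel is not None else None


def classify(cookie_header: str) -> str:
    """Return the cell whose configured prefix matches the session cookie."""
    value = session_value(cookie_header)
    if value is not None:
        for prefix, cell in STATIC_PREFIX_TO_CELL.items():
            if value.startswith(prefix):
                return cell
    # No cookie, or no matching prefix: keep routing to the default backend.
    return DEFAULT_CELL
```

Requests without a matching prefix fall through to the default, so sessions created before the prefix was introduced keep working unchanged.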
Unblocks:
Before/After:
Dependencies:
- Phase 2: Passthrough proxy needs to be deployed.
- Phase 1: GCP organizations, Ring definition exists.
Epic:
Phase 4: GitLab.com HTTPS Token Routing
Exit Criteria:
- Framework to generate routable tokens in Rails.
- Framework to classify routable tokens in HTTP Router.
- Topology Service able to classify based on more criteria.
- Route Personal Access Tokens to different Cells using TopologyService::Classify.
- Support PRIVATE-TOKEN: and Authorization: HTTP headers for Personal Access Tokens; create issues for the others, to be solved in following phases.
- Each routing rule added should be covered with relevant e2e tests.
- Route Job Tokens and Runner Registration to different Cells using TopologyService::Classify.
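To make the token routing criteria above more concrete, the Python sketch below extracts a Personal Access Token from either the PRIVATE-TOKEN: or the Authorization: header and maps it to a cell. The token layout used here (a routing hint before the first dot, after the `glpat-` prefix) is purely hypothetical and is not the routable token format; the hint-to-cell mapping and cell names are also made up. In the design described on this page, classification is performed by TopologyService::Classify rather than by a local lookup.

```python
from typing import Mapping, Optional

PAT_PREFIX = "glpat-"
# Hypothetical mapping from a routing hint embedded in the token to a cell.
HINT_TO_CELL = {"c1": "cell-1", "c2": "cell-2"}
# Hypothetical name for the existing GitLab.com deployment.
DEFAULT_CELL = "legacy"


def extract_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from PRIVATE-TOKEN or Authorization: Bearer, if any."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    if lowered.get("private-token"):
        return lowered["private-token"]
    scheme, _, credentials = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and credentials.strip():
        return credentials.strip()
    return None


def classify(headers: Mapping[str, str]) -> str:
    """Pick a cell from the (hypothetical) routing hint inside the token."""
    token = extract_token(headers)
    if token is None or not token.startswith(PAT_PREFIX):
        return DEFAULT_CELL
    hint, separator, _ = token[len(PAT_PREFIX):].partition(".")
    if not separator:
        # Token carries no routing hint: treat it as non-routable.
        return DEFAULT_CELL
    return HINT_TO_CELL.get(hint, DEFAULT_CELL)
```

Falling back to the default cell for tokens without a hint keeps existing, non-routable tokens working while routable ones are introduced.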
Before/After:
Epic:
Communication
Slack Channels
- #f_cells_and_organizations (internal only): Regular communication
- #cto (internal only): Weekly program status update
Meetings
- Cells Standup (weekly): Meeting notes (internal only)
Status updates
- Weekly “Cells & Organizations Status Update - [yyyy-mm-dd]” issues in this project
- Weekly status updates in the Slack #cto channel (internal only)
Additional Information
Cells Fast Boot 2024
We held a Cells Fast Boot in Dublin, Ireland, between 2024-04-23 and 2024-04-24. Below are the artifacts from the event.
Agenda, Slides, and Videos
Please use the Unfiltered Google account to watch video recordings.
- Main agenda (internal only)
- Introductions, overview, and logistics: Agenda (internal only)
- Cells Services - Global Service: Agenda (internal only), Slides (internal only), Video (internal only)
- Cells Services - Routing: Agenda (internal only), Slides (internal only), Video (internal only)
- Application Readiness - Organizations and Users: Agenda (internal only)
- Application Readiness - Dependencies and OKR alignments: Agenda (internal only)
- Deployment: Agenda (internal only), Slides (internal only), Video (internal only)
- Provisioning: Agenda (internal only)
- Observability and Runners: Agenda (internal only)
- Security: Agenda (internal only), Slides (internal only), Video (internal only)
- Disaster Recovery: Agenda (internal only), Slides (internal only), Video (internal only)
- Cells Mover and Isolation: Agenda (internal only)
- Scalability Headroom and Timeline: Agenda (internal only)
Decisions
- No external customers on Cells 1.0, internal dogfooding only. Cells 1.x is the target to onboard new or existing external customers.
Artifacts
- Day 1 recording: Part 1 (internal only), Part 2 (internal only)
- Day 2 recording (internal only)
- Database breakout recording (internal only)
- Organizations breakout recording (internal only)