Modularized development components

This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.

Status	Authors	Coach	DRIs	Owning Stage	Created
proposed	`gitlab.mschoenlaub` `kkloss`	`splattael`	`mgamea`	devops developer experience	2025-11-05

[!note] This is a sub-blueprint for Cloud-native development

Summary

The current architectural design of the GitLab Development Kit (GDK) poses significant scalability and ownership concerns with a risk in increasing maintenance cost for the GDK maintainers, stability concerns, and adopting modern, cloud-native technologies in development.

By moving to a declarative, self-service model, we also align development environment setups with the company-wide Component Ownership Model.

Motivation

We use the GitLab Development Kit (GDK) to set up GitLab development instances for local development and reviews. It orchestrates required and optional services, manages service configuration, and assists in troubleshooting.

Problem statement

The monolithic architecture of GDK creates significant pain points that impact development velocity and system stability, such as:

Tight coupling between component integration and orchestration code
Monolithic nature of GDK prevents self-service ownership
Absence of clear ownership boundaries
Lack of component integration/smoke tests

Current component definitions are scattered across the GDK codebase in at least seven different locations spanning multiple programming languages (Ruby, Shell, ERB templates, Makefiles, …). Due to the tight coupling of existing services and the GDK core code, adding and managing components as a component-owning team is cumbersome. Component definition being scattered and using Makefiles has frequently led to support requests from the Development Tooling team during component integration.

This absence of clear boundaries means that many components remain untested and unmaintained. When services fail, it’s unclear whether the issue lies in GDK core orchestration code or the component itself, leaving the Development Tooling team responsible for service-caused instabilities they don’t own.

Current challenges

These architectural limitations manifest in concrete challenges:

Reactive troubleshooting burden: Developers spend significant time troubleshooting GDK issues that are difficult to diagnose due to the lack of component isolation.

Slow component integration: Setting up and integrating new services takes 2–3 days for complex features such as AI services. This involves navigating the GDK codebase to find all the places that need modification, understanding implicit dependencies, and manually testing integration points without automated verification.

Unmaintained component integrations: Without clear ownership attribution, many component integrations deteriorate over time. Component owners lack visibility into the GDK integration health of their services, and the Development Tooling team lacks the domain expertise and capacity to maintain all 40+ components effectively.

Impact on business

This monolithic architecture directly impacts the GitLab development velocity:

Engineering productivity: The 2-3 day integration time for new services multiplied across teams significantly slows feature delivery, particularly for strategic initiatives like AI capabilities.
Increased maintenance costs: Around 77% of the internal support requests through Slack and 33% of bug-reports relating to GDK, which are handled by the Development Tooling team, are about component-specific issues instead of improving core GDK functionality. This friction reduces the innovation velocity of this team.
Quality and stability concerns: Integration bugs escaping to production create customer-facing incidents and erode confidence in the development workflow.

Goals

Create a decoupled, declarative interface to register GDK components.
Separate GDK components from GDK core code.
Establish clear component ownership (and stability!) attribution.
Ease the integration of future components into GDK with the declarative interface and documentation.

Non-Goals

Improve stability of components.
Physically isolate component services.

Success metrics

Service Integration Velocity: 50% reduction in effort needed to integrate new services with GDK, including the first bug-fixing iterations.
Reduced Reactive Work for Developer Experience Team: 50% reduction in number of bug-reports and support requests related to components not owned by Development Tooling.

Proposal

By modularizing GDK, we transform the current monolithic development environment architecture into a component-based architecture where each GitLab component (such as Rails, Gitaly, Workhorse, PostgreSQL, and Redis) becomes a self-contained module that allows for integration through a standardized interface and logical isolation from the core GDK framework.

Standardized component interface

Each GDK component exposes a consistent contract that defines:

Dependencies: What other components this service requires (for example, Rails requires PostgreSQL)
Configuration: Environment variables, ports, and settings the component needs
Health checks: How to verify the component is running correctly
Lifecycle hooks: Service start commands, setup instructions, and initialization procedures

New components can be integrated by implementing this standard, well-known interface without having to modify the core GDK code. This makes it much easier for component owners to integrate their component into GDK.

Migration Path

The transition to modularized GDK happens incrementally.

Foundation: Establish the base component interface standard
Smoke tests: Introduce component ownership and smoke tests for each component
Immediate results: Refactor 3-5 core services (Rails, PostgreSQL, Redis)
Default: Make modular mode the default for new GDK installations
Expansion: Convert remaining services to modular components

This approach ensures that existing GDK installations continue to work unchanged during the transition. It also provides immediate results by adding smoke tests to yet untested components and refactoring a few components as examples.

Design and implementation details

We are proposing a component definition file that records the following properties about a component in a single file per component:

Name (inferred from directory)
Owning product category
Optional: Repository
File templates
Component configuration
Automated installation commands
CI verification steps (aka. integration smoke tests)
Services
- Enabled status
- Runtime command
- Environment variables
Smoke tests
Health checks

Comparison

Property	Before	After
Ownership	Not defined	Defined in each `GDK.rb`
Repository	Defined in Makefiles and in global `config.rb`	Defined in each `GDK.rb`
File templates	In the global `support/templates/` directory	In `components/<name>/` directory
Component configuration	In global `config.rb`	Defined in each `GDK.rb`
Installation commands	In Makefiles	Defined in each `GDK.rb`
Service definition	In static `lib/gdk/services/*.rb` files	Dynamically in each `GDK.rb`

Component directory structure

Each component file and the associated file templates, such as gitaly.config.toml, are stored in the components directory in folder that’s named in kebab-case after the component name, such as gitlab-http-router. This allows GDK to infer the name from a single source of truth.

<gdk_root>
└── components/
    ├── gitaly/
    │   ├── GDK.rb                    # Component definition
    │   └── gitaly.config.toml.erb    # Configuration template
    ├── workhorse/
    │   ├── GDK.rb
    │   └── config.toml.erb
    └── redis/
        ├── GDK.rb
        └── redis.conf.erb

This structure provides clear separation of concerns and makes component ownership explicit through directory boundaries. It also enables us to use the CODEOWNERS feature of GitLab to enforce component ownership boundaries.

Code ownership

Each component directory is assigned to its owning team (never an individual) through an entry in .gitlab/CODEOWNERS:

/components/gitaly/          @...
/components/workhorse/       @...
/components/redis/           @...
/components/gitlab-rails/    @...

This assignment allows:

Automatic code review routing to the owning team, reducing load for the Development Tooling team
Ownership being defined as code and clearly visible in the GitLab UI
Teams to iterate on their component definitions without waiting for GDK maintainer availability

For third-party components (PostgreSQL, Redis, Elasticsearch), ownership defaults to the Development Tooling team, unless component-specific owners are identified.

Service runtime types

Services can specify different runtime types:

Command (default): Traditional process execution
Docker (future): Future container-based execution for isolated services
Runway (future): Support for Runway-based services

The runtime specification allows GDK to evolve toward containerized development without breaking existing components.

Component lifecycle

Components follow a standardized lifecycle:

Discovery: GDK scans the components/ directory and loads all GDK.rb files
Configuration: Component configuration is merged with user-provided settings
Dependency resolution: Component dependencies are resolved to determine initialization order
Service registration: Services are registered with the service runtime based on the configuration

After these steps, GDK can enter the components into the following workflows:

Update: GDK performs a dependency-aware update of all components, setting up those that aren’t set up yet
Reconfigure: GDK generates all config files based on the registered templates
Runtime: Services are started, stopped, and monitored according to their definitions
Verify: In CI, components are set up based on the instructions and smoke-tested

Ownership and stability attribution

Each component definition explicitly declares its owning product category through the feature_category field. This allows us to delegate ownership of components to their actual owners. This has several benefits.

When a component service fails, GDK displays the owning team information, which directs GDK users to the appropriate support channels instead of the catch-all ones maintained by Development Tooling.

This also allows us to separately track bugs of GDK Core, which alongside additions and improvements to the component definition schema and CLI tool, is the only work the Development Tooling team should be responsible for in GDK.

Likewise, we will establish error budgets—similar to production—, so that GDK stability is attributed to component owners. To visualize this, we will create a Grafana dashboard showing stability for each component, which will be used by the owners to be aware of them and remediate these themselves.

Smoke tests and health checks

We first create smoke tests for every component. These tests run in CI, where they set up a given component and verify that its service(s) are healthy. This closes an existing test gap in GDK and ensures that we don’t break service installations while migrating to the new component definitions.

Later, we will integrate health checks that are read-only and are used to report whether a service is healthy or not to the user.

We can use health checks in smoke tests but some smoke tests, such as creating a file in Gitaly and checking that it was created, are read-write operations that aren’t useful for users.

Backward compatibility

During the migration period, GDK maintains backward compatibility by supporting both the legacy service and new component registration methods simultaneously. Thanks to the smoke tests, we ensure that the migration doesn’t break existing installations.

As soon as a feature is available in the component interface, it will become the required default for new components, so we avoid a tailing backlog problem, where new services being added follow the legacy service registration.

Alternative Solutions

Assign ownership without modularization

Pros: low upfront cost.
Cons: stability stays near impossible to attribute and adding new services stays cumbersome.

Do nothing

Pros: no upfront cost.
Cons: more reactive work for Development Tooling slowing planned efforts, worse developer experience due to inaction from other teams, and adding new services stays cumbersome.

Last modified December 17, 2025: Add top-level and modularization blueprint for cloud-native development (9e91ada1)

View page source - Edit this page - please contribute.