Performance Testing for Modular Features

A contract-based approach to performance testing for modular feature services, enabling shift-left regression detection and AI-assisted analysis

Overview

Contract testing is the practice of defining the external surface of a service and writing a machine-readable “contract” that specifies how the service will behave. This approach provides several benefits:

  • Testable agreements - Automated tests verify the contract hasn’t been broken
  • Clear interfaces - External services can design integrations with confidence
  • Breaking change detection - Automated validation catches incompatible changes

Performance contracts extend this concept to the performance characteristics of a modular feature. By encoding performance targets into a validated YAML file (performance.yaml), teams gain:

  • Earlier regression detection - Every MR is validated against the contract
  • AI-aware performance governance - AI coding assistants have concrete, machine-readable performance rules
  • Standardized adoption - Reusable contract schema and validation toolkit for any modular feature

Scope

Performance contracts are scoped to modular feature services running in CI-accessible environments. For a full list of what is explicitly out of scope, see the Performance Testing for Modular Features design document.

Key boundaries for the current iteration:

  • Not a production SLO tool - Contracts inform SLOs but do not replace them
  • Not a local testing tool - Contract tests run in CI against a transitory environment, not on a developer’s laptop (planned for a future iteration)
  • Not a combination testing tool - Each service contract is validated independently; cross-service integration performance is out of scope

Architecture

The performance contract system works as follows:

flowchart LR
  subgraph APP[Service under test]
    CONTRACT[performance.yaml]
  end
  subgraph RUNNER[CPT - Component Performance Testing]
    VALIDATION[Schema Validation]
    ENVMAN[Environment Management]
    LOAD[Load Testing / k6]
  end
  subgraph ENV[Test Environment]
    SERVICE[Running Service]
    OBS[Observability Stack]
  end
  subgraph REPORTING[Reporting]
    RESULT[Test Results]
    AI[AI Agent Analysis]
  end
  CONTRACT --> VALIDATION
  VALIDATION --> LOAD
  ENVMAN -- standup/teardown --> ENV
  SERVICE -- metrics --> OBS
  LOAD -- HTTP requests --> SERVICE
  LOAD --> RESULT
  OBS --> AI
  RESULT --> AI
  CONTRACT --> AI
  AI --> DEV[Developer Feedback]

The performance.yaml Contract

The performance.yaml file is the single entry point for the system - it drives contract tooling, load test execution, and AI analysis. It defines:

  • Endpoint categories with latency percentile targets
  • Error rate thresholds
  • Resource budgets (memory, CPU, connection pool)
  • Database constraints (query latency, max queries per request)
  • SLI mappings to Prometheus metrics
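
Putting these sections together, a minimal contract has roughly the following shape. The values are illustrative placeholders drawn from the examples on this page, not recommended defaults; each section is described under Schema Definition below.

version: "1.0"
service:
  name: "example-service"
  description: "Example modular feature performance contract"
endpoints:
  fast_reads:
    description: "Single item lookup by ID"
    routes:
      - "GET /api/v1/items/{id}"
    metrics:
      latency_p95_ms: 100
      latency_p99_ms: 250
      error_rate_threshold: 0.001
resources:
  memory_limit_mb: 256
  cpu_limit_cores: 0.5
database:
  query_latency_p95_ms: 30
  max_queries_per_request: 5
sli_mapping:
  metrics_namespace: gitlab
  component: api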

Contract Tooling

CPT (Component Performance Testing) is the confirmed tool for environment management and test execution. CPT handles:

  • Environment lifecycle - provisioning and teardown of GCP-hosted test environments (Docker container or CNG instance) per MR run
  • Load test execution - running k6 tests against the service under test
  • MR feedback - posting test results as comments on the triggering merge request

CPT will be extended in Milestone 2 to accept performance.yaml as input and dynamically generate k6 scenarios and thresholds from the contract. The schema validation approach (whether it lives inside CPT or a separate repo) is an open question being resolved in Milestone 2. See the design document for full rationale.

Schema Definition

A performance.yaml contract is composed of the following sections:

Contract Definition (required)

This section provides tracking data about the schema and makes it possible to verify that the contract targets the current schema version.

version: "1.0"
service:
  name: "example-service"
  description: "Example modular feature performance contract"
  • version - Schema version for compatibility tracking
  • service - Service identification (name, description)

Endpoints (required)

Each entry represents a category of endpoints with similar performance characteristics. Routes within a category share latency targets.

endpoints:
  fast_reads:
    description: >
      Single item lookup by ID. Simulates one indexed DB read.
      This is the most common call pattern in the Artifact Registry.
    routes:
      - "GET /api/v1/items/{id}"
    metrics:
      latency_p95_ms: 100
      latency_p99_ms: 250
      error_rate_threshold: 0.001

Each endpoint category has the following elements:

  • description - Human-readable description of the endpoint category
  • routes - The API routes to be tested
  • metrics - Performance targets measured against these routes

Performance Tiers

Performance tiers provide starting-point defaults for common service archetypes. Select the tier that best matches your endpoint, then tune based on actual baseline data:

  • Tier 1: Fast Reads - Simple reads with no database queries or minimal indexed lookups (health checks, status endpoints)
    metrics:
      latency_p95_ms: 100
      latency_p99_ms: 250
      error_rate_threshold: 0.001
  • Tier 2: Standard Reads - Read operations involving database queries, joins, or moderate computation
    metrics:
      latency_p95_ms: 500
      latency_p99_ms: 1000
      error_rate_threshold: 0.005
  • Tier 3: Write Operations - Create, update, and delete endpoints, multi-step transactions, and operations that fan out to multiple services
    metrics:
      latency_p95_ms: 1500
      latency_p99_ms: 3000
      error_rate_threshold: 0.01
  • Tier 4: Git Operations - Git protocol operations (clone, pull, push, ls-remote)
    metrics:
      latency_p95_ms: 5000
      latency_p99_ms: 10000
      error_rate_threshold: 0.001
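
As a worked sketch of how tiers map into the endpoints section, the category below groups two hypothetical routes that share Tier 2 starting values; the category name and routes are illustrative, and the targets would be tuned once baseline data exists:

endpoints:
  standard_reads:
    description: >
      List and detail reads backed by indexed database queries.
    routes:
      - "GET /api/v1/items"
      - "GET /api/v1/items/{id}/details"
    metrics:
      # Tier 2: Standard Reads starting values
      latency_p95_ms: 500
      latency_p99_ms: 1000
      error_rate_threshold: 0.005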

Resources (optional)

This section defines resource constraints for the test environment. Currently informational - enforcement is planned for a future iteration.

resources:
  memory_limit_mb: 256
  cpu_limit_cores: 0.5
  # Maximum concurrent connections from the service's outbound pool.
  # Maps to bench.textproto Outbound.Backend.PoolConfig.max_open.
  connection_pool_max: 10

Additional service metrics (optional)

Define metrics for each subsystem your service depends on in its own section. These sections are currently informational - enforcement is planned for a future iteration.

If your service depends on a database, for example, you can define its constraints like this:

database:
  # Maximum query latency at the 95th percentile (milliseconds).
  query_latency_p95_ms: 30
  # Hard limit on DB queries per inbound request. N+1 queries violate this.
  max_queries_per_request: 5
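
Other dependencies follow the same pattern. As an illustration only - the subsystem name and fields below are hypothetical and not part of a defined schema - a cache dependency might be expressed as:

redis:
  # Hypothetical field: maximum cache operation latency at the 95th percentile (milliseconds).
  operation_latency_p95_ms: 5
  # Hypothetical field: minimum acceptable cache hit ratio on read paths.
  min_hit_ratio: 0.9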

SLI mapping (optional)

Maps each contract endpoint category to the Prometheus metric names and label values emitted by the service via LabKit v2. This allows tooling (dashboards, alerting, validation scripts) to locate the right time-series without inspecting service source code.

sli_mapping:
  metrics_namespace: gitlab
  component: api

  fast_reads:
    requests_total_metric: gitlab_http_requests_total
    duration_metric: gitlab_http_request_duration_seconds
    endpoint_id_label: "GET /api/v1/items/{id}"
    feature_category_label: artifact_registry

LabKit v2 and SLI Mapping

LabKit v2 is GitLab’s standard platform library for Go services. It provides the metric names, label conventions, and SLO-aligned histogram buckets that the sli_mapping section references directly. Any service already using LabKit can adopt a performance contract with zero instrumentation changes - the metrics it emits are automatically available in the observability stack for AI-assisted post-run analysis.

Adoption Workflow

Quick Start (Planned)

  1. Scaffold a contract - Use the scaffolding CLI to generate a starter performance.yaml
  2. Customize targets - Adjust latency, error rate, and resource targets based on your service characteristics
  3. Add CI integration - Include the performance contract CI template in your .gitlab-ci.yml
  4. Validate and iterate - Push changes and review contract validation results in your MR

CI Integration (Planned)

# .gitlab-ci.yml
include:
  - project: 'gitlab-org/quality/performance-contracts'
    file: '/templates/performance-contract.yml'

Handling Metrics Not Yet in LabKit

For performance aspects that LabKit does not yet cover (see the example after this list):

  • Document the gap - Note the missing metric in your contract with a comment
  • Use placeholder values - Define targets based on expected behavior
  • Track instrumentation work - Create issues to add missing metrics to LabKit
  • Validate post-deployment - Use alternative validation methods until instrumentation is available
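
For example, a gap can be recorded directly in the contract as a comment next to a commented-out placeholder target; the field name and value below are hypothetical and purely illustrative:

resources:
  memory_limit_mb: 256
  # GAP: LabKit does not yet emit per-request allocation metrics.
  # Placeholder target based on expected behavior; tracked in a follow-up instrumentation issue.
  # allocation_per_request_kb: 64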

AI Integration

Performance contracts integrate with GitLab Duo through a skill published to the GitLab Skills repo. This gives AI coding assistants:

  • Concrete, machine-readable performance rules
  • Awareness of latency budgets and resource constraints
  • Guidance on when to apply performance tests
  • Links to functional contract testing for a complete structural + performance picture

Feedback and Questions

This is an active development effort. For questions or feedback:

  • Comment on &387
  • Reach out to the Performance Enablement team
  • Join the discussion in the #g_performance-enablement Slack channel