Performance Testing for Modular Features

A contract-based approach to performance testing for modular feature services, enabling shift-left regression detection and AI-assisted analysis

Overview

Contract testing is the practice of defining the external surface of a service and writing a machine-readable “contract” that specifies how the service will behave. This approach provides several benefits:

  • Testable agreements - Automated tests verify the contract hasn’t been broken
  • Clear interfaces - External services can design integrations with confidence
  • Breaking change detection - Automated validation catches incompatible changes

Performance contracts extend this concept to the performance characteristics of a modular feature. By encoding performance targets into a validated YAML file (performance.yaml), teams gain:

  • Earlier regression detection - Every MR is validated against the contract
  • AI-aware performance governance - AI coding assistants have concrete, machine-readable performance rules
  • Standardized adoption - Reusable contract schema and validation toolkit for any modular feature

Scope

Performance contracts are scoped to modular feature services running in CI-accessible environments. For a full list of what is explicitly out of scope, see the Performance Testing for Modular Features design document.

Key boundaries for the current iteration:

  • Not a production SLO tool - Contracts inform SLOs but do not replace them
  • Not a local testing tool - Contract tests run in CI against a transitory environment, not on a developer’s laptop (planned for a future iteration)
  • Not a combination testing tool - Each service contract is validated independently; cross-service integration performance is out of scope

Architecture

The performance contract system works as follows:

flowchart LR
  subgraph APP[Service under test]
    CONTRACT[performance.yaml]
  end
  subgraph RUNNER[CPT - Component Performance Testing]
    VALIDATION[Schema Validation]
    ENVMAN[Environment Management]
    LOAD[Load Testing / k6]
  end
  subgraph ENV[Test Environment]
    SERVICE[Running Service]
    OBS[Observability Stack]
  end
  subgraph REPORTING[Reporting]
    RESULT[Test Results]
    AI[AI Agent Analysis]
  end
  CONTRACT --> VALIDATION
  VALIDATION --> LOAD
  ENVMAN -- standup/teardown --> ENV
  SERVICE -- metrics --> OBS
  LOAD -- HTTP requests --> SERVICE
  LOAD --> RESULT
  OBS --> AI
  RESULT --> AI
  CONTRACT --> AI
  AI --> DEV[Developer Feedback]

The performance.yaml Contract

The performance.yaml file is the single entry point for the system - it drives contract tooling, load test execution, and AI analysis. It defines:

  • Endpoint categories with latency percentile targets
  • Error rate thresholds
  • Resource budgets (memory, CPU, connection pool)
  • Database constraints (query latency, max queries per request)
  • SLI mappings to Prometheus metrics
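
Putting these sections together, a minimal contract has roughly the following shape. The values are illustrative placeholders drawn from the examples on this page, not recommended defaults; each section is described under Schema Definition below.

version: "1.0"
service:
  name: "example-service"
  description: "Example modular feature performance contract"
endpoints:
  fast_reads:
    description: "Single item lookup by ID"
    routes:
      - "GET /api/v1/items/{id}"
    metrics:
      latency_p95_ms: 100
      latency_p99_ms: 250
      error_rate_threshold: 0.001
resources:
  memory_limit_mb: 256
  cpu_limit_cores: 0.5
database:
  query_latency_p95_ms: 30
  max_queries_per_request: 5
sli_mapping:
  metrics_namespace: gitlab
  component: api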

Contract Tooling

CPT (Component Performance Testing) is the confirmed tool for environment management and test execution. CPT handles:

  • Environment lifecycle - provisioning and teardown of GCP-hosted test environments (Docker container or CNG instance) per MR run
  • Load test execution - running k6 tests against the service under test
  • MR feedback - posting test results as comments on the triggering merge request

CPT will be extended in Milestone 2 to accept performance.yaml as input and dynamically generate k6 scenarios and thresholds from the contract. The schema validation approach (whether it lives inside CPT or a separate repo) is an open question being resolved in Milestone 2. See the design document for full rationale.

Schema Definition

A performance.yaml contract is composed of the following sections:

Contract Definition (required)

This section provides tracking data about the schema and makes it possible to verify that the contract targets the current schema version.

version: "1.0"
service:
  name: "example-service"
  description: "Example modular feature performance contract"
  • version - Schema version for compatibility tracking
  • service - Service identification (name, description)

Endpoints (required)

Each entry represents a category of endpoints with similar performance characteristics. Routes within a category share latency targets.

endpoints:
  fast_reads:
    description: >
      Single item lookup by ID. Simulates one indexed DB read.
      This is the most common call pattern in the Artifact Registry.
    routes:
      - "GET /api/v1/items/{id}"
    metrics:
      latency_p95_ms: 100
      latency_p99_ms: 250
      error_rate_threshold: 0.001

Each endpoint category has the following elements:

  • description - Human-readable description of the endpoint category
  • routes - The API routes to be tested
  • metrics - Performance targets measured against these routes

Performance Tiers

Performance tiers provide starting-point defaults for common service archetypes. Select the tier that best matches your endpoint, then tune based on actual baseline data:

  • Tier 1: Fast Reads - Simple reads with no database queries or minimal indexed lookups (health checks, status endpoints)
    metrics:
      latency_p95_ms: 100
      latency_p99_ms: 250
      error_rate_threshold: 0.001
  • Tier 2: Standard Reads - Read operations involving database queries, joins, or moderate computation
    metrics:
      latency_p95_ms: 500
      latency_p99_ms: 1000
      error_rate_threshold: 0.005
  • Tier 3: Write Operations - Create, update, and delete endpoints, multi-step transactions, and operations that fan out to multiple services
    metrics:
      latency_p95_ms: 1500
      latency_p99_ms: 3000
      error_rate_threshold: 0.01
  • Tier 4: Git Operations - Git protocol operations (clone, pull, push, ls-remote)
    metrics:
      latency_p95_ms: 5000
      latency_p99_ms: 10000
      error_rate_threshold: 0.001
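
As a worked sketch of how tiers map into the endpoints section, the category below groups two hypothetical routes that share Tier 2 starting values; the category name and routes are illustrative, and the targets would be tuned once baseline data exists:

endpoints:
  standard_reads:
    description: >
      List and detail reads backed by indexed database queries.
    routes:
      - "GET /api/v1/items"
      - "GET /api/v1/items/{id}/details"
    metrics:
      # Tier 2: Standard Reads starting values
      latency_p95_ms: 500
      latency_p99_ms: 1000
      error_rate_threshold: 0.005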

Resources (optional)

This section defines resource constraints for the test environment. Currently informational - enforcement is planned for a future iteration.

resources:
  memory_limit_mb: 256
  cpu_limit_cores: 0.5
  # Maximum concurrent connections from the service's outbound pool.
  # Maps to bench.textproto Outbound.Backend.PoolConfig.max_open.
  connection_pool_max: 10

Additional service metrics (optional)

Define metrics for each subsystem your service depends on in its own section. These sections are currently informational - enforcement is planned for a future iteration.

If your service depends on a database, for example, you can define its constraints like this:

database:
  # Maximum query latency at the 95th percentile (milliseconds).
  query_latency_p95_ms: 30
  # Hard limit on DB queries per inbound request. N+1 queries violate this.
  max_queries_per_request: 5
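
Other dependencies follow the same pattern. As an illustration only - the subsystem name and fields below are hypothetical and not part of a defined schema - a cache dependency might be expressed as:

redis:
  # Hypothetical field: maximum cache operation latency at the 95th percentile (milliseconds).
  operation_latency_p95_ms: 5
  # Hypothetical field: minimum acceptable cache hit ratio on read paths.
  min_hit_ratio: 0.9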

SLI mapping (optional)

Maps each contract endpoint category to the Prometheus metric names and label values emitted by the service via LabKit v2. This allows tooling (dashboards, alerting, validation scripts) to locate the right time-series without inspecting service source code.

sli_mapping:
  metrics_namespace: gitlab
  component: api

  fast_reads:
    requests_total_metric: gitlab_http_requests_total
    duration_metric: gitlab_http_request_duration_seconds
    endpoint_id_label: "GET /api/v1/items/{id}"
    feature_category_label: artifact_registry

LabKit v2 and SLI Mapping

LabKit v2 is GitLab’s standard platform library for Go services. It provides the metric names, label conventions, and SLO-aligned histogram buckets that the sli_mapping section references directly. Any service already using LabKit can adopt a performance contract with zero instrumentation changes - the metrics it emits are automatically available in the observability stack for AI-assisted post-run analysis.

Adoption Workflow

Quick Start (Planned)

  1. Scaffold a contract - Use the scaffolding CLI to generate a starter performance.yaml
  2. Customize targets - Adjust latency, error rate, and resource targets based on your service characteristics
  3. Add CI integration - Include the performance contract CI template in your .gitlab-ci.yml
  4. Validate and iterate - Push changes and review contract validation results in your MR

CI Integration (Planned)

# .gitlab-ci.yml
include:
  - project: 'gitlab-org/quality/performance-contracts'
    file: '/templates/performance-contract.yml'

Handling Metrics Not Yet in LabKit

For performance aspects that LabKit does not yet cover (see the example after this list):

  • Document the gap - Note the missing metric in your contract with a comment
  • Use placeholder values - Define targets based on expected behavior
  • Track instrumentation work - Create issues to add missing metrics to LabKit
  • Validate post-deployment - Use alternative validation methods until instrumentation is available
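
For example, a gap can be recorded directly in the contract as a comment next to a commented-out placeholder target; the field name and value below are hypothetical and purely illustrative:

resources:
  memory_limit_mb: 256
  # GAP: LabKit does not yet emit per-request allocation metrics.
  # Placeholder target based on expected behavior; tracked in a follow-up instrumentation issue.
  # allocation_per_request_kb: 64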

AI Integration

Performance contracts integrate with GitLab Duo through a skill published to the GitLab Skills repo. This gives AI coding assistants:

  • Concrete, machine-readable performance rules
  • Awareness of latency budgets and resource constraints
  • Guidance on when to apply performance tests
  • Links to functional contract testing for a complete structural + performance picture

Feedback and Questions

This is an active development effort. For questions or feedback:

  • Comment on &387
  • Reach out to the Performance Enablement team
  • Join the discussion in the #g_performance-enablement Slack channel