AI Model Selection

This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.
Status: ongoing
Authors: shekharpatnaik
Coach: jordanjanes
DRIs: sean_carroll
Owning Stage: devops ai-powered
Created: 2024-09-11

Background

A growing number of AI models—both open-source and proprietary—are continually emerging, each with distinct characteristics such as latency, maximum token length, training approach, and output quality. Today, GitLab Duo relies on a fixed set of models for features like Code Suggestions, Chat, Duo Code Review, and Vulnerability Analysis. This rigidity limits administrators, operations teams, and end users in selecting the model that best suits their needs.

Additionally, whenever we introduce a new sub-processor or model, customers must undertake a thorough review and threat assessment process that can span several months. This review is crucial for compliance and governance but also slows our ability to adopt new and potentially better-performing models.

In light of these challenges, we need an end-to-end solution that spans multiple feature groups to allow flexible model selection. This blueprint outlines the technical architecture required to integrate model selection options throughout GitLab Duo, providing administrators, operators, and end users with fine-grained control over which models can be used while also streamlining the governance and compliance workflow. For further background, please see Issue #513430.

Current State

Model Switching with Feature Flags:

On gitlab.com and Self-Managed instances, model selection is currently controlled by feature flags. These flags are typically toggled on or off by an administrator or operations user. Because these flags apply at the instance level rather than at a user or group level, individual users and enterprises do not have the freedom to switch between different models based on their specific tasks.

Self-Hosted Model Configuration:

For Self-Managed installations, administrators can configure self-hosted models at an instance level. While this allows some flexibility in choosing or updating models, it still does not permit end users (developers) to select models on a per-feature basis in the UI or IDE. Furthermore, .com customers currently have no governance mechanism to specify which models their organization members can use.

Phases

We can deliver this work in iterations so that we deliver value to the customer incrementally.

Iteration 1: Top-level Namespace Configuration: In this phase, customers will be able to select a model for each feature at the root namespace level. This will allow .com customers to decide which models they want their namespace to use. The model used for a given action will be the one set by the top-level namespace, depending on the context the user is in (project or namespace). Related epic.

Iteration 2: Consolidating feature release. Related epic.

Future Iterations and Considerations: Sub-level Namespace Cascading Configuration: Child namespaces will be able to assign a specific model for each feature, as in the top-level iteration. In addition, they will be able to select a subset of available models for their downstream namespaces to choose from. The model used for a given action will be the one set by the closest upstream namespace (including the current namespace) with a feature setting configured, depending on the context the user is in. We will wait for further demand before planning this iteration. Related Issue.

Organization-Level Configuration: In this phase we enable managed model configuration for .com, Self-Managed, and Dedicated at the organization level. Supported models are stored in the AI Gateway and will then be retrieved by gitlab.com, Self-Managed instances, and Dedicated instances. This is currently not planned, as Organizations are not GA.

Future iterations will cover the ability to let users decide which model to use for a specific feature in both the IDE and the GitLab UI. Users would be able to choose from a subset selected at the namespace level. Tracking of these features can be found in this Epic.

We will build out the capabilities for model switching in Duo Code Review, Vulnerability Analysis, and other features. In addition to this, we also want to allow self-hosted customers to bring their own models.

Note: This design does not handle the scenario where we might want to pick from different recommended_models based on the user query. For example, we might want to pick Google Gemini when the context length becomes very large, or Claude Sonnet when it's a coding-related question. Dynamic model switching would have to be covered in a separate blueprint.

New Design (End State)

This represents the end state and is not broken down by phase. Every phase would deliver a section of this architecture.

Data Model

The data model is designed to provide a structured way for organizations to manage AI feature settings at different levels. It allows namespace administrators to define default AI models for various features (e.g., Code Suggestions, Chat, etc.), while also enabling individual users to override these defaults with their preferred AI models.

The objective is to balance administrative control with user flexibility, ensuring that AI-powered features align with both organizational policies and personal preferences.

erDiagram
    NAMESPACES {
        bigint id PK
        bigint parent_id FK
        bigint owner_id FK
        varchar name
        varchar path
        bigint organization_id FK
    }

    NAMESPACE_FEATURE_SETTINGS {
        bigint id PK
        bigint namespace_id FK
        timestamp created_at
        timestamp updated_at
        smallint feature
        varchar offered_model_ref
        varchar offered_model_name
    }

    NAMESPACES ||--o{ NAMESPACE_FEATURE_SETTINGS : "has many"

    USERS {
        bigint id PK
        varchar username UK
        varchar email UK
    }

    USER_AI_MODEL_PREFERENCES["USER_AI_MODEL_PREFERENCES (Future Iteration)"] {
        bigint id PK
        bigint user_id FK
        varchar feature
        varchar favourite_model
    }

    USERS ||--o{ USER_AI_MODEL_PREFERENCES : "has many"

Entities and Relationships

1. NAMESPACES
  • Represents groups/subgroups in GitLab.
  • Existing
2. NAMESPACE_FEATURE_SETTINGS
  • A set of features enabled for the namespace.
  • This table consists of a feature category (such as Duo Chat) and a feature (such as Code Completion).
  • This also stores the namespace-level recommended model.
  • New
  • Why?
    • We need to enable group admins to optionally set the models they want to use for each of the features.
3. NAMESPACE_AI_FEATURE_MODELS
  • Stores a list of allowed models for each feature. If this list is empty, we will show all the GitLab-supported models to the user.
  • Future
4. USERS
  • Represents individual users who can interact with AI features.
  • Existing
5. USER_AI_MODEL_PREFERENCES
  • Allows users to select their preferred AI model per feature, overriding the namespace default.
  • Future
  • Why?
    • Empowers users with flexibility while maintaining organizational defaults.
    • Allows for personalization of AI-assisted workflows.

Note: An organization_id field will be added to AI_SELF_HOSTED_MODELS, AI_FEATURE_SETTINGS, and USER_AI_SETTINGS so that they work with Cells.

Key Design Considerations

1. Default & Override Mechanism
  • Namespace Admins define default AI models for each feature.
  • Users can override these defaults for personal preferences (in the future).
  • Fallback Logic (see the sketch after this list):
    1. Check USER_AI_SETTINGS for a user preference.
    2. If there is no user preference, check the namespace in NAMESPACE_FEATURE_SETTINGS. Currently we only check the top-level namespace default; in a future iteration we might cascade the lookup from the closest parent to the root ancestor.
    3. If there is no namespace default, revert to a system-wide default.
  • Why?
    • Ensures a structured decision-making process.
    • Gives users autonomy without breaking organizational policies.
2. Scalability & Performance
  • We need to be able to look up the list of models efficiently across the namespace hierarchy.
3. Incident management
  • When the default model is selected or when model selection is disabled, an incident management mechanism is being put in place. The mechanism that will be implemented in AI Gateway will re-route the call to another model when the default model can’t respond. Related issue.

  • When a model has been selected for a feature by the user, we can’t perform any incident management, because the user’s choice needs to be respected. The user needs to be aware of this behavior.

4. Flexibility for Deprecation
  • We need to consider how administrators can deprecate models, what the process would be to cascade that down to other levels, and the database performance implications of doing so.
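
To make the fallback order concrete, here is a minimal, self-contained sketch in Python; the class shapes, SYSTEM_DEFAULTS, and resolve_model are illustrative assumptions, not actual GitLab code:

from dataclasses import dataclass, field
from typing import Optional

SYSTEM_DEFAULTS = {"duo_chat": "claude_sonnet"}  # assumed system-wide defaults

@dataclass
class Namespace:
    feature_settings: dict = field(default_factory=dict)  # feature -> model ref
    parent: Optional["Namespace"] = None

@dataclass
class User:
    model_preferences: dict = field(default_factory=dict)  # feature -> model ref

def resolve_model(user: User, namespace: Namespace, feature: str) -> str:
    # 1. A user preference wins (future iteration).
    if feature in user.model_preferences:
        return user.model_preferences[feature]
    # 2. Walk the namespace hierarchy from the current namespace towards
    #    the root (today only the top-level namespace is checked).
    node = namespace
    while node is not None:
        if feature in node.feature_settings:
            return node.feature_settings[feature]
        node = node.parent
    # 3. Fall back to the system-wide default.
    return SYSTEM_DEFAULTS[feature]

# Example: a child namespace without a setting inherits from its parent.
root = Namespace(feature_settings={"duo_chat": "claude_sonnet"})
child = Namespace(parent=root)
assert resolve_model(User(), child, "duo_chat") == "claude_sonnet"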

Changes to GitLab rails

  1. The list of instance-level models is fetched from AI Gateway when a user is interacting with the model selection settings page and cached in the instance for an hour.

  2. We need a way to sync models to all cells in .com and all self-managed instances. This could be done using a separate Sidekiq job that syncs GitLab-managed models with each cell and each self-managed instance (a sketch follows the list below).

sequenceDiagram
    Self Managed->>Cloud Connector: Scheduled Sidekiq job calls API
    Cloud Connector->>AI Gateway: Fetch Model List
    AI Gateway->>AI Gateway: Fetch Model List
    AI Gateway-->>Cloud Connector: Return list of models
    Cloud Connector-->>Self Managed: Return list of models
    Self Managed->>Self Managed: Insert model records and set defaults
    Self Managed->>Administrator: Send email about new model availability
  3. We build the Rails models, controllers, and views as described in the Data Model section for the first iteration.

  4. Every feature, when making an API call to AI Gateway, needs to be able to pick a model from the list of models allowed for a specific namespace, group, etc. Every feature also needs to be able to look up the default model for the feature.

  5. We build the Group Settings page to allow selecting namespace defaults from the set offered at the top-level namespace. Selecting a list of models for a child namespace is left for future consideration.

  6. When a model is deprecated or deactivated, we need a way to cascade the deprecation down to the namespace level. We will need to build a Sidekiq job that can do that.

  7. In Rails, when the user picks a recommended_model / default_model at a namespace level, they would be able to choose between the GitLab-managed models (from the AI Gateway config) and the models configured on the self-hosted models screen.
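
A rough sketch of the sync job from the sequence diagram above, shown in Python for brevity; the real implementation would be a Sidekiq worker in Rails, and the endpoint URL, payload shape, and helper names are assumptions:

import json
from urllib.request import urlopen

MODEL_LIST_URL = "https://cloud.gitlab.example/ai/models"  # hypothetical endpoint

def fetch_managed_models() -> list[dict]:
    # Calls through Cloud Connector to the AI Gateway model list API.
    with urlopen(MODEL_LIST_URL, timeout=30) as resp:
        return json.load(resp)["models"]

def sync_managed_models(store: dict[str, str]) -> list[str]:
    """Upsert fetched models into `store` and return newly added refs."""
    new_refs = []
    for model in fetch_managed_models():
        if model["ref"] not in store:
            new_refs.append(model["ref"])
        store[model["ref"]] = model["name"]
    # A real job would also set defaults and email administrators
    # about newly available models.
    return new_refs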

Future changes when we allow users to pick a model:

  1. Every feature (chat, code suggestions, code review) needs to be able to display a list of available models in the UI. When the customer selects a model, it should be set as their default model.

For some features such as Duo Code Review where the customer is not actively interacting with the UI, we may only allow the selection of a single model.

  2. We will need to build a new API to fetch the list of models that a user is allowed to use:
query {
  aiFeatureSettings {
    nodes {
      feature,
      defaultModel,
      validModels {
        nodes {
          name
        }
      }
    }
  }
}

AI Gateway Changes

The AI Gateway will have to support a new API that will allow Rails to fetch the allowed list of models.
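
As a sketch of what this could look like, assuming a FastAPI-style endpoint (the route path, schema, and values below are illustrative, not the AI Gateway's actual API):

from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter()

class ModelInfo(BaseModel):
    ref: str             # stable identifier, e.g. "claude_sonnet" (assumed)
    name: str            # display name shown in settings UIs
    features: list[str]  # features the model may serve

@router.get("/v1/models")  # hypothetical route
def list_models() -> list[ModelInfo]:
    # In practice this list would be derived from the configuration files
    # mentioned below; the values here are placeholders.
    return [
        ModelInfo(ref="claude_sonnet", name="Claude Sonnet",
                  features=["duo_chat", "code_suggestions"]),
    ]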

The configuration is done with a pair of files; both files need to be updated when we need to release or deprecate a model.

The AI Gateway already supports model passing for various APIs such as /v2/chat and /v4/suggestions.

The AI Gateway also already supports prompt versioning for different providers so we can tune the prompt in certain cases.

We will need to test our prompt changes across different model families and versions to ensure that we are backwards compatible with what customers are running. We should collect metrics across different features to figure out the most popular models so that we can focus our model tuning efforts to those.

Important note: There is currently no way with model selection to roll out a model incrementally in the AI Gateway. Once a model is configured, it will be sent to customers. Keep that in mind when adding a new model and commit changes to these files at the end of your development cycle.

Changes to Duo Workflow

To allow model switching for Duo Workflow, we will need to create a new field showing the list of models that are available for the GitLab project. Since a GitLab project is required for starting a Duo Workflow, we can use the project to determine the group / subgroup information and fetch the list of allowed models. When the user selects a model, that model information will need to be added to the parameters passed to the Duo Workflow Executor, which in turn will pass the model name to the Duo Workflow Service. The protobuf contract between the Duo Workflow Service and Executor will need a new field called model.

message StartWorkflowRequest {
    string clientVersion = 1;
    string workflowID = 2;
    string workflowDefinition = 3;
    string goal = 4;
    string workflowMetadata = 5;
    repeated string clientCapabilities = 6;
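    // New field introduced by this design: the selected model name.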
    string model = 7;
}

In addition to this, we will need to update the model factory in Duo Workflow to support different models.
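
A minimal sketch of such a factory, in Python since the Duo Workflow Service is Python-based; the ChatModel interface, registry, and model names are assumptions, not the actual Duo Workflow code:

from typing import Callable, Dict

class ChatModel:
    """Common interface the workflow graph programs against."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

_REGISTRY: Dict[str, Callable[[], ChatModel]] = {}

def register(name: str):
    # Decorator that adds a model constructor to the registry.
    def wrap(factory: Callable[[], ChatModel]) -> Callable[[], ChatModel]:
        _REGISTRY[name] = factory
        return factory
    return wrap

@register("claude_sonnet")
class ClaudeSonnet(ChatModel):
    def complete(self, prompt: str) -> str:
        return "stub"  # a real implementation would call the provider SDK

def create_model(name: str, default: str = "claude_sonnet") -> ChatModel:
    # Fall back to the default when the requested model is unknown.
    return (_REGISTRY.get(name) or _REGISTRY[default])()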

IDE Changes

The IDE must call GitLab to retrieve the list of allowed models for each feature and pass the selected model in the request to AI Gateway or Rails. Related issue.

1. Model List Update

  • The IDE periodically fetches the updated list of available models or listens for a specific GitLab: Update Model List command.
  • Changes could also be triggered when the user switches GitLab accounts or when an admin updates model availability.

2. User Preferences

  • A settings screen in the IDE lets users select their default model per feature.
  • The IDE continues to pass the chosen model in chat and code suggestion requests.

Important note

Currently, when a user is assigned a seat in at least one project with a model selected for completion, the IDE disables the direct connection to AI Gateway for code completion calls and routes them through the GitLab monolith, which ultimately selects the model to be used according to the user’s preferences. Customers should be made aware of this through documentation.

sequenceDiagram
    alt Model Switching
        User->>IDE: Opens VSCode
        IDE->>Rails: Fetch List of Models
        Rails->>Rails: Lookup namespace of project for model list
        loop Every namespace with parent
            Rails->>Rails: Lookup parent namespace of project for model list
        end
        Rails->>Rails: Lookup instance for model list
        Rails->>IDE: Return List of models per feature and default
        IDE->>IDE: Set default models if not set
        IDE->>User: Show list of settings options in IDE
        User->>IDE: Pick model to be used for each feature
    end
    alt Code Suggestions
        User->>IDE: User performs code suggestions request
        IDE->>AIGateway: Send Model in suggestions Request
        AIGateway->>AIGateway: Check whether model is allowed from JWT claims
        AIGateway->>LLM: Get suggestions from model
        LLM-->>AIGateway: Suggestions response
        AIGateway->>AIGateway: Post processing
        AIGateway-->>IDE: Send suggestions response
        IDE->>User: Show suggestions
    end
    alt Chat
        User->>IDE: User interacts with Chat widget
        IDE->>Rails: Start Chat session
        Rails-->>IDE: Chat Session ID
        IDE->>Rails: Send user message
        Rails->>Rails: Use model in request (check model is allowed)
        Rails->>AIGateway: Make chat request with model
        AIGateway->>LLM: Send chat request
        LLM-->>AIGateway: Chat response
        AIGateway-->>Rails: Chat response
        Rails-->>IDE: Send chat response
        IDE->>User: Show response
    end

Note: This is a simplified diagram and does not contain all details, such as authentication/authorization of the requests.
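
The "check whether model is allowed" step in the Code Suggestions flow above could look roughly like the following sketch; the claim name allowed_models and the error type are assumptions:

def ensure_model_allowed(requested_model: str, jwt_claims: dict) -> None:
    allowed = jwt_claims.get("allowed_models")  # hypothetical claim name
    # An absent claim means no restriction was configured for this user.
    if allowed is not None and requested_model not in allowed:
        raise PermissionError(f"Model not permitted: {requested_model}")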

Open Questions / Risks

Conflict Resolution

What happens if multiple namespace admins set conflicting default models at different levels of the hierarchy? Which default should we choose?

Decision: Configurations on a child namespace take precedence over those on the parent namespace.

Future Model Feature Parity

Some models have different capabilities (e.g., Google Gemini 2.0 supports internet search and o1 supports thinking). How do we handle enabling such sub-features?

Deprecation & Enforcement

Should we force a hard switch if an AI model is marked as deprecated, or provide a grace period?