AI Model Migration Playbook

How to migrate an AI model at GitLab

Introduction

LLMs change constantly, and GitLab needs to continually update our AI features to support newer models. This guide describes a general approach for updating our AI features, with the goal of reducing the time it takes to migrate GitLab tools.

Purpose

Provide a guide for migrating AI models within GitLab.

Scope

Applicable to all AI model-related teams at GitLab. We currently only support using Anthropic and Google Vertex models.

Migration Tasks

Migration Tasks for updating Anthropic Model:

  1. Optional - Investigate whether the new model is supported within our current AI-Gateway API specification. This step can usually be skipped. However, to support a newer model, we may sometimes need to accommodate a new API format. If that’s the case, please follow these steps.

    • Make sure the newer model is supported by the existing Messages API. For example, the migration from Claude 2.1 to Claude 3.0 required a change from the Text Completions API to the Messages API.
  2. Add the new model to our available models list.

  3. Change the default model in our AI-Gateway client. Please place the change behind a feature flag, since we may need to quickly roll it back.

  4. Each tool we have in ee/lib/gitlab/llm/chain/tools/* will either pass a model or default to the existing zero_shot model. If a tool passes a model through the prompt options, you should update the options key to the newer model.

    • Note: There’s no exact science to which model to select for each tool. Please see the testing section for how to evaluate each tool.
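Steps 3 and 4 above can be sketched as follows. This is an illustrative, hedged example only: the model names, method names, and the boolean flag argument are hypothetical, not the actual GitLab implementation (in GitLab code the flag check would go through `Feature.enabled?` and the real AI-Gateway client).

```ruby
# Illustrative sketch of steps 3-4: switch the default model behind a
# feature flag, and let each tool override it via prompt options.
# All names here (models, methods) are hypothetical.
OLD_DEFAULT_MODEL = 'claude-2.1'
NEW_DEFAULT_MODEL = 'claude-3-sonnet-20240229'

# Step 3: gate the new default behind a feature flag so the change can
# be rolled back quickly. In GitLab this would be a Feature.enabled? check.
def default_model(flag_enabled:)
  flag_enabled ? NEW_DEFAULT_MODEL : OLD_DEFAULT_MODEL
end

# Step 4: a tool may pass its own model through the prompt options;
# otherwise it falls back to the (possibly flag-gated) default.
def model_for_tool(options, flag_enabled:)
  options.fetch(:model, default_model(flag_enabled: flag_enabled))
end
```

With this shape, rolling back the migration is a single flag toggle rather than a code revert, and per-tool model choices stay isolated in each tool's prompt options.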

Migration tasks for updating Vertex models:

*Work in Progress*

Scope the Work

AI Features to Migrate

  • Duo Chat Tools:
    • ci_editor_assistant/prompts/anthropic.rb - CI Editor
    • gitlab_documentation/executor.rb - GitLab Documentation
    • epic_reader/prompts/anthropic.rb - Epic Reader
    • issue_reader/prompts/anthropic.rb - Issue Reader
    • merge_request_reader/prompts/anthropic.rb - Merge Request Reader
  • Chat Slash Commands:
    • refactor_code/prompts/anthropic.rb - Refactor
    • write_tests/prompts/anthropic.rb - Write Tests
    • explain_code/prompts/anthropic.rb - Explain Code
    • explain_vulnerability/executor.rb - Explain Vulnerability
  • Experimental Tools:
    • Summarize Comments Chat
    • Fill MR Description

API Changes

  • Support the new Messages API.
  • Update API endpoints as needed.
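To make the API change concrete: the Messages API expects a list of role-tagged messages, whereas the legacy Text Completions API used a single prompt string with turn markers. Below is a minimal sketch of the two payload shapes; the field names and endpoint paths follow Anthropic's public API, while the helper method names are our own illustration.

```ruby
require 'json'

# Legacy Text Completions payload (POST /v1/complete): one prompt string
# carrying Human/Assistant turn markers.
def text_completions_payload(model, user_text)
  {
    model: model,
    prompt: "\n\nHuman: #{user_text}\n\nAssistant:",
    max_tokens_to_sample: 1024
  }.to_json
end

# Messages payload (POST /v1/messages): structured, role-tagged turns,
# required by Claude 3 and later models.
def messages_payload(model, user_text)
  {
    model: model,
    max_tokens: 1024,
    messages: [{ role: 'user', content: user_text }]
  }.to_json
end
```

Migrating a tool therefore means restructuring the prompt into discrete messages (and moving any system instructions into the dedicated top-level `system` field), not just swapping the model name.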

Detailed Tasks

Chat Tools

  • ci_editor_assistant/prompts/anthropic.rb - CI Editor
  • gitlab_documentation/executor.rb - GitLab Documentation
  • epic_reader/prompts/anthropic.rb - Epic Reader
  • issue_reader/prompts/anthropic.rb - Issue Reader
  • NOT PUT INTO PRODUCTION YET

Chat Slash Commands

  • refactor_code/prompts/anthropic.rb - Refactor
  • write_tests/prompts/anthropic.rb - Write tests
  • explain_code/prompts/anthropic.rb - Explain Code
  • explain_vulnerability/executor.rb - Explain Vulnerability

Experimental Tools

Testing

Model Evaluation

The ai-model-validation team created a library to evaluate the performance of both prompt changes and model changes. The Prompt Library README.MD provides details on how to evaluate the performance of AI features.

Another use case for running chat evaluation is during the feature development cycle: verifying how changes to the code base and prompts affect the quality of chat responses before the code reaches the production environment.

Local Development

A very valuable tool for local development, beyond unit tests, is to use LangChain for tracing. It allows you to trace LLM calls within Duo Chat and verify that each LLM tool is using the correct model.

To prevent regressions, we also have CI jobs to make sure our tools are working correctly. For more details, see the following Duo Chat testing section.

Last modified June 28, 2024: chore: add model migration playbook (7a724610)