Retrieval Augmented Generation (RAG) for GitLab Duo on self-managed

This page contains information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. The development, release, and timing of any products, features, or functionality may be subject to change or delay and remain at the sole discretion of GitLab Inc.
  • Status: proposed
  • Authors: shinya.maeda, mikolaj_wawrzyniak
  • Coach: stanhu
  • DRIs: pwietchner, oregand, tlinz
  • Owning Stage: devops ai-powered
  • Created: 2024-01-25

RAG is an application architecture that provides a large language model with knowledge that is not in its training set, so that the model can use that knowledge to answer user questions. To learn more about RAG, see RAG for GitLab.
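
As a rough illustration of the retrieve-then-generate flow described above, the following minimal sketch ranks a few documents against a question and places the best match in the prompt. It is not how GitLab Duo is implemented: the `Doc` type, the sample documents, the word-overlap scoring, and the prompt wording are all assumptions made for the example, and a real system would typically compare embeddings instead of shared words.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercased word set with surrounding punctuation stripped.
    return {w.strip(".,?!:;").lower() for w in text.split()} - {""}


def retrieve(question: str, docs: list[Doc], top_k: int = 1) -> list[Doc]:
    # Rank documents by the number of words they share with the question.
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d.text)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[Doc]) -> str:
    # Place the retrieved text ahead of the question so the model can use it.
    sources = "\n\n".join(f"[{d.title}]\n{d.text}" for d in context)
    return (
        "Answer using only the documentation below.\n\n"
        f"{sources}\n\nQuestion: {question}"
    )


# Hypothetical sample documents, for illustration only.
DOCS = [
    Doc("Create an issue", "To create an issue, select Plan > Issues, then select New issue."),
    Doc("Merge requests", "To create a merge request, push a branch and select Create merge request."),
]
```

The string returned by `build_prompt(question, retrieve(question, DOCS))` would then be sent to the language model in place of the bare question.
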

Goals of this blueprint

This blueprint aims to drive a decision for a RAG solution for GitLab Duo on self-managed, specifically for shipping GitLab Duo with access to GitLab documentation. We outline three potential solutions, including PoCs for each to demonstrate feasibility for this use case.

Constraints

  • The solution must be viable for self-managed customers to run and maintain
  • The solution must be shippable in 1-2 milestones
  • The solution should have low lock-in, since we are still determining our long-term technical solution(s) for RAG at GitLab

Proposals for GitLab Duo Chat RAG for GitLab documentation

The following solutions have been proposed and evaluated for the GitLab Duo Chat for GitLab documentation use case:

  • Vertex AI Search
  • Elasticsearch
  • PostgreSQL with the PGVector extension

You can read more about how each evaluation was conducted on the page for each solution.

Chosen solution

Vertex AI Search will be implemented because it has low lock-in and lets us reach customers quickly. We can move to another solution in the future.


Elasticsearch
For more information on Elasticsearch and RAG broadly, see the Elasticsearch article in RAG at GitLab.

Retrieve GitLab Documentation

A proof of concept was done to switch the documentation embeddings from being stored in the embedding database to being stored in Elasticsearch.

Synchronizing embeddings with data source

The same procedure used by PostgreSQL can be followed to keep the embeddings up to date in Elasticsearch.

Retrieval

To get the nearest neighbours, a query can be executed against an index containing the embeddings.
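
As a sketch of what such a nearest-neighbour lookup might look like, the helper below builds a request body for the Elasticsearch 8.x `_search` endpoint using its top-level `knn` option. This is not the query from the proof of concept: the field names `embedding`, `content`, and `url` are hypothetical, and the sketch assumes the embeddings are stored in an indexed `dense_vector` field.

```python
def build_knn_search_body(
    query_vector: list[float], k: int = 5, num_candidates: int = 50
) -> dict:
    """Build a kNN search body for POST /<index>/_search (Elasticsearch 8.x)."""
    if num_candidates < k:
        # Elasticsearch rejects requests where num_candidates is smaller than k.
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "knn": {
            "field": "embedding",  # hypothetical dense_vector field name
            "query_vector": query_vector,  # embedding of the user question
            "k": k,  # number of nearest neighbours to return
            "num_candidates": num_candidates,  # candidates examined per shard
        },
        "_source": ["content", "url"],  # hypothetical stored fields
    }
```

The query vector must be produced by the same embedding model that was used to index the documentation, otherwise the distances are not meaningful.
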
PostgreSQL
Retrieve GitLab Documentation

PGVector is currently used to retrieve relevant documentation for GitLab Duo Chat’s RAG. A separate embedding database, which has the pgvector extension installed, runs alongside the geo and main databases and contains embeddings for GitLab documentation.

Statistics (as of January 2024):

  • Data type: Markdown written in natural language (Unstructured)
  • Data access level: Green (No authorization required)
  • Data source: https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc
  • Data size: 147 MB in vertex_gitlab_docs. 2194 pages.
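
A nearest-neighbour lookup with pgvector can be sketched as follows. The table name `vertex_gitlab_docs` comes from the statistics above, but the column names `id`, `content`, and `embedding` are assumptions, as is the choice of the cosine-distance operator `<=>`. The `%s` placeholders follow the psycopg parameter style; the sketch only builds the statement and parameters and does not connect to a database.

```python
def to_vector_literal(embedding: list[float]) -> str:
    # pgvector accepts vectors written as a bracketed, comma-separated list.
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_nearest_docs_query(embedding: list[float], limit: int = 5) -> tuple[str, tuple]:
    """Return an SQL statement and parameters for a cosine-distance search."""
    sql = (
        "SELECT id, content "  # hypothetical column names
        "FROM vertex_gitlab_docs "
        "ORDER BY embedding <=> %s::vector "  # <=> is pgvector cosine distance
        "LIMIT %s"
    )
    return sql, (to_vector_literal(embedding), limit)
```

With psycopg, the result could be passed directly to a cursor as `cursor.execute(*build_nearest_docs_query(question_embedding))`.
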
Vertex AI Search
Retrieve GitLab Documentation

Statistics (as of January 2024):

  • Data type: Markdown (Unstructured) written in natural language
  • Data access level: Green (No authorization required)
  • Data source: https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc
  • Data size: approx. 56,000,000 bytes. 2194 pages.
  • Service: https://docs.gitlab.com/ (source repo)

Example of user input: “How do I create an issue?”

Example of expected AI-generated response: “To create an issue:\n\nOn the left sidebar, select Search or go to and find your project.\n\nOn the left sidebar, select Plan > Issues, and then, in the upper-right corner, select New issue.
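
To make the request side of this concrete, the sketch below assembles a search call for the Discovery Engine REST API that backs Vertex AI Search. It is an assumption-laden illustration and not the integration itself: the project, location, and data store identifiers are placeholders, the `default_collection` and `default_search` path segments are the defaults commonly shown in Google's examples and may differ in a given setup, and authentication is omitted. No network request is made.

```python
import json


def build_search_request(
    project: str, location: str, data_store: str, query: str, page_size: int = 5
) -> tuple[str, str]:
    """Return the URL and JSON body for a Vertex AI Search query."""
    serving_config = (
        f"projects/{project}/locations/{location}"
        "/collections/default_collection"  # assumed default collection
        f"/dataStores/{data_store}"
        "/servingConfigs/default_search"  # assumed default serving config
    )
    url = f"https://discoveryengine.googleapis.com/v1/{serving_config}:search"
    body = json.dumps({"query": query, "pageSize": page_size})
    return url, body


# Placeholder identifiers, for illustration only.
url, body = build_search_request(
    "example-project", "global", "example-docs-store", "How do I create an issue?"
)
```

The search results returned for the user input would then be added to the prompt so that the model can produce an answer like the expected response above.
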
Last modified August 23, 2024: Ensure frontmatter is consistent (e47101dc)