Create:Code Creation Group engineering overview
Introduction
Welcome to the technical overview of GitLab’s Code Suggestions, a feature designed to enhance the coding experience by integrating advanced AI technologies directly within your development environment. This page serves as your guide to understanding the architecture and interactions behind our innovative Code Suggestions feature, which significantly streamlines coding processes through intelligent completions and generative coding capabilities.
At its core, Code Suggestions operates through a sophisticated workflow involving multiple components such as IDE extensions, the Language Server, GitLab Workhorse, and our AI Gateway, all culminating in providing you with real-time, context-aware coding suggestions. From simple code completions that speed up your typing tasks to complex code generations that craft entire code blocks, our system is designed to support a wide array of coding activities and enhance productivity.
Below, we detail each component’s role in this ecosystem, describe the flow of data through our system, and explain how different types of coding interactions are handled to provide both quick suggestions and detailed code generation.
Code Suggestions Technical Overview
In most general sense Code Suggestions follow the sequence as described on the diagram below
sequenceDiagram actor USR as User participant IDE participant EXT as IDE Extension participant LS as Language Server participant GLW as GitLab Workhorse participant GLR as GitLab Rails participant AIGW as AI Gateway participant LLM as Large Language Model USR->>IDE: types: "def add(a, b)" IDE->>EXT: notify about document change def add(a, b) EXT->>LS: register document change def add(a, b) LS->>GLW: sends code suggestion requests GLW->>GLR: proxies code suggestion request GLR->>GLW: enriches code suggestion request GLW->>AIGW: enriched code suggestion request AIGW->>LLM: enriched code suggestion request LLM->>AIGW: "a + b" AIGW->>GLW: suggestion: "a + b" GLW->>LS: suggestion: "a + b" LS->>EXT: triggers IDE code suggestion UI: "a + b" EXT->>IDE: triggers IDE code suggestion UI: "a + b"
Components pictured on diagram are as follow:
- IDE extensions: GitLab offers number of IDE extensions (aka plugins) that among other features also provide integration with Language Server
- VSCode Extension: https://gitlab.com/gitlab-org/gitlab-vscode-extension/
- JetBrains Extension: https://gitlab.com/gitlab-org/editor-extensions/gitlab-jetbrains-plugin
- NeoVim Extension: https://gitlab.com/gitlab-org/editor-extensions/gitlab.vim
- Language Server: it is a unified way of delivering features that can be shared across different IDEs reducing duplication. Language Server is a component that uses the LSP protocol for communication with IDE extensions.
- GitLab Workhorse - GitLab Workhorse is a smart reverse proxy for GitLab intended to handle resource-intensive and long-running requests.
- GitLab Rails - main GitLab component providing majority of features.
- AI Gateway - a standalone-service that will give access to AI features to all users of GitLab, no matter which instance they are using: self-managed, dedicated or GitLab.com. For more conceptual information refer to architecture blueprint
- Large Language Model - a AI model that provides code generation capabilities
Code Suggestions includes two types of interactions:
- Code Completion: A short AI-generated suggestion intended to complete an existing line or block of code
- Code Generation: A longer AI-generated suggestion intended to create entire functions, classes, code blocks, etc.
Each code suggestion request is catogrised into a single category. Request categorization is performed by the Language Server before request is sent to GitLab Workhorse. If categorization is not done by Language Server then this categorization is performed by GitLab Rails.
Code Completion
Code completion interaction is one of two code creation requests that can be triggered by IDE. Its goal is to provide very fast responses (< 1 second) at the cost of smaller suggestion size, and less context awareness of surrounding source code or repository files.
By default, the code completion request is sent directly to the AI Gateway with direct connection details fetched from the GitLab Rails monolith, as shown in the sequence described below. Alternatively, the request can be routed from the GitLab Rails monolith, as shown in the diagram in the Code Suggestions technical overview.
A request prepared by the Language Server is proxied in mostly unmodified form without any additional context being attached. GitLab Rails’ role in this feature is limited to mostly an authorisation entity assuring that the given user is allowed to use the Code Suggestions feature.
Code Completion Direct Connection Diagram
sequenceDiagram actor USR as User participant IDE participant EXT as IDE Extension participant LS as Language Server participant GLR as GitLab Rails participant AIGW as AI Gateway USR->>IDE: types: "def add(a, b)" IDE->>EXT: notify about document change def add(a, b) EXT->>LS: register document change def add(a, b) LS->>LS: triggers code suggestion request alt there is unexpired direct connection details in cache? LS->>LS: fetches direct connection details from cache else LS->>GLR: requests for direct connection details GLR->>LS: returns direct connection details (AIGW url and token, expiry, model details) LS->>LS: caches direct connection details, with 1 hour expiry end LS->>AIGW: sends code suggestion requests using direct connection details AIGW->>LS: suggestion: "a + b" LS->>EXT: triggers IDE code suggestion UI: "a + b" EXT->>IDE: triggers IDE code suggestion UI: "a + b"
The components pictured on the diagram are described in the technical overview section.
Code Generation
Code Generation interaction is another type of code creation request that can be triggered by IDE. Its goal is to provide long and extensive responses generating complete blocks of code like functions or classes. It has a much longer response time than code completions (up to 30 seconds). This type of code creation request takes extended context into account when resolving the user task. This context comes from current files in IDE as well as Repository X-Ray report.
sequenceDiagram actor USR as User participant IDE participant PG as GitLab PostgreSQL DB participant GLR as GitLab Rails participant AIGW as AI Gateway USR->>+IDE: types: "#35; generate a function that transposes a matrix" IDE->>+GLR: trigger code generation for line ` "#35; generate function ` GLR->>PG: fetch X-Ray report for project and language PG->>GLR: xray_reports record GLR->>GLR: include first 300 entities from xray report into code generation prompt GLR->>-AIGW: trigger code generation ` "#35; generate function `
In above diagram some components (inc: GitLab Workhorse or Language Server) are ommitted for brevity reasons. However high level flow of requests shown in technical overview section remains unchanged.
Repository X-Ray
Repository X-Ray is a feature that generates additional context data for code generation requests. This data is used to ground the AI model into the context of existing source code and align it with its private API as well as coding patterns.
Repository X-Ray report is generated as shown on following diagram:
sequenceDiagram actor USR as User participant GIT as Gitaly participant GLR as GitLab Rails participant PG as GitLab PostgreSQL DB USR->>GLR: commits a change to the project's default branch GLR->>+GLR: triggers Repository X-Ray background job GLR->>GIT: fetches relevant files on default branch GIT->>GLR: file blobs GLR->>GLR: processes file blobs GLR->>-PG: upserts records to xray_reports
Components pictured on diagram are as follows:
- Gitaly - an application that provides high-level RPC access to Git repositories.
- GitLab PostgreSQL DB - relational database engine storing GitLab operational data.
Existing Repository X-Ray reports are included into code generation requests as shown in diagram at code generation paragraph.
Code Tasks
A user can also use one of the predefined chat commands to suggest changes in the selected code. We currently support refactoring, explaining code, and writing tests. These commands can be used in Duo Chat and also its response is displayed in Duo Chat window.
sequenceDiagram participant USR as User participant GLR as GitLab Rails participant AIGW as AI Gateway participant LLM as Large Language Model Note over USR,LLM: GraphQL API USR->>GLR: opens Duo Chat and sends `/refactor` messages with selected code GLR->>USR: accepts the request and enques it Note over USR,LLM: Asynchronous job GLR->>AIGW: builds prompt and sends AI request AIGW->>LLM: sends AI request LLM->>AIGW: receives response AIGW->>GLR: receives response GLR->>USR: sends websocket message with response which is displayed in chat
b963d865
)