AI Context Management
This page contains information related to upcoming products, features, and functionality.
It is important to note that the information presented is for informational purposes only.
Please do not rely on this information for purchasing or planning purposes.
The development, release, and timing of any products, features, or functionality may be
subject to change or delay and remain at the sole discretion of GitLab Inc.
Glossary
- AI Context. In the scope of this technical blueprint, the term “AI Context” refers to supplementary information
provided to the AI system alongside the primary prompts.
- AI Context Policy. The “AI Context Policy” is a user-defined and user-managed mechanism allowing precise
control over the content that can be sent to the AI as contextual information. In the context of this blueprint, the
AI Context Policy is suggested as a YAML configuration file.
- AI Context Policy Management. Within this blueprint, “Management” encompasses the user-driven processes of
creating, modifying, and removing AI Context Policies according to specific requirements and preferences.
- Automatic AI Context. AI Context retrieved automatically based on the active document. Automatic AI Context
can be the active document’s dependencies (modules, methods, etc., imported into the active document), content found by
search-based mechanisms, or content from other mechanisms over which the user has limited control.
- Supplementary User Context. User-defined AI Context, such as open tabs in IDEs, local files, and folders, that the user
provides from their local environment to extend the default AI Context.
- AI Context Retriever. A backend system capable of:
  - communicating with AI Context Policy Management
  - fetching content defined in Automatic AI Context and Supplementary User Context (complete files, definitions,
    methods, etc.), in line with AI Context Policy Management
  - correctly augmenting the user prompt with AI Context before sending it to the LLM. Presumably, this part is already
    handled by AI Gateway.
- Project Administrator. In the context of this blueprint, “Project Administrator” means any individual with the
“Edit project settings” permission (“Maintainer” or “Owner” roles, as defined in Project members permissions).
Summary
Correct context can dramatically improve the quality of AI responses. This blueprint aims to accommodate AI Context
seamlessly into our offering by architecting a solution that is ready for this additional context coming from different
AI features.
However, we recognize the importance of security and trust, which automatic solutions do not necessarily provide. To
address any concerns users might have about the content fed into the AI Context, this blueprint suggests providing them
with control and customization options. This way, users can adjust the content according to their preferences and have a
clear understanding of what information is being utilized.
This blueprint proposes a system for managing AI Context at the Project Administrator and individual
user levels. Its goal is to allow Project Administrators to set high-level rules for what content can be included as context for AI
prompts while enabling users to specify Supplementary User Context for their prompts. The global AI Context Policy will use a YAML
configuration file format stored in the same Git repository. The suggested format of the YAML configuration files
is discussed below.
Motivation
Ensuring the AI has the correct context is crucial for generating accurate and relevant code suggestions or responses.
As the adoption of AI-assisted development grows, it’s essential to give organizations and users control over what project
content is sent as context to AI models. Some files or directories may contain sensitive information that should not
be shared. At the same time, users may want to provide additional context for their prompts to get more
relevant suggestions. We need a flexible AI Context management system to handle these cases.
Goals
For Project Administrators
- Allow Project Administrators to set the default AI Context Policy to control whether content can or cannot be
automatically included in the AI Context when making requests to LLMs
- Allow Project Administrators to specify exceptions to the default AI Context Policy
- Provide a UI to manage the default AI Context Policy and its exceptions list easily
For users
- Allow users to set Supplementary User Context to include as AI Context for their prompts
- Provide a UI to manage Supplementary User Context easily
Non-Goals
- AI Context Retriever architecture: different environments (Web, IDEs) will probably implement their own retrievers.
However, the unified public interface of the retrievers should be considered.
- Extremely granular controls like allowing/excluding individual lines of code
- Storing entire file contents from user projects; only paths will be persisted
Proposal
The proposed architecture consists of 3 main parts:
- AI Context Retriever
- AI Context Policy Management
- Supplementary User Context
There are several ongoing efforts related to various implementations of AI Context Retriever, both
for Web and for IDEs.
Because of that, the architecture for AI Context Retriever is beyond the scope of this blueprint. However, in the
context of this blueprint, it is assumed that:
- AI Context Retriever is capable of automatically retrieving and fetching Automatic AI Context and passing it
on as AI Context to the LLM.
- AI Context Retriever can automatically retrieve and fetch Supplementary User Context and pass
it on as AI Context to the LLM.
- AI Context Retriever implementation can ensure that any content passed as AI Context to a model
adheres to the global AI Context Policy.
- AI Context Retriever can trim the AI Context to fit the context window of the
specific LLM used by a given Duo feature.
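The last assumption, trimming AI Context to a context window, can be sketched as follows. This is only an illustration under stated assumptions: `ContextItem`, `estimate_tokens`, and `trim_context` are hypothetical names, the 4-characters-per-token estimate stands in for a real tokenizer, and the input list is assumed to be ordered by priority already.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    path: str     # repository-relative path of the source file
    content: str  # text that would be sent as AI Context


def estimate_tokens(text: str) -> int:
    # Hypothetical stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(text) // 4)


def trim_context(items: list, budget: int) -> list:
    """Keep items, in priority order, while they fit into the token budget."""
    kept = []
    used = 0
    for item in items:
        cost = estimate_tokens(item.content)
        if used + cost > budget:
            continue  # this item does not fit; a smaller later item still may
        kept.append(item)
        used += cost
    return kept
```

Skipping an oversized item instead of stopping at it is a design choice in this sketch; a retriever could equally truncate the item or stop at the first overflow.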
AI Context Policy Management proposal
To implement the AI Context Policy Management system, it is proposed to:
- Introduce the YAML file format for configuring global policies
- In the YAML configuration file, support two ai_context_policy types:
  - block: blocks all content except for the specified exclude paths. Excluded files are allowed. (Default)
  - allow: allows all content except for the specified exclude paths. Excluded files are blocked.
- In the YAML configuration file, support a version key that specifies the schema version of the AI context file,
starting with version: 1. If omitted, it is treated as the latest version known to the client.
- In the YAML configuration file, support glob patterns to exclude certain paths from the global policy
- Support nested AI Context Policies to provide more granular control of AI Context in sub-folders. For
example, a policy in /src/tests would override a policy in /src, which, in turn, would override a
global AI Context Policy in /.
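One way to read the nesting rule is "the closest policy file on the path wins". The sketch below illustrates that reading only. It assumes repository-relative paths without a leading slash, policies that are already parsed into dictionaries, and a full override rather than a merge of parent and child policies. The name `resolve_policy` and the shape of the `policies` mapping are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_policy(file_path: str, policies: dict) -> Optional[dict]:
    """Return the policy from the closest ancestor folder of file_path.

    `policies` maps a repository-relative folder ("" for the root) to the
    already-parsed policy found in that folder.
    """
    for parent in PurePosixPath(file_path).parents:
        folder = "" if str(parent) == "." else str(parent)
        if folder in policies:
            return policies[folder]
    return None  # no policy anywhere on the path


# Example layout mirroring /, /src and /src/tests.
policies = {
    "": {"ai_context_policy": "block", "exclude": []},
    "src": {"ai_context_policy": "allow", "exclude": []},
    "src/tests": {"ai_context_policy": "block", "exclude": ["fixtures/**"]},
}
```

With this layout, a file under src/tests resolves to the src/tests policy, a file directly under src resolves to the src policy, and everything else falls back to the root policy.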
Supplementary User Context proposal
To implement the Supplementary User Context system, it is proposed to:
- Introduce user-level UI to specify Supplementary User Context for prompts. A particular implementation of the UI could
differ in different environments (IDEs, Web, etc.), but the actual design of these implementations is beyond the scope of
this architecture blueprint
- The user-level UI should communicate to the user what is in the Supplementary User Context at any moment.
- The user-level UI should allow the user to edit the contents of the Supplementary User Context.
Optional steps
- Provide UI for Project Administrators to configure the global AI Context Policy. Source Editor
can be used as the editor for this type of YAML file format, similar to the Security Policy Editor.
- Implement a validation mechanism for AI Context Policies that notifies Project Administrators when
the YAML configuration file has an invalid format. It could be a job in CI. But to catch possible issues proactively, it is
also advised to introduce the validation step as part of the pre-push static analysis.
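A validation step of this kind could look roughly like the sketch below, whether it runs in CI or before a push. It works on an already-parsed YAML document (a Python dictionary), so no YAML library is involved. The function name, the error messages, and the rule that a newer schema version is rejected are assumptions made for illustration; the defaults (block policy, latest known version) follow the proposal above.

```python
LATEST_VERSION = 1  # assumption: the only schema version that exists so far
VALID_POLICIES = ("allow", "block")


def validate_policy(raw):
    """Return (normalized_policy, errors) for an already-parsed YAML document."""
    if not isinstance(raw, dict):
        return None, ["top level must be a mapping"]

    errors = []

    policy = raw.get("ai_context_policy", "block")  # block is the default
    if policy not in VALID_POLICIES:
        errors.append(f"ai_context_policy must be one of {list(VALID_POLICIES)}")

    exclude = raw.get("exclude", [])
    if not isinstance(exclude, list) or not all(isinstance(p, str) for p in exclude):
        errors.append("exclude must be a list of glob strings")

    version = raw.get("version", LATEST_VERSION)  # omitted: latest known version
    if not isinstance(version, int) or isinstance(version, bool) or version < 1:
        errors.append("version must be a positive integer")
    elif version > LATEST_VERSION:
        errors.append(f"version {version} is newer than this client supports")

    if errors:
        return None, errors
    return {"ai_context_policy": policy, "exclude": exclude, "version": version}, []
```

A CI job or pre-push hook would fail when the returned error list is non-empty and print the messages for the Project Administrator.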
Design and implementation details
- YAML Configuration File Format: The proposed YAML configuration file format for defining the global
AI Context Policy is as follows:
ai_context_policy: [allow|block]
exclude:
- glob/**/pattern
The ai_context_policy section specifies the current policy for this and all underlying folders in a repo.
The exclude section specifies the exceptions to the ai_context_policy. Technically, it’s an inversion of the policy.
For example, if we specify foo_bar.js in exclude:
  - for the allow policy, it means that foo_bar.js will be blocked
  - for the block policy, it means that foo_bar.js will be allowed
- User-Level UI for Supplementary User Context: The UI for specifying Supplementary User Context for prompts
can be implemented differently depending on the environment (IDEs, Web, etc.). However, the implementation should
ensure users can provide additional context for their prompts. The specified Supplementary User Context for
each user can be stored as:
In both cases, the storage should allow the preference to be associated with a particular repository. Factors
like data consistency, performance, and implementation complexity should guide the decision on what type of storage
to use.
- To mitigate potential performance and scalability issues, it would make sense to keep the AI Context Retriever and
AI Context Policy Management in the same environment as the feature that needs them. That would be the
Language Server for Duo features in IDEs and different
services in the monolith for Duo features on the Web.
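The inversion semantics of exclude described in the YAML format item can be expressed in a few lines. The sketch below is an illustration, not a reference implementation: `is_allowed` is a hypothetical name, and Python's `fnmatch.fnmatchcase` is used as a stand-in glob matcher. Its `*` also matches `/`, which is looser than Git-style globbing, and patterns are matched against the whole repository-relative path.

```python
from fnmatch import fnmatchcase


def is_allowed(path: str, policy: dict) -> bool:
    """Decide whether a repository-relative path may be sent as AI Context."""
    mode = policy.get("ai_context_policy", "block")  # block is the default
    excluded = any(fnmatchcase(path, pattern) for pattern in policy.get("exclude", []))
    # exclude inverts the policy: excluded paths are blocked under allow
    # and allowed under block.
    return excluded if mode == "block" else not excluded


allow_policy = {"ai_context_policy": "allow", "exclude": ["foo_bar.js"]}
block_policy = {"ai_context_policy": "block", "exclude": ["foo_bar.js"]}
```

Under `allow_policy`, foo_bar.js is rejected and every other file passes; under `block_policy`, foo_bar.js is the only file that passes.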
Data flow
Here’s the draft of the data flow demonstrating the role of AI Context using the Code Suggestions feature as an example.
sequenceDiagram
participant CS as Code Suggestions
participant CR as AI Context Retriever
participant PM as AI Context Policy Management
participant LLM as Language Model
CS->>CR: Request Code Suggestion
CR->>CR: Retrieve Supplementary User Context list
CR->>CR: Retrieve Automatic AI Context list
CR->>PM: Check AI Context against Policy
PM-->>CR: Return valid AI Context list
CR->>CR: Fetch valid AI Context
CR->>LLM: Send prompt with final AI Context
LLM->>LLM: Generate code suggestions
LLM-->>CS: Return code suggestions
CS->>CS: Present code suggestions to the user
If the AI Context Retriever fails to fetch some of the content for the AI Context, the prompt is sent with
the AI Context that was fetched successfully. In the low-probability case where the AI Context Retriever cannot fetch any content, the prompt should be sent out as-is.
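The fallback behavior can be sketched as below. The names `build_prompt` and `fetch`, and the way context is prepended to the prompt, are assumptions for illustration; the actual prompt assembly is presumably handled elsewhere (for example, by AI Gateway).

```python
from typing import Callable, Iterable


def build_prompt(prompt: str, paths: Iterable[str], fetch: Callable[[str], str]) -> str:
    """Prepend whatever AI Context could be fetched; fall back to the bare prompt."""
    fetched = []
    for path in paths:
        try:
            fetched.append((path, fetch(path)))
        except Exception:
            continue  # a failed item is dropped, the rest are still used
    if not fetched:
        return prompt  # nothing could be fetched: send the prompt as-is
    context = "\n\n".join(f"# {path}\n{content}" for path, content in fetched)
    return f"{context}\n\n{prompt}"
```

A failed fetch never blocks the request in this sketch: the suggestion is generated with less context instead of not at all.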
Alternative solutions
JSON Configuration Files
- Pros: Widely used, easier integration with web technologies.
- Cons: Less readable compared to YAML for complex configurations.
Database-Backed Configuration
- Pros: Centralized management, dynamic updates.
- Cons: Not version controlled.
Environment Variables
- Pros: Simplifies configuration for deployment and scaling.
- Cons: Less suitable for complex configurations.
Policy as Code (without YAML)
- Pros: Better control and auditing with versioned code.
- Cons: It requires users to write code and us to invent a language for it.
Policy in .ai_ignore and other Git-like files
- Pros: Provides a straightforward approach, identical to the allow policy with the list of exclude
suggested in this blueprint
- Cons: Supports only the allow policy; the processing of this file type still has to be implemented
Based on these alternatives, the YAML file was chosen as a format for this blueprint because of versioning
in Git, and more versatility compared to the .ai_ignore alternative.
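To make the relationship between the two formats concrete, the sketch below converts ignore-style content into the equivalent allow policy with an exclude list. The .ai_ignore syntax is not defined in this blueprint, so the rules used here (one pattern per line, blank lines and lines starting with # skipped) are assumptions borrowed from Git-like ignore files, and `ai_ignore_to_policy` is a hypothetical name.

```python
def ai_ignore_to_policy(text: str) -> dict:
    """Translate ignore-style lines into the equivalent allow policy."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # skip blanks and comments
            patterns.append(line)
    return {"ai_context_policy": "allow", "exclude": patterns}
```

The reverse conversion only exists for allow policies, which is the versatility gap noted above: a block policy has no ignore-file equivalent.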
Suggested iterative implementation plan
Please refer to the Proposal for a detailed explanation of the items in every iteration.
Iteration 1
- Introduce the global .ai-context-policy.yaml YAML configuration file format and schema for this file type
as part of AI Context Policy Management.
- AI Context Retrievers introduce support for Supplementary User Context.
- Optional: validation mechanism (like CI job and pre-push static analysis) for .ai-context-policy.yaml
Success criteria for the iteration: Prompts sent from the Code Suggestions feature in IDEs contain
AI Context only with the open IDE tabs, which adhere to the global AI Context Policy in the root of a repository.
Iteration 2
- Introduce support for Automatic AI Context in AI Context Retrievers.
- Connect more features to the AI Context Management system.
Success criteria for the iteration: Prompts sent from the Code Suggestions feature in IDEs contain AI Context
with items of Automatic AI Context, which adhere to the global AI Context Policy in the root of a repository.
Iteration 3
- Connect all Duo features on the Web and in IDEs to AI Context Retrievers and adhere to the global
AI Context Policy.
Success criteria for the iteration: All Duo features in all environments send AI Context which adheres to the
global AI Context Policy
Iteration 4
- Support nested .ai-context-policy.yaml YAML configuration files.
Success criteria for the iteration: AI Context Policies placed in sub-folders of a repository override
higher-level policies when sending prompts.
Iteration 5
- User-level UI for Supplementary User Context.
Success criteria for the iteration: Users can see and edit the contents of the Supplementary User Context and
the context is shared between all Duo features within the environment (Web, IDEs, etc.)
Iteration 6
- Optional: UI for configuring the global AI Context Policy.
Success criteria for the iteration: Users can see and edit the contents of the AI Context Policies in a UI
editor.
Summary
To manage AI Context effectively and ensure flexible and scalable solutions, AI Context Policy Management will reside in the
same environment as the AI Context Retriever and, as a result, as close to the context fetching mechanism as possible. This
approach aims to reduce latency and improve user control over the contextual information sent to AI systems.
Context
The original blueprint outlined the necessity of a flexible AI Context Management system to provide accurate and relevant
AI responses while addressing security and trust concerns. It suggested that AI Context Policy Management should act as
a filtering solution between the context resolver and the context fetcher in the AI Context Retriever. However, the
blueprint did not specify the exact location for the AI Context Policy Management within the system.