GitLab Query Language (GLQL)
Status | Authors | Coach | DRIs | Owning Stage | Created |
---|---|---|---|---|---|
proposed | himkp | ntepluhina | johnhope, mmacfarlane, vshushlin | devops plan | 2024-09-25 |
Summary
GitLab Query Language (GLQL) is born out of the need for a unified, powerful, and user-friendly way to query and present data across the entire GitLab platform. As GitLab continues to evolve and expand its feature set, the ability to efficiently access, filter, and visualize information becomes increasingly crucial for users to manage their workflows effectively.
Motivation
GLQL addresses the growing need for a unified and powerful way to access and visualize data across the GitLab platform. As GitLab’s feature set expands, GLQL aims to simplify complex data retrieval tasks, enhance user productivity, and provide a consistent querying experience. This initiative positions GitLab as a comprehensive data management solution within the DevOps ecosystem, catering to the evolving needs of developers and organizations managing large-scale projects.
Goals
The primary goals of GLQL are:
- Unified Query Interface: GLQL aims to provide a single, consistent syntax for querying various GitLab objects, starting with issues and expanding to work items.
- Enhanced Data Accessibility: By embedding GLQL blocks in text editors anywhere across GitLab, GLQL empowers users to access precisely the information they need, when they need it. This is especially important for large-scale projects that manage complex workflows.
- Improved Visualization: GLQL’s data-agnostic presentation layer allows users to customize how query results are displayed, enabling more effective data analysis.
- AI Integration: With its structured syntax, GLQL is positioned to integrate with GitLab’s AI offerings, paving the way for natural language query capabilities and enhancing user productivity.
- Extensibility: While initially focused on issues and work items, GLQL is designed with the future in mind, allowing for expansion to cover more GitLab objects and use cases as the platform grows.
Proposal
GLQL introduces a model that uses a single query syntax to fetch various objects and present them uniformly. A GLQL query has two parts: the query expression and metadata. The model consists of the following components:
- Façade: Acts as the entry point for processing GLQL queries, coordinating the flow between various components of the system.
- Query Compiler: Responsible for parsing and transforming the GLQL query into executable code. It consists of three sub-components:
  - Parser: Breaks down the raw query string into an AST (abstract syntax tree).
  - Analyzer: Validates the parsed query, checking for semantic correctness.
  - Code Generator: Converts the analyzed query into executable code for the Executor.
- Metadata Parser: Processes the metadata associated with the query, which includes transformation (or aggregation) and presentation instructions. It has two sub-components:
  - JS YAML: Parses YAML-formatted metadata.
  - Fields Parser: Interprets the fields specified in the metadata and transformations to apply, if any.
- Executor: Runs the compiled query against the API or datastore to retrieve the requested data. Currently, GraphQL is used as a datastore.
- Transformer: Processes the response data returned by the Executor, applying any transformations specified in the metadata.
- Presenter: Presents the final transformed data using data-agnostic Vue components.
Here’s an example of a GLQL query:
```glql
---
display: list
fields: title, health, due, labels("workflow::*"), labels
limit: 5
---
project = "gitlab-org/gitlab" AND assignee = currentUser() AND opened = true
```
In the above example, the front matter block enclosed in `---` contains metadata. `display` is an example of presenter metadata, and `fields` and `limit` are examples of compiler metadata. Using functions in metadata values allows `fields` to also be used as transformer metadata.
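As a rough illustration of how such a block separates into metadata and a query expression, the sketch below splits on the `---` delimiters. The function name `splitGlqlBlock` and the hand-rolled `key: value` scan are assumptions made for this example; the design described here uses JS YAML for the metadata, which handles far more than this simplified scan.

```typescript
// Hypothetical helper, not part of GLQL: splits a block into front matter and query.
interface GlqlBlock {
  metadata: Record<string, string>;
  query: string;
}

function splitGlqlBlock(source: string): GlqlBlock {
  const lines = source.trim().split("\n");
  const metadata: Record<string, string> = {};
  if (lines.length === 0 || lines[0].trim() !== "---") {
    // No front matter: the whole block is the query expression.
    return { metadata, query: lines.join("\n").trim() };
  }
  // Find the closing delimiter, skipping the opening one at index 0.
  const end = lines.findIndex((line, index) => index > 0 && line.trim() === "---");
  if (end === -1) throw new Error("Unterminated front matter block");
  for (const line of lines.slice(1, end)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // ignore lines that are not `key: value`
    metadata[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { metadata, query: lines.slice(end + 1).join("\n").trim() };
}

const block = splitGlqlBlock(`---
display: list
fields: title, health, due, labels("workflow::*"), labels
limit: 5
---
project = "gitlab-org/gitlab" AND assignee = currentUser() AND opened = true`);
```

Only the first colon on each line is treated as the separator, so values such as `labels("workflow::*")` survive intact.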
Design and implementation details
    graph TD
        Query[/GLQL Query/] --> Facade[Façade]
        subgraph Compiler[Query Compiler]
            Parser[Parser] --> Analyzer[Analyzer]
            Analyzer --> Codegen[Code Generator]
        end
        subgraph Metadata[Metadata Parser]
            JSYAML[JS YAML]
            FieldsParser[Fields Parser]
        end
        Facade -- "query" --> Compiler
        Facade -- "metadata" --> Metadata
        Metadata -- "compiler metadata" --> Compiler
        Compiler -- "compiled query" --> Executor[Executor]
        Executor <--> DB[(API or Datastore)]
        Executor -- "response data" --> Transformer[Transformer]
        Metadata -- "transformer metadata" --> Transformer
        Transformer -- "transformed data" --> Presenter[Presenter]
        Metadata -- "presenter metadata" --> Presenter
        Presenter --> Result((Result))
Façade
Serving as the entry point for parsing, executing, and presenting a GLQL query, the façade component consists of two parts: the model and the view. The model coordinates the flow between the other components, while the view is an abstraction that handles loading and error states and, on success, presents the data using the appropriate root presenter component.
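The model/view split might look roughly like the following sketch. The names `ViewState` and `runFacade` are invented for illustration; the snippet only shows how a model function could report loading, error, and success states to a view callback, and does not describe the actual Vue implementation.

```typescript
// All names are hypothetical; they illustrate the model/view split only.
type ViewState<T> =
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "success"; data: T };

// The "model" part: runs the pipeline and reports each state change to the view.
async function runFacade<T>(
  pipeline: () => Promise<T>,
  onState: (state: ViewState<T>) => void,
): Promise<void> {
  onState({ status: "loading" });
  let data: T;
  try {
    data = await pipeline();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    onState({ status: "error", message });
    return;
  }
  onState({ status: "success", data });
}
```

A view would pass a callback that switches between a loading indicator, an error message, and the root presenter depending on `status`.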
Query Compiler
The query compiler is written in Haskell and built using Haskell Stack. It is compiled to WASM, and the resulting package is published as `@gitlab/query-language` on NPM.
Syntax:
The query syntax consists primarily of logical expressions. These expressions follow the syntax of `[fieldName] [<|>|=|!=|in] [value|function] [AND] [anotherExpression]`.
Currently supported field names to query by include: `assignee`, `author`, `label`, `epic` etc. Support for more fields will be added in later stages.
- Supported comparison operators: `<`, `>`, `=`, `!=`, `in`.
- Supported logical operators: Currently only the `AND` operator is supported.
- Supported operands: Currently only `currentUser()` and `today()` are supported. Some fields like `milestone` and `iteration` support dynamic values like `none`, `any`, `current` or `upcoming`.
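To make the grammar above concrete, here is a minimal sketch of a parser for it. It is not the Haskell combinator parser used by the compiler: the AST shape, the `parse` function, and the decision to accept `AND` and `in` case-insensitively are all assumptions for illustration. Characters the token pattern does not recognize are silently skipped, which a real parser would report as errors.

```typescript
// Assumed AST shape for this sketch; the real compiler's AST may differ.
type Value =
  | { kind: "string"; value: string }
  | { kind: "function"; name: string }
  | { kind: "literal"; value: string } // bare tokens such as upcoming, true, -28d
  | { kind: "list"; items: Value[] };

interface Comparison {
  field: string;
  operator: "<" | ">" | "=" | "!=" | "in";
  value: Value;
}

function parse(query: string): Comparison[] {
  // Tokens: quoted strings, operators, punctuation, and bare words.
  const tokens = query.match(/"[^"]*"|!=|[<>=(),]|[^\s<>=!(),"]+/g) ?? [];
  let pos = 0;
  const next = (): string => {
    const token = tokens[pos++];
    if (token === undefined) throw new Error("Unexpected end of query");
    return token;
  };

  function parseValue(): Value {
    const token = next();
    if (token === "(") {
      const items = [parseValue()];
      while (tokens[pos] === ",") {
        pos++;
        items.push(parseValue());
      }
      if (next() !== ")") throw new Error("Expected ')'");
      return { kind: "list", items };
    }
    if (token.startsWith('"')) return { kind: "string", value: token.slice(1, -1) };
    if (tokens[pos] === "(" && tokens[pos + 1] === ")") {
      pos += 2; // consume the empty argument list of a call like currentUser()
      return { kind: "function", name: token };
    }
    return { kind: "literal", value: token };
  }

  const operators = ["<", ">", "=", "!=", "in"];
  const comparisons: Comparison[] = [];
  while (true) {
    const field = next();
    const operator = next().toLowerCase();
    if (!operators.includes(operator)) throw new Error(`Unknown operator: ${operator}`);
    comparisons.push({ field, operator: operator as Comparison["operator"], value: parseValue() });
    if (pos === tokens.length) return comparisons; // the list is implicitly ANDed
    if (next().toUpperCase() !== "AND") throw new Error("Expected AND");
  }
}
```

Because only `AND` is supported, the result can be a flat list of comparisons rather than a tree.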
Components:
The query compiler is built from the ground up to support multiple targets of output. Its components include:
- Parser: A combinator parser that converts the query string into an abstract syntax tree. The AST is optimized by extracting global attributes like `project` or `group`.
- Analyzer: A static analyzer that analyzes the query for semantic correctness.
- Code Generator(s): Converts the analyzed query into executable code for the Executor, depending on the target. For the GraphQL target, it generates a GraphQL query.
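The analyzer and code generator stages could be sketched as follows for a GraphQL target. Everything here is illustrative: the simplified `Comparison` type, the `ARGUMENTS` mapping from GLQL fields to GraphQL argument names, and the shape of the generated query are assumptions, not the compiler's actual output.

```typescript
// Hypothetical, simplified AST: values are plain strings or lists of strings.
interface Comparison {
  field: string;
  operator: "=" | "!=" | "in" | "<" | ">";
  value: string | string[];
}

// Assumed mapping from GLQL fields to GraphQL arguments; illustrative only.
const ARGUMENTS = new Map<string, string>([
  ["assignee", "assigneeUsernames"],
  ["author", "authorUsername"],
  ["label", "labelName"],
]);

// Analyzer: reject fields and operators this sketch cannot translate.
function analyze(comparisons: Comparison[]): void {
  for (const { field, operator } of comparisons) {
    if (field !== "project" && !ARGUMENTS.has(field)) {
      throw new Error(`Unsupported field: ${field}`);
    }
    if (operator !== "=" && operator !== "in") {
      throw new Error(`Unsupported operator for ${field}: ${operator}`);
    }
  }
}

// Code generator: emit a GraphQL query string for the analyzed comparisons.
function generateGraphQL(comparisons: Comparison[], fields: string[], limit: number): string {
  analyze(comparisons);
  // Global attribute extraction: `project` scopes the whole query.
  const project = comparisons.find((c) => c.field === "project");
  if (!project || typeof project.value !== "string") {
    throw new Error("This sketch requires a project");
  }
  const args = comparisons
    .filter((c) => c.field !== "project")
    .map((c) => `${ARGUMENTS.get(c.field)}: ${JSON.stringify(c.value)}`);
  args.unshift(`first: ${limit}`);
  const path = JSON.stringify(project.value);
  return `{ project(fullPath: ${path}) { issues(${args.join(", ")}) { nodes { ${fields.join(" ")} } } } }`;
}
```

Keeping `analyze` separate from `generateGraphQL` mirrors the pipeline above: a second target would reuse the parser output and supply only its own generator.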
Examples:
- All bugs created for my stage in the last 28 days:
  `label = ("devops::plan", "type::bug") and created > -28d`
- Open issues assigned to me that haven’t been updated in a week:
  `status = "opened" and assignee = currentUser() AND updated < -7d`
- Issues late in the build stage of the upcoming milestone:
  `milestone = upcoming and label in ("workflow::in review", "workflow::verification")`
Metadata Parser
Metadata for a GLQL query is written in a combination of YAML and a special syntax to allow transformations to happen on top of the query result. The metadata parser consists of the following:
- JS YAML: Converts a YAML block into a JSON config.
- Fields Parser: A combinator parser to allow dynamically creating new fields in the transformer later.
Syntax: Currently supported options:
- `display`: How to display the data. Currently supported options: `table`, `list` or `orderedList`. Default: `table`.
- `limit`: How many items to display.
- `fields`: A comma-separated list of fields. If not provided, only the `title` field is included by default.
The `fields` option may also include functions to derive new columns. For example: `labels("workflow::*")` can be used to derive a new column with just workflow labels extracted.
This capability to generate derived columns allows us in the future to derive custom computed columns that are a function of existing custom fields. For example: `div(mult(reach, impact, confidence), effort) as "RICE Score"`.
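A simplified take on what a fields parser has to do is sketched below: split the `fields` value on top-level commas, then recognize function calls and an optional `as "alias"` suffix. The `FieldSpec` shape and both function names are hypothetical, and arguments are kept as raw strings rather than parsed further.

```typescript
// Hypothetical result shape for one entry of the `fields` option.
interface FieldSpec {
  name: string; // field or function name, e.g. "labels"
  args: string[] | null; // raw argument strings when the entry is a function call
  label: string; // column heading: the alias if given, else the source text
}

// Split on commas that are not nested inside parentheses or double quotes.
function splitTopLevel(input: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let quoted = false;
  let current = "";
  for (let i = 0; i < input.length; i++) {
    const char = input[i];
    if (char === '"') quoted = !quoted;
    else if (!quoted && char === "(") depth++;
    else if (!quoted && char === ")") depth--;
    if (char === "," && depth === 0 && !quoted) {
      parts.push(current.trim());
      current = "";
    } else {
      current += char;
    }
  }
  if (current.trim() !== "") parts.push(current.trim());
  return parts;
}

function parseFields(input: string): FieldSpec[] {
  return splitTopLevel(input).map((part) => {
    // Optional trailing alias: <expression> as "Heading"
    const alias = /^(.*?)\s+as\s+"([^"]*)"$/.exec(part);
    const expression = alias ? alias[1] : part;
    // Optional function call: name(arguments)
    const call = /^(\w+)\((.*)\)$/.exec(expression);
    return {
      name: call ? call[1] : expression,
      args: call ? splitTopLevel(call[2]) : null,
      label: alias ? alias[2] : expression,
    };
  });
}
```

Nested calls such as `mult(...)` inside `div(...)` stay as raw argument strings here; a fuller parser would recurse into them.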
Executor
The executor is a simple module that executes the query compiled by the query compiler for the target platform, which is currently GraphQL.
It runs the GraphQL query and returns the response data along with the parsed YAML front matter config.
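A minimal executor along these lines might look like the sketch below. The `execute` function, its injected `fetchImpl` parameter, and the default `/api/graphql` endpoint are assumptions for illustration; the fetch dependency is injected so that the function can be exercised without a network.

```typescript
// Minimal structural type for the fetch-like function this sketch needs.
type Fetch = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface ExecutionResult {
  data: unknown;
  config: Record<string, unknown>; // parsed front matter, passed through unchanged
}

// Hypothetical executor: POSTs a compiled GraphQL query and returns its data.
async function execute(
  query: string,
  config: Record<string, unknown>,
  fetchImpl: Fetch,
  endpoint = "/api/graphql", // assumed endpoint path
): Promise<ExecutionResult> {
  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  if (!response.ok) {
    throw new Error(`GraphQL request failed with status ${response.status}`);
  }
  const payload = (await response.json()) as {
    data?: unknown;
    errors?: { message: string }[];
  };
  // GraphQL reports query errors in the body, even on HTTP 200.
  if (payload.errors && payload.errors.length > 0) {
    throw new Error(payload.errors.map((e) => e.message).join("; "));
  }
  return { data: payload.data, config };
}
```

Returning the config next to the data lets the transformer and presenter read their metadata from the same result object.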
Transformer
The transformer module is responsible for two things:
- Normalizing the data for the presenter.
- Transforming data as requested by the user, for example, to derive new columns.
In a future iteration, the transformer module can also be used to aggregate data.
The transformer takes response data returned by the executor and some metadata, and generates a transformed output that can be passed on to the presenter.
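The two responsibilities can be illustrated with a small sketch: `normalize` unwraps GraphQL connection objects of the form `{ nodes: [...] }`, and `deriveLabels` adds a derived column for a field such as `labels("workflow::*")`. The function names, the row shape, and the wildcard semantics (`*` matches any run of characters) are assumptions made for this example.

```typescript
interface Label {
  title: string;
}
type Row = Record<string, unknown>;

// Normalize: unwrap GraphQL connection objects of the form { nodes: [...] }.
function normalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(normalize);
  if (value !== null && typeof value === "object") {
    const record = value as Row;
    const nodes = record["nodes"];
    if (Array.isArray(nodes)) return nodes.map(normalize);
    const result: Row = {};
    for (const key of Object.keys(record)) result[key] = normalize(record[key]);
    return result;
  }
  return value;
}

// Convert a wildcard pattern such as "workflow::*" into an anchored RegExp.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except *
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Derive a new column holding only the labels whose title matches the pattern.
function deriveLabels(row: Row, pattern: string): Row {
  const matcher = wildcardToRegExp(pattern);
  const labels = (row["labels"] as Label[] | undefined) ?? [];
  return {
    ...row,
    [`labels("${pattern}")`]: labels.filter((label) => matcher.test(label.title)),
  };
}
```

The original `labels` column is left untouched, so both the full list and the derived column can be displayed side by side.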
Presenter
The presenter takes the query output produced by the transformer and presents it by selecting the appropriate presenter components.
The presenter presents the root object according to the display parameter provided (table or list), and represents the values of fields returned by the executor using appropriate object presenters. Each presenter is a Vue component that takes the data it needs to present and optionally a config object containing presentation options. The presenter module makes sure all data and fields are presented using appropriate presenters.
The result of the GraphQL query is recursively mapped to a presenter.
Currently supported presenters:
Data presenters: This includes List and Table presenters. They represent a collection of, say, issues, as a list or table of data.
Both presenters accept a `config` prop containing `fields`, which tells the presenter what fields to display, and an optional `caption` to render alongside the presented data.
Object presenters: These are of two types:
- Generic object presenters: This includes presenters for generic data types like Null, Text, Bool, Time, Link and Collection.
  - Null: Displays null values as None. Useful for fields that have no data set, like `health` or `due`.
  - Text: For all text or numeric fields like `title` or `weight`.
  - Bool: Represents true or false values as Yes or No. Applies to fields like `confidential`.
  - Time: Represents all non-null time field values like `created` or `updated` as X days ago, or in X days.
  - Link: Objects that don’t have presenters yet but have a `title` field and a `webUrl` or `webPath` field available are represented as a link.
  - Collection: When a field contains a collection of other objects, for example `assignees` or `labels`, it is represented as a space-separated list of presenters.
- GitLab reference presenters: This includes presenters for various GitLab objects like Label, Issue, Milestone, State, User and Health.
  - Label: Presents a GitLab group or project label with its appropriate colors with a link to issues in that label in the current project.
  - Issue: Presents an issue like it is rendered in Markdown: with its title, link and issue ID, and a popover containing additional information about the issue.
  - Milestone: Similar to the issue presenter, presenting a milestone, with its title, link and a popover containing additional information.
  - State: Presents the issue status, whether open or closed, as a badge.
  - User: Presents a user’s username with a link to their profile, and a popover containing more information about the user.
  - Health: Presents the issue’s health status, one of `on track`, `needs attention` or `at risk`, as a badge.
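The recursive mapping from values to generic object presenters could be sketched as below. This covers only the generic presenters; the GitLab reference presenters are left out. The `selectPresenter` function, the `Presentation` shape, and the idea of recognizing time fields by name are assumptions for illustration, whereas the actual presenters are Vue components.

```typescript
type PresenterName = "Null" | "Text" | "Bool" | "Time" | "Link" | "Collection";

interface Presentation {
  presenter: PresenterName;
  value: unknown;
  items?: Presentation[]; // only set for collections
}

// Assumption: time fields are recognized by name in this sketch.
const TIME_FIELDS = ["created", "updated"];

function selectPresenter(field: string, value: unknown): Presentation {
  if (value === null || value === undefined) {
    return { presenter: "Null", value: "None" };
  }
  if (Array.isArray(value)) {
    // Collections map each item to its own presenter, recursively.
    const items = value.map((item) => selectPresenter(field, item));
    return { presenter: "Collection", value, items };
  }
  if (typeof value === "boolean") {
    return { presenter: "Bool", value: value ? "Yes" : "No" };
  }
  if (TIME_FIELDS.includes(field)) {
    return { presenter: "Time", value };
  }
  if (typeof value === "object") {
    const record = value as Record<string, unknown>;
    if ("title" in record && ("webUrl" in record || "webPath" in record)) {
      return { presenter: "Link", value };
    }
    throw new Error(`No presenter available for field: ${field}`);
  }
  return { presenter: "Text", value }; // strings and numbers
}
```

The null check comes first so that fields without data, such as `health` or `due`, never reach the type-specific branches.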
Extensibility
GLQL’s architecture is designed with extensibility as a core principle, allowing for seamless integration of new features and capabilities. This modular design enables the system to evolve with GitLab’s growing needs while maintaining consistency and reliability.
Key Extensibility Features
- New Data Sources
  - To support new APIs or datastores:
    - Implement new Code Generators tailored to the specific data source
    - Develop corresponding Transformers to normalize the data returned by the API
    - Existing Parser and Presenter components can be reused, ensuring consistency in query input and output formats
- New Query Objects
  - To add support for querying new GitLab objects:
    - Create new Analyzers to validate object-specific query semantics
    - Implement dedicated Code Generators for the new API or data source
    - Develop specialized Transformers to process the new object types
    - Extend or create new Presenters if unique display formats are required
- Component Modularity
  - Each component (Parser, Analyzer, Code Generator, Transformer, Presenter) is designed as a standalone module
  - This modularity allows for:
    - Independent updates and improvements to individual components
    - Easy replacement or addition of components without affecting the entire system
- Metadata Extensibility
  - The Metadata Parser can be extended to support new metadata fields, display formats or functions
  - This allows for the introduction of new query options or presentation styles without altering the core query language
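One way to picture the "new data source" extension point is a registry of code generators keyed by target, as in the sketch below. The `CodeGenerator` interface, the registry functions, and the `rest` target are all invented for illustration and are not part of GLQL.

```typescript
// Hypothetical plug-in point: one code generator per output target.
interface Comparison {
  field: string;
  operator: string;
  value: string;
}

interface CodeGenerator {
  target: string;
  generate(query: Comparison[]): string;
}

const generators = new Map<string, CodeGenerator>();

function registerGenerator(generator: CodeGenerator): void {
  generators.set(generator.target, generator);
}

function compile(query: Comparison[], target: string): string {
  const generator = generators.get(target);
  if (!generator) {
    throw new Error(`No code generator registered for target: ${target}`);
  }
  return generator.generate(query);
}

// An invented "rest" target, added without touching compile() or other targets.
registerGenerator({
  target: "rest",
  generate: (query) =>
    "?" +
    query
      .map((c) => `${encodeURIComponent(c.field)}=${encodeURIComponent(c.value)}`)
      .join("&"),
});
```

Because `compile` only looks up the generator by name, the parser and analyzer output can be shared across targets, which is the reuse the list above describes.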
Portability
GLQL’s current implementation as a frontend-only solution demonstrates its initial focus on user interaction and rapid prototyping. However, the system’s architecture is designed with portability in mind, allowing for future adaptations and optimizations.
Current Implementation
- Compiler: Written in Haskell, leveraging its strong type system and pattern matching capabilities
- Other modules (e.g., Transformer): Implemented in JavaScript for frontend integration
Future Portability Options
- Language Migration
  - The system can be ported to more widely adopted languages such as Rust or Go
  - Benefits:
    - Improved performance and memory management
    - Broader developer ecosystem for maintenance and contributions
    - Potential for both frontend (using WASM) and backend implementations
- Backend Integration
  - GLQL can be integrated with GitLab’s Ruby on Rails backend
  - Advantages:
    - Direct access to GitLab’s data models and business logic
    - Reduced network overhead for complex queries
    - Enhanced security and access control management
- Cross-Platform Compatibility
  - The modular design allows for implementation across various platforms:
    - Web browsers (current implementation)
    - Command-line interfaces
    - IDE plugins
By prioritizing portability in its design, GLQL ensures long-term flexibility and adaptability, allowing it to evolve alongside GitLab’s architecture and meet emerging performance and integration requirements.