ClickHouse Graph Query Engine
Plan for using ClickHouse as the primary database for graph queries and building a graph engine on top.
The deployed HTTP server (gkg-webserver) exposes a REST + MCP surface so agents can run graph queries without having to write Cypher or SQL directly. The server adds three major capabilities, listed below, and serves queries by connecting to ClickHouse, NATS, and Gitaly to build the graph queries and return the results.
View the Graph Query Engine design document for more details on the graph query engine.
View the Intermediate Query Language design document for more details on the intermediate LLM query language.
The web server will expose endpoints for GitLab Rails to consume. This will power the following features:
HTTP endpoints under /api/graph/* and /api/v1/* serve code graph workflows (symbols, references, dependencies) and namespace graph analytics. Each handler resolves the target scope (tenant/namespace/project), constructs a DatabaseQueryingService, executes parameterized SQL, and optionally enriches results with Gitaly content.
The MCP adapter is served at /mcp. The adapter shares the same query services, exposing the intermediate JSON language so agents receive both the generated SQL (for transparency) and the actual query results.
The web server (gkg-webserver) runs as the query front end in deployed environments. It connects to ClickHouse in read-only mode, ensuring the query tier cannot mutate graph state while still serving low-latency requests across multiple replicas.
flowchart LR
subgraph MCP Client
A[JSON tool call]
end
subgraph gkg-webserver
B[MCP Adapter]
C[Querying Service]
D[ClickHouse]
end
subgraph GitLab Services
E[Internal API]
F[Gitaly]
end
A --> B --> C
C -->|Execute SQL / fetch data| D
C -->|Fetch file slices| F
B -->|Resolve project| E
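The flow in the diagram (resolve the project, build parameterized SQL, execute it, optionally enrich with file slices, and return both the SQL and the rows) can be sketched as follows. This is a minimal illustration in Python, not the server's actual code: the `Scope` type, the `find_symbols` operation, the `symbols` table and its columns, and the three injected callables are all hypothetical. Only the `{name:Type}` placeholder syntax is taken from ClickHouse's query parameter feature.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical resolved scope for a request."""
    namespace_id: int
    project_id: int


def compile_query(tool_call: dict[str, Any], scope: Scope) -> tuple[str, dict[str, Any]]:
    """Turn a tiny JSON query into parameterized SQL; values are never inlined."""
    if tool_call["op"] != "find_symbols":  # hypothetical operation name
        raise ValueError(f"unsupported op: {tool_call['op']}")
    sql = (
        "SELECT name, file_path, start_line, end_line FROM symbols "  # hypothetical table
        "WHERE project_id = {project_id:UInt64} AND name = {name:String} "
        "LIMIT {limit:UInt32}"
    )
    params = {
        "project_id": scope.project_id,
        "name": tool_call["name"],
        "limit": min(int(tool_call.get("limit", 50)), 100),  # cap the result size
    }
    return sql, params


def handle_tool_call(
    tool_call: dict[str, Any],
    resolve_project: Callable[[str], Scope],
    run_sql: Callable[[str, dict[str, Any]], list[dict[str, Any]]],
    fetch_slice: Optional[Callable[[str, int, int], str]] = None,
) -> dict[str, Any]:
    """Resolve scope, run the query, optionally enrich rows, return SQL plus results."""
    scope = resolve_project(tool_call["project"])
    sql, params = compile_query(tool_call, scope)
    rows = run_sql(sql, params)
    if fetch_slice is not None:
        rows = [
            {**row, "snippet": fetch_slice(row["file_path"], row["start_line"], row["end_line"])}
            for row in rows
        ]
    return {"sql": sql, "params": params, "rows": rows}
```

Passing the collaborators in as callables keeps the sketch runnable without a database. In a deployment matching the description above, `run_sql` would presumably wrap a ClickHouse client opened with the `readonly` setting, `resolve_project` would call the internal API, and `fetch_slice` would call Gitaly.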
Schema definitions are shared through crates/database, so code and namespace graphs adhere to the same table/relationship definitions.
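One way to picture shared table/relationship definitions is a single registry that every graph builder validates against. The sketch below is an assumption for illustration only: the table names, columns, and relationship names are invented and are not taken from crates/database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    name: str
    source: str  # source node table
    target: str  # target node table


# Hypothetical node tables, declared once and reused by every graph builder.
NODE_TABLES = {
    "namespace": ("id", "path"),
    "project": ("id", "namespace_id", "path"),
    "file": ("id", "project_id", "path"),
    "definition": ("id", "file_id", "name"),
}

# Hypothetical relationship types, keyed by name for quick lookup.
RELATIONSHIPS = {
    rel.name: rel
    for rel in (
        Relationship("CONTAINS_PROJECT", "namespace", "project"),
        Relationship("CONTAINS_FILE", "project", "file"),
        Relationship("DEFINES", "file", "definition"),
    )
}


def check_edge(name: str, source: str, target: str) -> None:
    """Raise if an edge does not match the shared relationship definitions."""
    rel = RELATIONSHIPS.get(name)
    if rel is None:
        raise KeyError(f"unknown relationship: {name}")
    if (rel.source, rel.target) != (source, target):
        raise ValueError(
            f"{name} must connect {rel.source} -> {rel.target}, got {source} -> {target}"
        )
```

Because both the code graph and the namespace graph would call the same `check_edge`, a relationship that drifts from the shared definition fails at write time rather than surfacing later as an inconsistent query result.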