Reusable Rapid Diffs (RRD)
This page contains information related to upcoming products, features, and functionality.
It is important to note that the information presented is for informational purposes only.
Please do not rely on this information for purchasing or planning purposes.
The development, release, and timing of any products, features, or functionality may be
subject to change or delay and remain at the sole discretion of GitLab Inc.
Summary
Diffs at GitLab are spread across several places, with each area using its own method. We are aiming
to develop a single, performant way for diffs to be rendered across the application. Our aim here is
to improve all areas of diff rendering, from the backend creation of diffs to the frontend rendering
of the diffs.
All the diffs features related to this document are listed on a dedicated page.
Work breakdown
Rapid Diffs work is split into 3 stages and can be tracked in the following epics:
- Stage 0 — foundation:
- Have foundational components in place.
- Stream diffs on MR, commit and compare revisions pages.
- Stage 1 — baseline features:
- Most of the features are working (discussions, navigation, review, etc.)
- Stage 2 — production ready:
- Feature specs pass against Rapid Diffs
- Full accessibility compliance
Motivation
Goals
- improved perceived performance
- improved maintainability
- consistent coverage of all scenarios
Non-Goals
This effort will not:
- Identify improvements for the current implementation of diffs, either in Merge Requests or in Repository Commits
Priority of Goals
To help us make consistent choices, we defined an order of priority for the goals, even though all of them are important.
Perceived performance ranks above improved maintainability, which ranks above consistent coverage.
Examples:
- a proposal improves maintainability at the cost of perceived performance: ❌ we should consider an alternative.
- a proposal removes a feature from certain contexts, hurting coverage, and has no impact on perceived performance or maintainability: ❌ we should re-consider.
- a proposal improves perceived performance but removes features from certain contexts of usage: ✅ it’s valid and should be discussed with Product/UX.
- a proposal guarantees consistent coverage and has no impact on perceived performance or maintainability: ✅ it’s valid.
In essence, we’ll strive to meet every goal at each decision but prioritise the higher ones.
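The four examples above are consistent with reading the priority order lexicographically: the highest-priority goal that a proposal affects decides the outcome. The following is a minimal sketch of that reading, not an agreed decision procedure. The `ProposalImpact` shape and the `isWorthDiscussing` helper are hypothetical names introduced only for illustration.

```typescript
// Impact of a proposal on a goal: -1 = hurts, 0 = no impact, 1 = improves.
type Impact = -1 | 0 | 1;

// Hypothetical shape describing how a proposal affects each goal.
interface ProposalImpact {
  perceivedPerformance: Impact;
  maintainability: Impact;
  coverage: Impact;
}

// Goals in priority order, highest first.
const GOAL_ORDER: (keyof ProposalImpact)[] = [
  'perceivedPerformance',
  'maintainability',
  'coverage',
];

// The highest-priority goal that is affected decides the outcome.
function isWorthDiscussing(proposal: ProposalImpact): boolean {
  for (const goal of GOAL_ORDER) {
    if (proposal[goal] !== 0) return proposal[goal] > 0;
  }
  return false; // no impact on any goal, so nothing is gained
}
```

A proposal accepted by this check still needs the discussion with Product/UX described above whenever it trades away a lower-priority goal.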
Process
Workspace & Artifacts
- We will store implementation details like metrics, budgets, and development & architectural patterns here in the docs
- We will store large bodies of research, the results of audits, etc. in the wiki of the RRD project
- We will store audio & video recordings on the public YouTube channel in the Code Review / RRD playlist
- We will store drafts, meeting notes, and other temporary documents in public Google docs
Proposal
The new approach proposed here changes what we have done in the past by doing the following:
- Stop using virtualized scrolling for rendering diffs.
- Move most of the rendering work to the server.
- Enhance server-rendered HTML on the client.
- Unify diffs codebase across all pages rendering diffs (merge request, repository commits, compare revisions and any other).
Definitions
Maintainability
Maintainable projects are simple projects.
Simplicity is the opposite of complexity. This uses a definition of simple and complex described by Rich Hickey in “Simple Made Easy” (Strange Loop, 2011).
- Maintainable code is simple (single task, single concept, separate from other things).
- Maintainable projects expand on simple code by having simple structure (folders define classes of behaviors, e.g. you can be assured that a component directory will never initiate a network call, because that would be conflating visual display with data access)
- Maintainable applications flow out of simple organization and simple code. As the old saying goes, a cluttered desk is representative of a cluttered mind. Rigorous discipline on simplicity will be represented in our output (the product). By being strict about working simply, we will naturally produce applications where our users can more easily reason about their behavior.
Done
GitLab has an existing definition of done which is geared primarily toward identifying when an MR is ready to be merged.
In addition to the items in the GitLab definition of done, work on RRD should also adhere to the following requirements:
- Meets or exceeds all metrics
- Meets or exceeds our minimum accessibility metrics (these are explicitly not part of our defined priorities, because they are non-negotiable)
- All work is fully documented for engineers (user documentation is a requirement of the standard definition of done)
Acceptance Criteria
To measure our success, we need to set meaningful metrics. These metrics should meaningfully and positively impact the end user.
- Meets or exceeds WCAG 2.2 AA.
- Meets or exceeds ATAG 2.0 AA.
- The RRD app loads less than or equal to 300 KiB of JavaScript (compressed / “across-the-wire”) [1].
- The RRD app loads less than or equal to 150 KiB of markup, images, styles, fonts, etc. (compressed / “across-the-wire”) [1].
- The Time to First Diff (mr-diffs-mark-first-diff-file-shown) happens before the 3 seconds mark.
- The RRD app can execute in total isolation from the rest of the GitLab product:
- “Execute” means the app can load, display data, and allows user interaction (“read-only”).
- If a part of the application is only used in merge requests or diffs, it is considered part of the Diffs application.
- If a part of the application must be brought in from the rest of the product, it is not considered part of the Diffs load (as defined in metrics 3 and 4).
- If a part of the application must be brought in from the rest of the product, it may not block functionality of the Diffs application.
- If a part of the application must be brought in from the rest of the product, it must be loaded asynchronously.
- If a part of the application meets 5.1-5.5 (such as: the Markdown editor is loaded asynchronously when the user would like to leave a comment on a diff) and its inclusion causes a budget overflow:
- It must be added to a list of documented exceptions that we accept are out of bounds and out of our control.
- The exceptions list should be addressed on a regular basis to determine the ongoing value of overflowing our budget.
[1]: The Performance Inequality Gap, 2023
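The numeric criteria above could be checked automatically, for example in a CI job. The sketch below only encodes the three budgets from this section. The `Measurement` shape and the `findBudgetViolations` helper are assumptions made for illustration, and how the values are collected is deliberately left out.

```typescript
const KIB = 1024;

// Budgets taken from the acceptance criteria in this section.
const BUDGETS = {
  javascriptBytes: 300 * KIB,
  otherAssetsBytes: 150 * KIB,
  timeToFirstDiffMs: 3000,
};

// Hypothetical measurement record; all sizes are compressed, across-the-wire.
interface Measurement {
  javascriptBytes: number;
  otherAssetsBytes: number; // markup, images, styles, fonts, etc.
  timeToFirstDiffMs: number; // time of the mr-diffs-mark-first-diff-file-shown mark
}

// Returns the names of the budgets that the measurement exceeds.
function findBudgetViolations(m: Measurement): string[] {
  const violations: string[] = [];
  // Size budgets are "less than or equal to".
  if (m.javascriptBytes > BUDGETS.javascriptBytes) violations.push('javascript');
  if (m.otherAssetsBytes > BUDGETS.otherAssetsBytes) violations.push('otherAssets');
  // Time to First Diff must happen strictly before the 3 seconds mark.
  if (m.timeToFirstDiffMs >= BUDGETS.timeToFirstDiffMs) violations.push('timeToFirstDiff');
  return violations;
}
```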
Frontend
Ideally, we would meet our definition of done and our accountability metrics on our first try.
We also need to continue to stay within those boundaries as we move forward. To ensure this,
we need to design an application architecture that:
- Is:
- Scalable.
- Malleable.
- Flexible.
- Considers itself a mission-critical part of the overall GitLab product.
- Treats itself as a complex, unique application with concerns that cannot be addressed
as side effects of other parts of the product.
- Can handle data access/format changes without making UI changes.
- Can handle UI changes without making data access/format changes.
- Provides a hookable, inspectable API and avoids code coupling.
- Separates:
- State and application data.
- Application behavior and UI.
- Data access and network access.
Design and implementation details
Overview
Reusable Rapid Diffs introduce a change in responsibilities for both frontend and backend.
The backend will:
- Prepare diffs data.
- Highlight diff lines.
- Render diffs as HTML and stream them to the browser.
- Embed diffs metadata into the final response.
The frontend will:
- Enhance existing and future diffs HTML.
- Handle streamed diffs HTML.
- Enhance diffs HTML with dynamic controls to enable user interaction.
Static and dynamic separation
To achieve the separation of concerns, we should distinguish between static and dynamic UI on the page:
- Everything that is static should always be rendered on the server.
- Everything dynamic should be enhanced on the client.
Data that should be coming with the page:
- Static diff file metadata: viewer type, added and removed lines, etc.
- Edit permissions
Data that should be served through additional requests:
- Discussions
- File browser tree
- Line expansion HTML
- Full file HTML
- Code quality
- Code coverage
- Everything else
We should return HTML for line expansion and view full file features.
Other requests should return normalized data in JSON format.
The code suggestion feature should use the existing HTML of the diff, similar to the current implementation.
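The split between HTML and JSON responses described above can be written down as a small lookup. This is only a sketch: the `DiffRequestKind` names are hypothetical labels for the additional requests listed in this section, not actual endpoint names.

```typescript
// Hypothetical labels for the additional requests listed above.
type DiffRequestKind =
  | 'discussions'
  | 'fileBrowserTree'
  | 'lineExpansion'
  | 'fullFile'
  | 'codeQuality'
  | 'codeCoverage';

type ResponseFormat = 'html' | 'json';

// Only line expansion and "view full file" return server-rendered HTML.
const HTML_REQUESTS = new Set<DiffRequestKind>(['lineExpansion', 'fullFile']);

// Everything else returns normalized JSON.
function expectedFormat(kind: DiffRequestKind): ResponseFormat {
  return HTML_REQUESTS.has(kind) ? 'html' : 'json';
}
```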
To improve the perceived performance of the page we should implement the following techniques:
- Limit the number of diffs rendered on the page at first.
- Use HTML streaming to render the rest of the diffs.
- Use Web Components to hook into diff files appearing on the page.
- Apply content-visibility whenever possible to reduce redraw overhead.
- Render diff discussions asynchronously.
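The first two techniques amount to splitting the list of diff files into a small batch rendered with the initial page and a remainder delivered through HTML streaming. A minimal sketch of that split follows. The `splitForStreaming` helper and the idea of a fixed `initialCount` are assumptions; the document does not specify how the limit is chosen.

```typescript
// Splits diff files into the batch rendered with the page and the batch
// streamed afterwards. `initialCount` is a hypothetical tuning parameter.
function splitForStreaming<T>(
  files: readonly T[],
  initialCount: number,
): { initial: T[]; streamed: T[] } {
  // Clamp so that negative or oversized limits behave predictably.
  const count = Math.max(0, Math.min(initialCount, files.length));
  return { initial: files.slice(0, count), streamed: files.slice(count) };
}
```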
Page & Data Flows
These diagrams document the flows necessary to display diffs and to allow user interactions and user-submitted data to be gathered and stored.
In other words: this page documents the bi-directional data flow for a complete, interactive application that allows diffs to display and users to collaborate on diffs.
Critical Phases
- Gitaly
- Database
- Diff Storage
- Cache
- Back end
- Web API
- Front end*
flowchart LR
Gitaly
DB[Database]
Cache
DS[Diff Storage]
FE[Front End]
Display
Gitaly <--> BE
DB <--> BE
Cache <--> BE
DS <--> BE
BE <--> API
API <--> FE
FE --> Display
subgraph Rails
direction LR
BE[Back End]
API[Web API]
end
*: Front end obscures many unexplored phases. It is likely that the front end will need caches, databases, API abstractions (over sub-modules like network connectivity, etc.), and more. While these have not been expanded on, “Front end” stands in for all of that complexity here.
Gitaly
For fetching Diffs, Gitaly provides two basic utilities:
- Retrieve a list of modified files with associated pre- and post-image blob IDs for a set of revisions.
- Retrieve a set of Git diffs for an arbitrary set of specified files using pre- and post-image blob IDs.
sequenceDiagram
Back end ->> Gitaly: "What files were modified between<br />this pair of/in this single revision?"
Gitaly ->> Back end: List of paths
Back end ->> Gitaly: "What are the diffs for this set of paths<br /> between this pair of/in this single revision?"
Gitaly ->> Back end: List of diffs
Database
sequenceDiagram
Back end ->> Database: What are the file paths for a known MR version?
Database ->> Back end: List of paths
Cache
sequenceDiagram
Back end ->> Cache: Give me the diff template for scenario XYZ
Cache ->> Back end: Static template to render diff in scenario XYZ
- Repeated render of a diff
sequenceDiagram
Back end ->> Cache: Give me the compiled UI for diff ABC123
alt Cache miss
Cache ->> Back end: ☹️
Back end ->> Cache: Cache the compiled UI for diff ABC123
else
Cache ->> Back end: Existing compiled diff UI
end
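The repeated-render flow above is a read-through cache: on a miss the compiled UI is rendered once and stored, and on a hit the stored result is returned. The sketch below illustrates the pattern in TypeScript for consistency with the other examples in this document, even though the real backend is Rails. `CompiledDiffCache` and its in-memory `Map` are stand-ins, not the actual cache store.

```typescript
// Hypothetical in-memory stand-in for the backend cache.
class CompiledDiffCache {
  private readonly store = new Map<string, string>();

  // Returns the cached compiled UI, rendering and caching it on a miss.
  getOrRender(diffId: string, render: () => string): string {
    const cached = this.store.get(diffId);
    if (cached !== undefined) return cached; // cache hit
    const html = render(); // cache miss: compile the UI once
    this.store.set(diffId, html);
    return html;
  }
}
```

With this shape, the second request for diff ABC123 never calls the render function again.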
Diff Storage
sequenceDiagram
Back end ->> Diff Storage: Give me the raw diff of this file
Diff Storage ->> Back end: Raw diff
Backend
- First files rendered on page load
sequenceDiagram
participant Client
participant Back end
participant Authorization
participant HAML
participant Cache
participant Database
participant Diff storage
participant Gitaly
Client ->> Back end: Page load request
Back end ->> Authorization: Check is good request
alt Unauthorized
Authorization ->> Back end: No!
Back end ->> Client: 403 or 404
else
Authorization ->> Back end: Authorized.
alt MR Diff
Back end ->> Database: Get N files
Database ->> Back end: Files
Back end ->> Diff storage: Get diffs of N files
Diff storage ->> Back end: Diffs
else
Back end ->> Gitaly: Get diffs of N files
Gitaly ->> Back end: Diffs
end
loop Iterate through each diff file
Back end ->> HAML: Render diff file
HAML ->> Cache: Give me the cached rendered UI per file
alt Cache miss
Cache ->> HAML: Nada!
HAML ->> Cache: Cache rendered UI per file
Cache ->> HAML: Cached, rendered UI per file
else
Cache ->> HAML: Cached, rendered UI per file
end
HAML ->> Back end: Rendered UI
end
Back end ->> Client: Respond with application layout with rendered UI
end
- Future files rendered and streamed to the front end
sequenceDiagram
participant Client
participant Back end
participant Authorization
participant HAML
participant Cache
participant Database
participant Diff storage
participant Gitaly
Client ->> Back end: Stream request
Back end ->> Authorization: Check is good request
alt All the possible unhappy paths
Authorization ->> Back end: No!
Back end ->> Client: 403
else
Authorization ->> Back end: Authorized.
alt MR Diff
Back end ->> Database: Get files
Database ->> Back end: Files
Back end ->> Diff storage: Get diffs
Diff storage ->> Back end: Diffs
else
Back end ->> Gitaly: Get diffs
Gitaly ->> Back end: Diffs
end
loop Iterate through each diff file
Back end ->> HAML: Render diff file
HAML ->> Cache: Give me the cached rendered UI per file
alt Cache miss
Cache ->> HAML: Nada!
HAML ->> Cache: Cache rendered UI per file
Cache ->> HAML: Cached, rendered UI per file
else
Cache ->> HAML: Cached, rendered UI per file
end
HAML ->> Back end: Rendered UI
end
Back end ->> Client: Stream rendered UI per file
end
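The streaming flow above can be summarized as: authorize, pick the diff source, then render and emit one file at a time. The generator below is a simplified sketch of that control flow, written in TypeScript for illustration only. `DiffSources`, `streamRenderedDiffs`, and the collapsed database plus diff storage lookup are all assumptions, and per-file caching is omitted.

```typescript
interface DiffFile {
  path: string;
  diff: string;
}

// Hypothetical abstraction over the participants in the diagram.
interface DiffSources {
  isAuthorized(): boolean;
  isMergeRequestDiff: boolean;
  fromDiffStorage(): DiffFile[]; // MR diffs: database lookup plus diff storage
  fromGitaly(): DiffFile[]; // all other diffs
}

// Yields rendered HTML per file so each chunk can be sent as soon as it is ready.
function* streamRenderedDiffs(
  sources: DiffSources,
  renderFile: (file: DiffFile) => string,
): Generator<string> {
  if (!sources.isAuthorized()) throw new Error('403');
  const files = sources.isMergeRequestDiff
    ? sources.fromDiffStorage()
    : sources.fromGitaly();
  for (const file of files) {
    yield renderFile(file);
  }
}
```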
Web API
The Web API provides both internal and public access to the back end implementation for diffs.
Eventually, this diagram should expand (and possibly split) to show each endpoint that our application or a user could interface with, and what each of those endpoints expects and returns.
Note that this is separate from the Back End diagrams, which elaborate on business logic and implementation details.
The API endpoints are consumer-facing and so have different requirements and structures.
sequenceDiagram
actor Web User
participant Endpoints
participant Back end
Web User ->> Endpoints: Give me the diff for [x] file
Endpoints ->> Back end: User [u] is requesting [x] diff
Back end ->> Endpoints: Here is the resolved, rendered UI for that diff
Endpoints ->> Web User: "Do with this diff whatever you'd like to"
A complete, single render
sequenceDiagram
actor User
participant UI
participant UX as Interaction handlers
participant FeApp as Front end behaviors
participant FeData as Data abstraction
participant FeNet as Network connectivity
participant API as Web API
participant BE as Back end
participant xxx
participant Cache
participant Database
participant Gitaly
User -->> BE: (MR page load)
BE ->> xxx: ???
xxx ->> Cache: ???
Cache ->> xxx: ???
xxx ->> Database: ???
Database ->> xxx: ???
xxx ->> Gitaly: ???
Gitaly ->> xxx: ???
xxx ->> BE: Rendered HTML
BE ->> User: A rendered diffs page for the MR
Accessibility
Reusable Rapid Diffs should be displayed in a way that is compliant with Web Content Accessibility Guidelines 2.1 level AA for web-based content and Authoring Tool Accessibility Guidelines 2.0 level AA for the user interface.
We recognize that in order to have an accessible experience using diffs in the context of GitLab, we need to ensure compliance both for displaying and for interacting with diffs. That’s why the accessibility
audit and further recommendations will also consider the Content Editor used for reviewing changes.
ATAG 2.0 AA
Given the nature of diffs, the following guidelines will be our main focus:
- Guideline A.2.1: (For the authoring tool user interface) Make alternative content available to authors
- Guideline A.3.1: (For the authoring tool user interface) Provide keyboard access to authoring features
- Guideline A.3.4: (For the authoring tool user interface) Enhance navigation and editing via content structure
- Guideline A.3.6: (For the authoring tool user interface) Manage preference settings
HTML structure
The HTML structure of a diff should support assistive technology.
For this reason, a table could be a preferred solution, as it allows us to indicate
the logical relationships between the presented data and is easier to navigate for
screen reader users with a keyboard. Labeled columns will make sure that information
such as line numbers can be associated with the edited piece of code.
A possible structure could include:
<table>
<caption class="gl-sr-only">Changes for file index.js. 10 lines changed: 5 deleted, 5 added.</caption>
<tr hidden>
<th>Original line number: </th>
<th>Diff line number: </th>
<th>Line change:</th>
</tr>
<tr>
<td>1234</td>
<td></td>
<td>.tree-time-ago ,</td>
</tr>
[…]
</table>
See the WAI tutorial on tables for more implementation guidelines.
Each file table should include a short summary of changes that will read out:
- total number of lines changed,
- number of added lines,
- number of removed lines.
The summary of the table content can be placed either within the <caption> element, or before the table within an element referenced by aria-describedby.
See WAI (Web Accessibility Initiative) for more information on both approaches.
However, if such a structure will compromise other functional aspects of displaying a diff,
more generic elements together with ARIA support can be used.
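The per-file summary described in this section can be generated from the file stats. The sketch below reproduces the wording of the example caption shown earlier. The `FileStats` shape and the `diffCaption` helper are hypothetical, and in the proposed architecture this text would be rendered on the server rather than in TypeScript.

```typescript
// Hypothetical stats for a single diff file.
interface FileStats {
  fileName: string;
  added: number;
  removed: number;
}

// Builds the screen reader summary: total, removed, and added line counts.
function diffCaption({ fileName, added, removed }: FileStats): string {
  const total = added + removed;
  const noun = total === 1 ? 'line' : 'lines';
  return `Changes for file ${fileName}. ${total} ${noun} changed: ${removed} deleted, ${added} added.`;
}
```

For `index.js` with 5 added and 5 removed lines, this returns the same text as the caption in the example table.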
Visual indicators
It is important that each visual indicator has screen reader text
denoting the meaning of that indicator. When needed, use the gl-sr-only class
(in conjunction with focus:gl-not-sr-only if needed)
to make the text available to screen readers without displaying it visually.
Some of the visual indicators that require alternatives for assistive technology are:
- The + sign or green highlighting, to be read as added
- The - sign or red highlighting, to be read as removed
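One way to pair a visual indicator with its screen reader text is to emit both in the same cell. The sketch below is an assumption about how that markup might look: it uses the gl-sr-only class mentioned above, while the `indicatorMarkup` helper and the use of `aria-hidden` on the visual symbol are illustrative choices rather than an agreed pattern.

```typescript
type LineChange = 'added' | 'removed';

// Visual symbols for each kind of change.
const INDICATORS: Record<LineChange, string> = { added: '+', removed: '-' };

// Emits the visual symbol (hidden from assistive technology) next to
// visually hidden text that states its meaning.
function indicatorMarkup(change: LineChange): string {
  return (
    `<span aria-hidden="true">${INDICATORS[change]}</span>` +
    `<span class="gl-sr-only">${change}</span>`
  );
}
```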
High-level implementation
Alternative Solutions
Historical context
Reusable Rapid Diffs introduce a paradigm shift in our approach to rendering diffs. Before this proposed architecture, we had two different approaches to rendering diffs:
- Merge requests heavily utilized client-side rendering.
- All other pages mainly used server-side rendering with additional behavior implemented in JavaScript.
In merge requests, most of the rendering work was done on the client:
- The backend would only generate a JSON response with diffs data.
- The client would be responsible for both drawing the diffs and reacting to user input.
This led to us adopting a virtualized scrolling solution for client-side rendering, which sped up drawing large diff file lists significantly.
Unfortunately, this came with the downsides of a very high maintenance cost and constant bugs.
The user experience also suffered because we couldn’t show diffs right away
when you visited a page, and had to wait for the JSON response first.
Lastly, this approach went completely parallel to the server-rendered diffs used on other pages,
which resulted in two completely separate codebases for the diffs.
Summary of the alternative solutions attempted
Here is a list of the strategies we have adopted or simply tested in the past:
- Full Server Side Rendering (adopted and replaced by Vue app): before the Vue refactor of the Merge Request Changes tab, diffs were fully rendered on the server. This resulted in long waits before the page started to render.
- Frontend templates (Vue) Server Side Rendered (tested): results and impact weren’t compelling and pointed in the direction of partial SSR. (PoC MR)
- Batch diffing (adopted): Break up the diffs into async paginated requests, increasing in size (slow start). Bootstrapping time unsatisfactory, perceived performance still involved a long time of a page without content.
- Virtual Scrolling (adopted): several known side-effects like inability to fully use native search functionality, interferences and weird behavior while scrolling to elements, overall strain on the browser to keep reflowing and painting. (Comparison with the proposed approach in this blueprint)
- Repository Commits details paginated if too large (adopted): As an interim solution, really large commit diffs in the repository are now paginated with negative impact in UX, hiding away files and changes through multiple pages.
- Micro Code Review Frontend PoC (tested): This approach was significantly different from the application design used in the past, so it was never seriously explored as a way forward. Parts of this design - like custom elements and a reliance on events - have been incorporated into alternative approaches. (Micro Code Review Frontend PoC)
- Streaming Diffs using a Node.js server (tested): Combines streaming with a dedicated Node.js server. Precursor to the proposed SSR approach in this blueprint. (PoC: Streaming diffs app)
Proposed changes
These changes (indicated by an arbitrary name like “Design A”) suggest a proposed final path forward for this blueprint, but have not yet been accepted as the authoritative content.
- Mark the highest hierarchical heading with your design name. If you are changing multiple headings at the same level, make sure to mark them all with the same name. This will create a high-level table of contents that is easier to reason about.
Front end (Design A)
High-level implementation
NOTE:
This draft proposal suggests one potential front end architecture which may not be chosen. It is not necessarily mutually exclusive with other proposed designs.
(See New Diffs: Technical Architecture Design for nicer visuals of this chart)
flowchart TB
classDef sticky fill:#d0cabf, color:black
stickyMetricsA>"Metrics 3, 4, & 5 apply to<br>the entire front end application"]
stickyMetricsA -.- fe
fe
Socket((WebSocket))
be
subgraph fe [Front End]
stickyMetricsB>"Metrics 1 & 2 apply<br>to all UI elements"]
stickyInbound>"All data is formatted precisely<br>how the UI needs to interact with it"]
stickyOutbound>"All data is formatted precisely<br>how the back end expects it"]
stickyIdb>"Long-term.
e.g. diffs, MRs, emoji, notes, drafts, user-only data<br>like file reviews, collapse states, etc."]
stickySession>"Session-term.
e.g. selected tab, scroll position,<br>temporary changes to user settings, etc."]
Events([Event Hub])
UI[UI]
uiState((Local State))
Logic[Application Logic]
Normalizer[Data Normalizer]
Inbound{{Inbound Contract}}
Outbound{{Outbound Contract}}
Data[Data Access]
idb((indexedDB))
session((sessionStorage))
Network[Network Access]
end
subgraph be [Back End]
stickyApi>"A large list of defined actions a<br>Diffs/Merge Request UI could perform.
e.g.: <code>mergeRequest:notes:saveDraft</code> or<br><code>mergeRequest:changeStatus</code> (with <br><code>status: 'draft'</code> or <code>status: 'ready'</code>, etc.).
Must not expose any implementation detail,<br>like models, storage structure, etc."]
API[Activities API]
unk[\"?"/]
API -.- stickyApi
end
%% Make stickies look like paper sort of?
class stickyMetricsA,stickyMetricsB,stickyInbound,stickyOutbound,stickyIdb,stickySession,stickyApi sticky
UI <--> uiState
stickyMetricsB -.- UI
Network ~~~ stickyMetricsB
Logic <--> Normalizer
Normalizer --> Outbound
Outbound --> Data
Inbound --> Normalizer
Data --> Inbound
Inbound -.- stickyInbound
Outbound -.- stickyOutbound
Data <--> idb
Data <--> session
idb -.- stickyIdb
session -.- stickySession
Events <--> UI
Events <--> Logic
Events <--> Data
Events <--> Network
Network --> Socket --> API --> unk
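In the chart above, the UI, application logic, data access, and network access only talk to each other through the Event Hub, and inbound data passes through a contract before it reaches the UI. The following is a minimal sketch of those two ideas under stated assumptions: `EventHub`, the `notes:received` event name, and the `RawNote` and `UiNote` shapes are all hypothetical and are not part of the proposed design.

```typescript
type Handler<T> = (payload: T) => void;

// Minimal typed event hub: modules publish and subscribe by event name
// instead of importing each other.
class EventHub<Events> {
  private readonly handlers = new Map<keyof Events, Set<Handler<any>>>();

  // Subscribes a handler and returns a function that unsubscribes it.
  on<K extends keyof Events>(name: K, handler: Handler<Events[K]>): () => void {
    const set = this.handlers.get(name) ?? new Set<Handler<any>>();
    set.add(handler);
    this.handlers.set(name, set);
    return () => {
      set.delete(handler);
    };
  }

  emit<K extends keyof Events>(name: K, payload: Events[K]): void {
    this.handlers.get(name)?.forEach((handler) => handler(payload));
  }
}

// Hypothetical inbound contract: raw data is normalized into the exact
// shape the UI consumes before it is published.
interface RawNote {
  id: number;
  note_body: string;
}
interface UiNote {
  id: number;
  body: string;
}
const normalizeNote = (raw: RawNote): UiNote => ({ id: raw.id, body: raw.note_body });

// Hypothetical event map for the diffs application.
type DiffEvents = { 'notes:received': UiNote };
```

Because subscribers only depend on event names and payload shapes, the data access layer could change its storage or transport without any UI change, which is the decoupling the chart aims for.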
This is an appendix to the Reusable Rapid Diffs document.
Below is a complete list of features for merge request and commit diffs grouped by diff viewers (Code, Image, Other).
✓ – available in both MR and Commit views.
| Features | Code | Image | Other |
| --- | --- | --- | --- |
| Filename | ✓ | ✓ | ✓ |
| Copy file path | ✓ | ✓ | ✓ |
| Collapse and expand file | ✓ | ✓ | ✓ |
| File stats | ✓ | ✓ | ✓ |
| Lines changed (0 for blobs) | ✓ | ✓ | ✓ |
| Permissions changed | ✓ | ✓ | ✓ |
| CRUD comment on file | ✓ | ✓ | ✓ |
| View file link | ✓ | ✓ | ✓ |
| Mark as viewed | MR | MR | MR |
| Hide all comments | MR | MR | MR |
| Show full file (expand all lines) | MR | | |
| Open in Web IDE link | MR | | |
| Line link | ✓ | | |
| Edit file link | ✓ | | |
| Code highlight (multiple themes) | ✓ | | |
| Expand lines | ✓ | | |
| CRUD comment on specific line | Commit | | |
| CRUD comment on line range | MR | | |
| Draft comment on line range | MR | | |
| Code quality highlights | ✓ | | |
| Test coverage highlights | ✓ | | |
| Hide whitespace changes | ✓ | | |
| Auto-collapse large file | ✓ | | |
| View as raw | Commit | | |
| Side by side view | | ✓ | |