Notes on the Critical Rendering Path (CRP)

Published: Sun Mar 13 2022

The critical rendering path (CRP) refers to the path that a browser takes to render the resources it receives (HTML, CSS, JavaScript) to pixels on the screen.

Why is it important?

As the name suggests, CRP is about how content is rendered, i.e. displayed on a page, which directly affects usability. The more optimised the CRP is, the more performant a page is and the less chance there is for janky experiences and other rendering issues.

What goes on in the render path?

To go from HTML to screen pixels, the browser needs to do the following:

  1. Construct DOM
  2. Construct CSSOM
  3. Run JavaScript
  4. Construct Render Tree
  5. Layout
  6. Paint

Constructing the DOM

Once the browser has received HTML markup over the network in encoded byte format, it will begin parsing it.

The DOM defines the properties and relationships of the markup that the browser received. To construct the DOM, the browser performs:

  • Conversion: encoded bytes (010101...) → characters (e.g. <html>)
  • Tokenizing: characters → tokens (e.g. StartTag: html)
  • Lexing: tokens → nodes (e.g. html → head, html → body...)
  • DOM construction: tree formed from node objects (e.g. HTMLHtmlElement → HTMLHeadElement, HTMLHtmlElement → HTMLBodyElement...)

So, the browser parses the HTML top-down and constructs a DOM tree. It identifies the relationships between tags that are defined in the markup and builds the tree from the root element (html) by appending parent nodes along with their children (e.g. html → body) with all their attributes until the tree is complete.
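
The steps above can be sketched in a few lines of JavaScript. This is a toy illustration under heavy assumptions, not how a real browser parser works: the function names (decode, tokenize, buildTree) and the node shape are made up for this example, and it only handles simple, well-formed tags without attributes.

```javascript
// Toy sketch only: real HTML parsing follows a far more involved algorithm.

// Conversion: encoded bytes -> characters
function decode(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Tokenizing: characters -> tokens
function tokenize(chars) {
  const tokens = [];
  const pattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)>|([^<]+)/gi;
  for (const m of chars.matchAll(pattern)) {
    if (m[1]) tokens.push({ type: "EndTag", name: m[1] });
    else if (m[2]) tokens.push({ type: "StartTag", name: m[2] });
    else if (m[3].trim()) tokens.push({ type: "Character", data: m[3].trim() });
  }
  return tokens;
}

// Lexing + DOM construction: tokens -> nodes -> tree
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // currently open elements, innermost last
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "StartTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === "EndTag") {
      if (stack.length > 1) stack.pop(); // naive: assumes tags are balanced
    } else {
      parent.children.push({ name: "#text", data: token.data });
    }
  }
  return root;
}

const bytes = new TextEncoder().encode(
  "<html><head></head><body><p>hi</p></body></html>"
);
const tree = buildTree(tokenize(decode(bytes)));
console.log(JSON.stringify(tree, null, 2));
```

The stack of open elements is what lets each new node be appended to the right parent as tokens arrive, which is also why the tree can be built incrementally.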

The parser constructs the DOM incrementally. That is, the browser can display parts of the HTML markup without having parsed all of it. Here’s an example:

<!DOCTYPE html>
<html lang="en">
  <body>
    <!-- this is parsed and displayed -->
    <span>early content</span>
    <!--
      parser-blocking script:
      DOM construction halts while the script is downloaded and executed
    -->
    <script src="someScript.js"></script>
    <!-- DOM construction resumes -->
    <span>later content</span>
  </body>
</html>

This is so that content is displayed ASAP. The browser begins constructing the render tree from the moment parsing begins and displays these parts of the content as the process continues.

Other notes on HTML parsing:

  • HTML is forgiving (invalid markup does not fail parsing) as opposed to XML — does implicit insertion of missing markup, e.g. closing open tags
  • HTML cannot be parsed easily by conventional (e.g. XML) parsers since its grammar is not context free
  • HTML parsing can be halted (due to latencies, links, styles, etc.)
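  • Scripts halt parsing unless deferred with ‘defer’ or fetched asynchronously with ‘async’ (an async script still interrupts the parser briefly when it executes)
  • CSS can also halt parsing indirectly: a script that follows a stylesheet cannot run until the stylesheet has loaded, since the script may rely on style information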
  • DOMContentLoaded is fired when main parsing has fully completed
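
The ordering described in these notes can be modelled with a small simulation. It is a simplified sketch: the item shapes and the simulateParse name are invented for illustration, and async scripts are left out because their execution order depends on download timing.

```javascript
// Simplified model of parser ordering; not a real browser API.
function simulateParse(items) {
  const log = [];
  const deferred = [];
  for (const item of items) {
    if (item.type === "markup") {
      log.push(`parse ${item.label}`);
    } else if (item.defer) {
      // Deferred scripts download in parallel and wait for parsing to finish.
      deferred.push(item);
    } else {
      // A classic script without defer/async halts the parser until it has run.
      log.push(`halt parsing, run ${item.src}`);
    }
  }
  // Deferred scripts run in document order, before DOMContentLoaded fires.
  for (const script of deferred) log.push(`run deferred ${script.src}`);
  log.push("DOMContentLoaded");
  return log;
}

const order = simulateParse([
  { type: "markup", label: "early content" },
  { type: "script", src: "someScript.js" },
  { type: "script", src: "deferred.js", defer: true },
  { type: "markup", label: "later content" },
]);
console.log(order.join("\n"));
```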

Speculative parsing

Some browsers are capable of speculative parsing, i.e. while parsing is halted by e.g. script execution, the browser spawns another thread to look ahead in the markup and load external assets while main parsing is halted. This secondary parser goes by different names:

  • Chrome and Safari have a ‘preload scanner’
  • Firefox has a ‘speculative parser’

Only external assets can be preloaded and they include (at least):

  • scripts
  • external CSS
  • images

The browser performs speculative parsing automatically.
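
To make the idea concrete, here is a rough sketch of what such a look-ahead scan might collect. It is only an approximation built on regular expressions: real preload scanners work on the tokenizer's output, and the scanForAssets name and the result shape are assumptions made for this example.

```javascript
// Toy look-ahead scan: collects URLs of external scripts, stylesheets and images.
const KINDS = { script: "script", link: "style", img: "image" };

function scanForAssets(markup) {
  const assets = [];
  const tagPattern = /<(script|img|link)\b([^>]*)>/gi;
  for (const [, tag, attrs] of markup.matchAll(tagPattern)) {
    const name = tag.toLowerCase();
    const url = /\b(?:src|href)\s*=\s*"([^"]*)"/i.exec(attrs);
    if (!url) continue; // inline script or missing URL: nothing to fetch
    // Only stylesheet links are treated as CSS in this sketch.
    if (name === "link" && !/\brel\s*=\s*"stylesheet"/i.test(attrs)) continue;
    assets.push({ kind: KINDS[name], url: url[1] });
  }
  return assets;
}

const found = scanForAssets(
  '<link rel="stylesheet" href="main.css">' +
  '<script src="app.js"></script>' +
  '<script>inline()</script>' +
  '<img src="hero.png">'
);
console.log(found);
```

The inline script is skipped because there is nothing to download for it, which matches the note that only external assets can be preloaded.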

Preloading

Speculative parsers do not preload everything. Assets that are critical for user experience can be loaded ASAP using <link rel="preload"> that is placed in the head section of the markup. Preloading is supported by most browsers.

MDN provides a list of content types that can be preloaded. They include:

  • fonts
  • scripts
  • styles
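
A preload hint is normally written directly in the markup, but the same hint can also be added from a script. The sketch below assumes a hypothetical preload helper and a made-up font URL; rel, href, as and crossOrigin are standard properties of link elements. The document is passed in as a parameter so the helper can be tried outside a browser.

```javascript
// Hypothetical helper: appends <link rel="preload"> to the head of `doc`.
function preload(doc, href, as) {
  const link = doc.createElement("link");
  link.rel = "preload";
  link.href = href;
  link.as = as; // tells the browser what kind of asset it is fetching
  if (as === "font") {
    // Fonts are fetched in CORS mode, so the hint needs a matching crossorigin value.
    link.crossOrigin = "anonymous";
  }
  doc.head.appendChild(link);
  return link;
}

// In a browser (URL is an example only):
// preload(document, "/fonts/example.woff2", "font");
```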

Constructing the CSSOM

The browser coalesces CSS style declarations from all provided sources (e.g. external or inline) and uses those to construct the CSS object model (CSSOM). To construct the CSSOM, the browser’s CSS parser performs conversion, tokenizing, lexing just as with the DOM, but produces a CSSOM tree in the end. The nodes in the tree specify the element and its styles.

Notes:

  • CSS, like HTML, is render blocking. If it weren’t, usability issues would arise due to missing layout and other style effects until the CSSOM is constructed, e.g. ‘Flash of Unstyled Content’ (FOUC) incidents
  • Since CSS is render blocking it should be delivered first for better UX, e.g. fonts should be preloaded
  • CSS, unlike HTML, has a context-free grammar, so it can be parsed by conventional parsers
  • CSSOM needs to have a tree structure since the rules cascade down: the browser starts with a general rule (e.g. a font-size declaration on the body) and recursively refines the styles with more specific rules as it traverses the tree (e.g. a font-size declaration on a paragraph)
  • If the page provides no CSS, the browser falls back to the user agent stylesheet: the default styles that browsers ship with, which largely follow the defaults suggested by the W3C and WHATWG specifications. Any styles the page does specify override these defaults in the final constructed CSSOM
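
The "cascade down" idea can be illustrated with a short recursive function. This is a deliberately simplified sketch: it treats every property as inherited (real CSS only inherits some, such as font-size and color), it ignores selectors and specificity, and the names computeStyles and userAgentDefaults are invented for the example.

```javascript
// Toy cascade: defaults flow down the tree and are refined by each node's own rules.
const userAgentDefaults = { "font-size": "16px", color: "black" };

function computeStyles(node, inherited = userAgentDefaults) {
  // More specific rules on the node override what flows down from its ancestors.
  const style = { ...inherited, ...node.rules };
  const children = (node.children || []).map((child) => computeStyles(child, style));
  return { name: node.name, style, children };
}

const styled = computeStyles({
  name: "body",
  rules: { "font-size": "18px" },
  children: [
    { name: "p", rules: { "font-size": "14px" } },
    { name: "span", rules: { color: "red" } },
  ],
});
console.log(JSON.stringify(styled, null, 2));
```

Here the paragraph overrides the body's font size, while the span keeps the body's font size and only changes its colour.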

Constructing the Render Tree

The render tree is formed by combining the DOM and CSSOM. The browser looks at every DOM node starting from the root and determines which CSS rules should be applied to it. The resulting tree contains rectangles (corresponding to the node’s CSS box) with all the necessary visual attributes, which represent what will eventually be displayed on the page.

Notes:

  • The render tree construction already begins during DOM tree construction
  • Render tree elements are also known as frames, render objects or renderers, depending on the browser engine
  • Render objects do not correspond with DOM elements 1-1: non-visual DOM elements (e.g. head tags) are not in the render tree
  • Elements with display: none are also excluded from the tree, whereas elements with visibility: hidden stay in it because they still occupy space
  • Render objects contain information including the computed style of the DOM node and layer information (z-index layer)
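
A minimal sketch of how a render tree could be derived from styled DOM nodes is shown below. The node shape ({ name, style, children }), the NON_VISUAL list and the buildRenderTree name are assumptions made for this example; real engines do considerably more work per node.

```javascript
// Toy model: filters a styled DOM-like tree down to the nodes that get rendered.
const NON_VISUAL = new Set(["head", "meta", "title", "link", "script", "style"]);

function buildRenderTree(node) {
  if (NON_VISUAL.has(node.name)) return null; // never rendered
  const style = node.style || {};
  if (style.display === "none") return null; // drops the node and its whole subtree
  // visibility: hidden is kept, since the box still takes up space in layout.
  const children = (node.children || []).map(buildRenderTree).filter(Boolean);
  return { name: node.name, style, children };
}

const renderTree = buildRenderTree({
  name: "html",
  children: [
    { name: "head", children: [{ name: "title" }] },
    { name: "body", children: [
      { name: "p" },
      { name: "div", style: { display: "none" }, children: [{ name: "span" }] },
      { name: "em", style: { visibility: "hidden" } },
    ] },
  ],
});
console.log(JSON.stringify(renderTree, null, 2));
```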

Generating Layout

With the render tree in hand, the browser can calculate the geometrical layout information of each node (i.e. its size and position). Each render-tree node is given coordinates for where on the screen it should be painted.

Notes:

  • layout refers to the first time node coordinates are determined
  • reflows are subsequent layout recalculations
  • layout/reflow is recursive — render tree is traversed starting with the body and calculations made
  • a node’s children are laid out first and then the node itself, since the node’s position and size depend on its children
  • layout can be full or incremental — render objects that need changing are marked ‘dirty’ along with their children
  • the browser will asynchronously batch small layout changes — the tree is then traversed and any dirty trees (node and its descendants) will be re-laid out in bulk
  • some changes will trigger full layout — these include font-size changes, browser resize, accessing certain properties via JavaScript (e.g. offsetHeight)
  • dimensions for all nodes are calculated relative to the device viewport (visible part of the browser window)
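
The recursive nature of layout can be sketched with a toy block layout in which children stack vertically and fill the parent's width. Everything here is an assumption made for illustration: the layout function, the contentHeight field and the fixed 800px viewport width are not taken from any real engine.

```javascript
// Toy block layout: assigns x, y, width and height to every node in the tree.
function layout(node, x, y, width) {
  node.x = x;
  node.y = y;
  node.width = width;
  let cursorY = y;
  for (const child of node.children || []) {
    layout(child, x, cursorY, width); // children are laid out first
    cursorY += child.height;
  }
  // The node's height is its own content plus the stacked children.
  node.height = (node.contentHeight || 0) + (cursorY - y);
  return node;
}

const page = layout(
  {
    name: "body",
    children: [
      { name: "h1", contentHeight: 40 },
      { name: "p", contentHeight: 20 },
    ],
  },
  0, 0, 800 // origin and viewport width
);
console.log(page.height, page.children.map((child) => child.y));
```

Because the parent's height is only known once its children have been measured, a change to one child can ripple upwards, which is part of why reflows are costly.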

There’s a (very old) YouTube video showing (an extremely slowed down) reflow in action.

Layout thrashing

The browser batches reflows whenever it can because they are expensive. Reading a geometric property, e.g. an element’s offsetWidth or offsetHeight, forces the browser to flush any changes it has queued up for a batch reflow, since it must now synchronously calculate style and layout to give the caller an up-to-date value. Repeatedly forcing these recalculations in quick succession, typically by interleaving DOM writes and reads, is known as layout thrashing. For example:

function layoutThrasher(originEl, destEls) {
  for (let i = 0; i < destEls.length; i++) {
    // reading originEl.offsetWidth forces a synchronous layout,
    // and the style write invalidates it again,
    // so this happens on each iteration
    destEls[i].style.width = originEl.offsetWidth + 'px'
  }
}

Paul Irish provides a comprehensive list of all the things that force a layout or reflow.

Layout thrashing can be minimised by:

  1. avoiding loops that both force layout and change the DOM
  2. using the DevTools Performance Panel to investigate where it’s happening
  3. batching your writes & reads to the DOM (via FastDOM or a virtual DOM implementation).
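
As a sketch of the batching idea, the loop from the example above can be rewritten so that the layout-forcing read happens once, before any writes. The resizeAll name is made up for this illustration, and the function assumes it is handed real DOM elements (or objects with the same offsetWidth and style properties).

```javascript
// Read first, then write: one forced layout instead of one per iteration.
function resizeAll(originEl, destEls) {
  // Read phase: a single layout-forcing read, done before any writes.
  const width = originEl.offsetWidth;
  // Write phase: style writes only, so the browser can batch the reflow.
  for (const el of destEls) {
    el.style.width = `${width}px`;
  }
}
```

Libraries such as FastDOM apply the same separation of reads and writes across a whole frame rather than within a single function.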

Modern frameworks tend to take care of layout thrashing under the hood. React, for example, batches setState calls made in useEffect, so if multiple components make layout-triggering reads (e.g. element.offsetWidth), React re-renders only after all of those DOM reads have happened.

Painting/Rasterisation

Finally, the visible page content is converted into pixels to be displayed on the screen. The browser does this by traversing the render tree and painting each node in it with the use of its UI backend.

Notes:

  • painting, like layout, can also be full or incremental
  • the first occurrence of paint is known as first paint; related metrics such as first contentful paint and first meaningful paint mark when actual content becomes visible
  • the whole process, including painting, must take less than 16.67ms per frame (1000ms / 60, assuming a 60Hz display) to keep the UI smooth
  • to improve performance, content can be moved to GPU layers. This is done automatically for elements with certain CSS properties (e.g. opacity), and for video and canvas elements
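
The 16.67ms figure is simply one second divided by the display's refresh rate, assuming the common 60Hz case. A small sketch of the arithmetic, with hypothetical helper names:

```javascript
// Time available per frame for script, style, layout and paint combined.
function frameBudgetMs(refreshRateHz = 60) {
  return 1000 / refreshRateHz;
}

// True if a given amount of work fits into a single frame.
function fitsInFrame(workMs, refreshRateHz = 60) {
  return workMs <= frameBudgetMs(refreshRateHz);
}

console.log(frameBudgetMs().toFixed(2)); // 16.67
console.log(fitsInFrame(20)); // false: 20ms overruns a 60Hz frame
console.log(fitsInFrame(6, 120)); // true: a 120Hz frame allows about 8.33ms
```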

Further reading