Improving Largest Contentful Paint (LCP) using iterators

In the previous part I explained how we used an iterator to improve the Time to First Byte (TTFB) for our search result pages, which we generate using Server-Side Rendering (SSR). The way this iterator is used enables a type of lazy loading: it improves the Time to First Byte by executing the expensive call that loads the result products while rendering the page instead of before rendering it. Because the products are now loaded halfway through rendering the HTML, a new phenomenon appears: a small yet very noticeable pause as soon as the iterator starts loading the 40 products to show on the search result page.

This pause happens when the page looks somewhat like this:

Mockup of a website that shows a header on top and in the left column a number of filters and a large white area where the main content should appear

After the short pause, rendering continues and the products are shown on the result page:

Mockup of a website that shows a header on top and in the left column a number of filters and the main content area showing two rows with 4 products

Even though the pause is relatively short (somewhere between 100 and 400 ms), it hurts an important performance metric: the Largest Contentful Paint (LCP).

What is the ‘Largest Contentful Paint’?

According to web.dev, the Largest Contentful Paint “marks the point in the page load timeline when the page’s main content has likely loaded”, and a fast LCP helps reassure the user that the page is useful. In other words: the Largest Contentful Paint is the moment at which the page appears to have finished loading its main content. You want this moment to come as early as possible to avoid annoying the user with a seemingly slow website.

This means it is crucial to get as many pixels of the first screen in their final state as quickly as possible. By the ‘first screen’ I mean what the user sees without scrolling, also known as ‘above the fold’. Of course, whatever happens below the fold is also important, but not as important as what happens above it.

If the search result page pauses loading for a bit while a large part of the screen is still empty, this negatively affects the LCP. After all, the LCP is only reached once most of the screen seems to have reached its final state.

This specific problem with our search result page only applies to desktop browsers. That’s because on a mobile browser the first search result is just outside the first view. Or, if you will, just below the fold. That is why the LCP on mobile devices is already much better than on desktop browsers. But since a significant part of our traffic comes from desktop browsers, this problem needed a solution.

To improve the LCP we modified the iterator that is returned when the templates start looping over the 40 products in the ProductList. 

Before the changes this is how the Renderer interacts with the ProductList:

Class interaction diagram showing that most time before the Largest Contentful Paint is used by call to getProducts()

As you can see, the first time an iterator is created for the ProductList is the moment when the 40 products are loaded using getProducts(). After that, the products can be easily iterated over by getting an iterator from an ArrayList that holds the 40 loaded products. But loading the 40 products takes up a large part of the time spent before the LCP.

Batched loading

In order to improve the LCP we want to load the first two rows of products (8 in total) as soon as possible. The remaining 32 products are less important as they are only needed below the fold. To accomplish this we want to load the products in batches of 8 products at a time.

If we want more control over which of those 40 products are loaded, and when, we have to implement a custom iterator.

In our case (using Java with Freemarker), when a template starts to loop over a list of items, the template engine sees that the ProductList implements the Iterable interface, so it uses the iterator() method to create a new iterator for the list. Earlier, this is where we used to load all 40 products (if not done already) and then call iterator() on the filled-up ArrayList holding the 40 products. If we want the products to be loaded in batches of 8 products at a time, we have to implement an actual iterator to do so. 
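The earlier behaviour described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the Product class, the ProductService interface and its getProducts() signature, and the class name EagerProductList are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical product type; the real one is not shown in this article.
final class Product {
    final String key;

    Product(String key) {
        this.key = key;
    }
}

// Hypothetical service; only the method name getProducts() comes from the article.
interface ProductService {
    List<Product> getProducts(List<String> keys);
}

// Sketch of the earlier approach: the first iterator() call loads everything.
final class EagerProductList implements Iterable<Product> {
    private final List<String> productKeys;
    private final ProductService productService;
    private List<Product> products; // stays null until the first iteration

    EagerProductList(List<String> productKeys, ProductService productService) {
        this.productKeys = productKeys;
        this.productService = productService;
    }

    @Override
    public Iterator<Product> iterator() {
        if (products == null) {
            // One expensive call for all keys, made while the template is rendering.
            products = new ArrayList<>(productService.getProducts(productKeys));
        }
        // Later iterations reuse the already filled list.
        return products.iterator();
    }
}
```

Nothing is loaded until the template starts looping, which is what helped the TTFB, but the whole list is fetched in one go before the first product can be rendered.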

In the final solution, a custom ProductListIterator is created when the iterator() method on the ProductList is called. Upon creation, the following parameters are passed to the iterator: the ArrayList holding the 40 product keys, a reference to the ProductService from which to load products, and a reference to a Map holding already-loaded products. This last parameter prevents products from being loaded twice when the ProductList is iterated over more than once.

The following diagram shows the final solution:

Class interaction diagram showing how the first call to next() triggers a small call to getProducts() and the next few calls to next() do not

When iterating over a list, hasNext() and next() are called on each iteration. The call to hasNext() only checks whether there is another item to iterate over, and the call to next() makes the next item ‘current’ and returns it. In our custom iterator, we use an internal index to keep track of the current product. It is a zero-based integer, so we can use it to get the current product’s key from the ArrayList holding the 40 product keys.
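Written out as a plain loop, the contract between hasNext() and next() looks like the sketch below. It only uses the standard java.util.Iterator interface; the class and method names are made up for the example, and the template engine's own loop may differ in its details.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class IterationProtocolDemo {
    // Walks any Iterable the way a for-each loop does and returns what it saw.
    static <T> List<T> drain(Iterable<T> items) {
        List<T> seen = new ArrayList<>();
        Iterator<T> it = items.iterator(); // one new iterator per loop
        while (it.hasNext()) {             // peek: is there another item?
            seen.add(it.next());           // advance and return that item
        }
        return seen;
    }
}
```

Because the loop only ever talks to the iterator, any work we move into next() happens exactly when the template asks for the next product.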

A call to next() first increments the internal index. After that, the current product key is retrieved from the list of product keys. If the product for the current key has not been loaded yet, a batch of to-be-loaded product keys is collected. This batch contains the current product key and the next 7 keys. Then, the products are retrieved from the ProductService and stored in the Map holding already-loaded products. Finally, the current product is taken from the Map and used as the return value.

In the following 7 calls to next(), the then-current product will already have been loaded, so its key will already exist in the Map holding already-loaded products. These calls to next() will not cause any products to be loaded and will return a product immediately.

When the next() method is called for the 9th time, another 8 products will be loaded using getProducts(). And so forth, until all 40 products have been iterated over. 
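The steps above could be sketched as follows. This is an illustration under assumptions rather than our actual implementation: the Product class, the getProducts() signature, the BATCH_SIZE constant and the field names are invented for the example; only the class names ProductList, ProductListIterator and ProductService and the batch size of 8 come from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical product type; the real one is not shown in this article.
final class Product {
    final String key;

    Product(String key) {
        this.key = key;
    }
}

// Hypothetical service; only the method name getProducts() comes from the article.
interface ProductService {
    List<Product> getProducts(List<String> keys);
}

final class ProductListIterator implements Iterator<Product> {
    private static final int BATCH_SIZE = 8; // two rows of four products

    private final List<String> productKeys;
    private final ProductService productService;
    private final Map<String, Product> loadedProducts; // shared with the ProductList
    private int index = -1; // zero-based; -1 means "before the first product"

    ProductListIterator(List<String> productKeys, ProductService productService,
                        Map<String, Product> loadedProducts) {
        this.productKeys = productKeys;
        this.productService = productService;
        this.loadedProducts = loadedProducts;
    }

    @Override
    public boolean hasNext() {
        return index + 1 < productKeys.size(); // only peeks, never loads
    }

    @Override
    public Product next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        index++;
        String key = productKeys.get(index);
        if (!loadedProducts.containsKey(key)) {
            // Collect the current key plus up to 7 following keys and load them together.
            int end = Math.min(index + BATCH_SIZE, productKeys.size());
            List<String> batch = new ArrayList<>(productKeys.subList(index, end));
            for (Product product : productService.getProducts(batch)) {
                loadedProducts.put(product.key, product);
            }
        }
        // Assumes the service returned a product for every requested key; otherwise this is null.
        return loadedProducts.get(key);
    }
}

final class ProductList implements Iterable<Product> {
    private final List<String> productKeys;
    private final ProductService productService;
    private final Map<String, Product> loadedProducts = new HashMap<>();

    ProductList(List<String> productKeys, ProductService productService) {
        this.productKeys = productKeys;
        this.productService = productService;
    }

    @Override
    public Iterator<Product> iterator() {
        // Every iterator shares the same Map, so a second pass reloads nothing.
        return new ProductListIterator(productKeys, productService, loadedProducts);
    }
}
```

With 40 keys, a full pass in this sketch makes five calls to getProducts() of 8 keys each, and only the first of those has to finish before the products above the fold can be rendered.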

When the ProductList is iterated over for a second time, all products will have been loaded already so they will not be loaded again.

In summary, we were able to dramatically improve both the Time to First Byte (TTFB) and Largest Contentful Paint (LCP) of our search result pages by utilizing a custom iterator. And we did not have to change a single template to accomplish this.