Improving Time to First Byte (TTFB) using iterators

In this post I will explain how we were able to improve the performance of our website by using iterators. In the first part I will show how we used an iterator to improve the time to first byte (TTFB) for some specific pages. In the second part I will show how we modified our iterator to improve the largest contentful paint (LCP) of those pages.

Part 1: improving TTFB

The time to first byte (TTFB) is an important metric to determine a website’s responsiveness. In general, web.dev recommends a TTFB of 0.8 seconds or less.
Of course, we followed the general recommendations for improving TTFB, but even after that our search engine result pages (SERPs) in particular still had a relatively poor TTFB. To find out what was causing this, we added logging in several places and quickly found where most of the processing time was spent.

Like most dynamic websites, we use a template engine to render the HTML. The following diagram (a heavily simplified version of what is actually going on) shows which methods are executed to render a search engine result page:

As you can see, the RequestController is called to serve the request. It figures out which page is being requested and collects all data required to render that page. As soon as all data has been collected, execution is passed to the Renderer, and only then are the first bytes sent back to the client.

When a product search page is requested, the SolrClient is called to get the first 40 results to be shown on that page. Since our Solr instance does not store the complete product objects, only the product ids are taken from the search result. The next step is to have the ProductService load the 40 product objects using those 40 product ids.

This is where most of the time is spent. In our case, loading those 40 products typically takes anywhere between 100ms and 400ms. Not only does this vary (by a lot), but it is obviously too much processing time before the first byte. In comparison, fetching search results from Solr takes only about 20ms. 
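
To make the order of events concrete, here is a minimal sketch of that eager flow in Python. It is not our actual code: the class and function names (SolrClient, ProductService, render, handle_request) are illustrative stand-ins, the data is fake, and the `events` list exists only to record what happens when.

```python
events = []  # records the order in which things happen


class SolrClient:
    """Hypothetical stand-in: returns product ids only, no product objects."""

    def search(self, query, rows=40):
        events.append("solr.search")
        return list(range(1, rows + 1))


class ProductService:
    """Hypothetical stand-in for the expensive product loader."""

    def get_products(self, ids):
        events.append("get_products")
        return [{"id": i, "name": f"Product {i}"} for i in ids]


def render(products):
    """Yields HTML chunks; producing the first chunk stands for the first byte."""
    events.append("first byte")
    yield "<ul>"
    for product in products:
        yield f"<li>{product['name']}</li>"
    yield "</ul>"


def handle_request(query):
    ids = SolrClient().search(query)
    products = ProductService().get_products(ids)  # runs before any rendering
    return "".join(render(products))


html = handle_request("shoes")
print(events)
```

In this sketch the recorded order is search, then product loading, then first byte, which is exactly the problem: the slowest step sits in front of the first byte.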

Loading those 40 product objects takes that much time because they are complex objects. Retrieving a single product from the database takes a few queries in different database tables. On top of that, there are 2 layers of object cache. But with a product catalog of over a million products, a search query can easily yield results that are not in any cache and have to be retrieved from the database. 

Since the process of loading products from the database is already very much optimized using combined queries, there was no way to further optimize this. The only way forward… was to load nothing at all. At least, not before TTFB. After all, the actual products are not yet needed before the template can start rendering and the first bytes of HTML can be sent to the user. 

The first idea to optimize the TTFB was to change two things: 

  1. Pass only a list of product ids to the template
  2. Have the template load the products at the time they are needed
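
As a rough sketch, assuming a hypothetical template function called load_products, those two changes could look like this in Python. The render_search_page generator stands in for a template, and all names here are made up for illustration.

```python
class ProductService:
    """Hypothetical stand-in for the expensive product loader."""

    def __init__(self):
        self.calls = 0

    def get_products(self, ids):
        self.calls += 1
        return [{"id": i, "name": f"Product {i}"} for i in ids]


def make_load_products(service):
    """Builds the custom template function that templates would have to call."""

    def load_products(ids):
        return service.get_products(ids)

    return load_products


def render_search_page(product_ids, load_products):
    """Stands in for a template: it receives ids and must load products itself."""
    yield "<ul>"  # first bytes go out before any product is loaded
    for product in load_products(product_ids):
        yield f"<li>{product['name']}</li>"
    yield "</ul>"


service = ProductService()
chunks = render_search_page([1, 2, 3], make_load_products(service))
first_chunk = next(chunks)
calls_at_first_chunk = service.calls
html = first_chunk + "".join(chunks)
```

Note that the template itself has to know about load_products, so every template that loops over products would need the same change.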

This approach had a number of drawbacks:

  • It required building a custom template function and explaining to template developers how and when to use it
  • The whole operation required coordinating when to change the templates and when to stop loading products before TTFB
  • We did not have a clear view of which templates had to be changed

That last one in particular was a deal breaker. Our website uses search results in many places, for instance on the home page or on category pages that show the first 10 results of a search request for those categories. Those inline search results use the same data structure as the normal search result page. So in order to eliminate loading products before TTFB, we would have had to change a lot of templates. This would have required even more coordination of when to change the templates and when to stop loading products before TTFB.

Ideally, the solution would not require changing anything in any template. This is why a custom iterator is such a valuable solution: no template has to be changed, and the application developer controls the entire change.

So what exactly is an iterator? 

In simple terms: it is an object that implements a predefined way of iterating over it. The actual implementation differs from one programming language to another, but in pretty much any modern language you can work with custom iterators. I say custom because all of these languages already come shipped with standard iterators. Those standard iterators are used when iterating over a simple list or array. These are so common that you probably never realized you are actually already using iterators all the time. 
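
To illustrate, here is a small custom iterator in Python, where the protocol consists of the `__iter__` and `__next__` methods. The Countdown class is a made-up example and has nothing to do with our product code; it only shows that a `for` loop treats a custom iterable exactly like a built-in list.

```python
class CountdownIterator:
    """Iterator: hands out one value per __next__ call until exhausted."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration  # signals the end of the iteration
        value = self._current
        self._current -= 1
        return value


class Countdown:
    """Iterable: creates a fresh iterator each time a loop starts."""

    def __init__(self, start):
        self._start = start

    def __iter__(self):
        return CountdownIterator(self._start)


def collect(items):
    """The same loop works for a plain list and for a custom iterable."""
    result = []
    for item in items:
        result.append(item)
    return result


from_list = collect([3, 2, 1])
from_custom = collect(Countdown(3))
```

Because Countdown returns a new iterator on every loop, it can be iterated more than once, just like a list.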

Template engines in Java (JSP, Velocity, FreeMarker), PHP (Twig, Blade) and Python (DTL, Jinja) all support iterators in their loop or foreach directives. All of these template engines will keep working if you replace an existing list or array with an iterator or, depending on the language, an iterable object.

In our case, the actual solution was to replace the existing list of products with a custom ProductList. This is a custom class that implements the Collection interface, so it can be used in a loop directive in our templates. The list initially holds only 40 product ids and a reference to the ProductService from which it can retrieve the actual product objects. As soon as any template starts iterating over the list, it checks whether the 40 products have already been loaded and loads them if necessary. This allows the ProductList to be iterated over multiple times. The products are loaded only once, and no sooner than actually required. This is also known as ‘lazy loading’.
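
A minimal sketch of such a lazily loading list, written in Python with `collections.abc.Collection` standing in for the Collection interface, could look as follows. This is an illustration under assumptions, not our production class: the ProductService here is a fake that counts its calls, and `__len__` assumes that every id resolves to exactly one product.

```python
from collections.abc import Collection


class ProductService:
    """Hypothetical stand-in for the expensive product loader."""

    def __init__(self):
        self.calls = 0

    def get_products(self, ids):
        self.calls += 1
        return [{"id": i, "name": f"Product {i}"} for i in ids]


class ProductList(Collection):
    """Holds product ids and loads the products on first use."""

    def __init__(self, product_ids, product_service):
        self._ids = list(product_ids)
        self._service = product_service
        self._products = None  # nothing loaded yet

    def _load(self):
        # Load once; later calls reuse the already loaded products.
        if self._products is None:
            self._products = self._service.get_products(self._ids)
        return self._products

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        # Assumption: one product per id, so no loading is needed here.
        return len(self._ids)

    def __contains__(self, product):
        return product in self._load()


service = ProductService()
products = ProductList([1, 2, 3], service)
calls_before_iterating = service.calls
names_first_pass = [p["name"] for p in products]
names_second_pass = [p["name"] for p in products]
```

Constructing the list costs nothing, the first loop triggers the single call to get_products, and the second loop reuses the result.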

Now that the 40 products are no longer loaded before TTFB, but implicitly by the templates when they start iterating over the ProductList, the process looks more like the following diagram:

As you can see, the time-consuming getProducts() method is no longer called before TTFB. It is now called from inside the rendering process, the first time a template starts iterating over the ProductList. This makes the call to render() take longer than before, but the TTFB becomes much better.
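
The changed order of events can be sketched in Python as follows. As before, this is an illustration with made-up names and fake data, not our actual code: the render generator stands in for the template engine, its first chunk stands for the first byte, and the `events` list only records what happens when.

```python
events = []  # records the order in which things happen


class ProductService:
    """Hypothetical stand-in for the expensive product loader."""

    def get_products(self, ids):
        events.append("get_products")
        return [{"id": i, "name": f"Product {i}"} for i in ids]


class ProductList:
    """Loads its products the first time something iterates over it."""

    def __init__(self, product_ids, product_service):
        self._ids = list(product_ids)
        self._service = product_service
        self._products = None

    def __iter__(self):
        if self._products is None:
            self._products = self._service.get_products(self._ids)
        return iter(self._products)


def render(products):
    """Unchanged 'template': it just loops over whatever it is given."""
    events.append("first byte")
    yield "<ul>"
    for product in products:  # the lazy load is triggered here
        yield f"<li>{product['name']}</li>"
    yield "</ul>"


def handle_request(query):
    events.append("solr.search")
    ids = list(range(1, 41))  # stands in for the ids returned by Solr
    return render(ProductList(ids, ProductService()))


chunks = handle_request("shoes")
first_chunk = next(chunks)
events_at_first_byte = list(events)
html = first_chunk + "".join(chunks)
print(events)
```

The render function is the same loop a template would already contain; only the object passed into it has changed, which is why no template had to be touched.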

In the next part I will explain how we were able to improve the Largest Contentful Paint (LCP) metric for the same search result page by improving the inner workings of this ProductList.