Wednesday, May 11, 2016

Cache-Control: immutable

About one year ago our friends at Facebook brought an interesting issue to the IETF HTTP Working Group - a lot (20%!) of their transactions for long-lived resources (e.g. CSS, JS) were resulting in 304 Not Modified. These are documents that have long explicit cache lifetimes of a year or more and are being revalidated well before they have even existed for that long. The problem remains unchanged.

After investigation this was attributed to people often hitting the reload button. That makes sense on a social networking platform - show me my status updates! Unfortunately, when transferring updates for the dynamic objects browsers were also revalidating hundreds of completely static resources on the same page. While these do generate 304's instead of larger 200's, this adds up to a lot of time and significant bandwidth. It turns out it significantly delays the delivery of the minority content that did change.

Facebook, like many sites, uses versioned URLs - these URLs are never updated to have different content and instead the site changes the subresource URL itself when the content changes. This is a common design pattern, but existing caching mechanisms don't express the idea and therefore when a user clicks reload we check to see if anything has been updated.
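
To make the versioned URL pattern concrete, here is a minimal Python sketch. It is an illustration only, not a description of how Facebook builds its URLs: the function name `versioned_url`, the use of SHA-256, and the 12-character digest length are all assumptions chosen for the example.

```python
import hashlib
import posixpath


def versioned_url(path: str, content: bytes) -> str:
    """Return a URL path that embeds a short digest of the content.

    Hypothetical helper: the hash function and digest length are
    arbitrary choices made for this sketch.
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    root, ext = posixpath.splitext(path)
    # New content yields a new digest and therefore a new URL, so the
    # bytes served at any given URL never need to change.
    return f"{root}.{digest}{ext}"
```

Because the URL changes whenever the content does, a cached copy of any one URL can never be stale, which is exactly the property that existing caching directives have no way to state.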

IETF standards activity is probably premature without data or running code - so-called hope-based standardization is generally best avoided. Fortunately, HTTP already provides a mechanism for deploying experiments: Cache-Control extensions.

I put together a test build of Firefox using a new extended attribute - immutable. immutable indicates that the response body will not change over time. It is complementary to the lifetime cacheability expressed by max-age and friends.

Cache-Control: max-age=365000000, immutable

When a client supporting immutable sees this attribute it should assume that the resource, if unexpired, is unchanged on the server and therefore should not send a conditional revalidation for it (e.g. If-None-Match or If-Modified-Since) to check for updates. Correcting possible corruption (e.g. shift reload in Firefox) never uses conditional revalidation and still makes sense to do with immutable objects if you're concerned they are corrupted.
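
The rule above can be sketched as a small decision function. This is a simplified Python illustration and not Firefox's implementation: the names `parse_cache_control` and `should_revalidate_on_reload` are hypothetical, freshness is judged from max-age alone, and other directives (no-cache, Expires, and so on) are ignored.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def should_revalidate_on_reload(cache_control: str, age_seconds: int) -> bool:
    """Decide whether a normal reload should send a conditional request.

    Assumption for this sketch: a cached response is unexpired when its
    age is below max-age. An unexpired response marked immutable is
    reused without revalidation; anything else is revalidated.
    """
    directives = parse_cache_control(cache_control)
    max_age = directives.get("max-age", "")
    unexpired = max_age.isdigit() and age_seconds < int(max_age)
    return not ("immutable" in directives and unexpired)
```

With the header shown earlier, `should_revalidate_on_reload("max-age=365000000, immutable", 3600)` returns False, while the same header without immutable returns True. A shift reload is a separate path: it fetches the resource unconditionally and is not modeled here.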

This Makes a Big Difference

The initial anecdotal results are encouraging enough to deploy the experiment. This is purely a performance optimization; there are no web-visible semantics here, so it can be withdrawn at any time if that is the appropriate thing to do.

For the reload case, immutable saves hundreds of HTTP transactions and improves the load time of the dynamic HTML by hundreds of milliseconds because that no longer competes with the multitude of 304 responses.

Facebook reload without immutable

Facebook reload with immutable

Next Steps

I will land immutable support in Firefox 49 (track the bug). I expect Facebook to be part of the project as we move forward, and any content provider can join the party by adding the appropriate cache-control extension to the response headers of their immutable objects. If you do implement it on the server side drop me a note at mcmanus@ducksong.com with your experience. Clients that aren't aware of extensions must ignore them by HTTP specification and in practice they do - this should be safe to add to your responses. Immutable in Firefox is only honored on https:// transactions.
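
On the server side, adding the extension can be as small as choosing a header per response. The Python sketch below is one possible shape, under stated assumptions: the `VERSIONED` pattern (a hex digest before a .css or .js extension), the function name `cache_headers`, and the no-cache fallback for everything else are all hypothetical choices, not a recommendation from any particular server or framework.

```python
import re

# Hypothetical naming convention: a hex digest of 8+ characters just
# before the extension marks a URL whose content never changes.
VERSIONED = re.compile(r"\.[0-9a-f]{8,}\.(?:css|js)$")


def cache_headers(path: str) -> dict:
    """Pick Cache-Control response headers for a request path."""
    if VERSIONED.search(path):
        # Long lifetime plus the immutable extension for versioned assets.
        return {"Cache-Control": "max-age=365000000, immutable"}
    # Assumed fallback for this sketch: everything else is revalidated.
    return {"Cache-Control": "no-cache"}
```

Only mark a response immutable if the URL really is versioned; a client that honors the attribute will not check back for changes while the response is unexpired.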

If the idea pans out I will develop an Internet Draft and bring it back into the standards process - this time with some running code and data behind it.