Generic Bundling

Jorge Chamorro jorge at jorgechamorro.com
Thu Oct 24 06:29:01 PDT 2013


On 24/10/2013, at 04:17, Ilya Grigorik wrote:

> Hey all. Late to the discussion here, but after scanning the thread, figured it might be worth sharing a few observations... 
> 
> The fact that we have to "bundle" files at the application layer is an unfortunate limitation of the HTTP 1.x protocol. Specifically, because HTTP 1.x forces us to serialize responses (in practice it also forces us to serialize requests on the client, since pipelining adoption has effectively failed), we can have at most 6 parallel transfers per origin * number of origins (aka domain sharding). This sucks at multiple levels: it adds unnecessary complexity to our build/deploy steps (e.g. try explaining sprites to any designer...), and it also *hurts* performance in many cases.
> 
> For details on cost/benefits of pipelining, sharding, concatenation:
> - http://chimera.labs.oreilly.com/books/1230000000545/ch11.html#HTTP11_MULTIPLE_CONNECTIONS
> - http://chimera.labs.oreilly.com/books/1230000000545/ch11.html#_domain_sharding
> - http://chimera.labs.oreilly.com/books/1230000000545/ch11.html#CONCATENATION_SPRITING
> 
> As noted in the last link, concatenating large bundles is actually *the opposite* of what you want to do for performance: 
> a) we want to deliver small, granular resources, such that they can be cached, invalidated, and updated individually
> b) small resources allow incremental processing and execution
> c) small resources map to modular code and better prioritization (e.g. I need this submodule only after X operation or in Y view, etc)
> 
> In practice, (a) is already a serious problem for many large sites: every rev of their JS / CSS bundle results in a massive (and mostly unnecessary) update - case in point, the GMail team has spent an enormous amount of cycles trying to figure out how to scale this process without running a self-imposed DoS attack every time their JS asset is rev'ed (plus doing so in an efficient way for users on slow connections). Similarly, in our Google PageSpeed libraries we dropped the "concatenate all things" strategy several years ago, after we realized that it hurts perceived performance: instead we merge small files into larger bundles (up to 30-50 KB in size) -- even this is annoying and ideally unnecessary, and we recommend disabling all spriting / concat logic when running over SPDY. 
> 
> Long story short: we don't want large bundles. 
> 
> Also, large bundles break prioritization! To deliver good performance we want modular assets with different priority levels. This is exactly why we're working on the ResourcePriorities spec: https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/ResourcePriorities/Overview.html. Merging multiple files into a single blob, or a zip, breaks this model and makes the situation even worse:
> 
> a) a single byte update on any part of the bundle would force downloading the entire blob ($$$, slow for clients on slower connections, low cache hit rates)
> b) streaming sub-resources from a bundle is at best complicated, and in the worst case completely broken, which only makes the performance story even worse
> c) the entire bundle is delivered with a single priority level 
> 
> In short, pitching "zip bundling" as a performance optimization is a complete misnomer. If anything, it will only make things worse, even for HTTP 1.x clients. And with HTTP 2.0 on the near horizon, the limitation on the number of requests is completely removed: we have full multiplexing, prioritization, and flow control... which is exactly where we want to go if we want to accelerate page load times.
> 
> ig
> 
> P.S. HTTP 2 recommendations: http://chimera.labs.oreilly.com/books/1230000000545/ch13.html#_removing_1_x_optimizations

Hi,

You're not saying that gzipping, wise pre-fetching, and parallel downloading of scripts don't improve page load times. Or are you?

Something important is missing from the picture you paint above: there's a round-trip delay per request (even with HTTP 2.0), and the only way to avoid it is to bundle things, as in .zip bundling, to minimize the number of requests and thus the impact of latency.
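
To make the cost I mean concrete, here's a back-of-the-envelope sketch (all numbers made up; it ignores bandwidth and models only the HTTP 1.x-style request serialization that bundling works around):

    var RTT = 100;     // assumed round-trip time, in ms
    var FILES = 20;    // assumed number of script files
    var PARALLEL = 6;  // typical HTTP 1.x connections per origin

    // Unbundled: the files queue up on 6 connections, and every batch
    // of 6 pays at least one full round trip (transfer time ignored).
    var unbundled = Math.ceil(FILES / PARALLEL) * RTT;  // 400 ms

    // Bundled into a single .zip: one request, one round trip.
    var bundled = 1 * RTT;                              // 100 ms

    console.log(unbundled + ' ms vs ' + bundled + ' ms');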

And there's something else I think .zip bundling can provide that HTTP 2.0 can't: the guarantee that a set of files is cached by the time your script runs. With such a guarantee you could do synchronous module require()s, à la node.js.
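
A rough sketch of what that could look like (loadBundle() and bundle.require() are purely hypothetical names here, not anything that exists today):

    // Hypothetical API: fetch and unpack a .zip bundle, and fire the
    // callback only once every file in it is guaranteed to be cached.
    loadBundle('assets.zip', function (bundle) {
      // Past this point no per-module network round trip can happen,
      // so synchronous require() is safe, à la node.js.
      var utils = bundle.require('lib/utils.js');
      var view  = bundle.require('views/main.js');
      view.render(utils.defaults());
    });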

Cheers,
-- 
( Jorge )();


