Bundling vs sending serialized dependency graph

John Barton johnjbarton at google.com
Thu Aug 21 08:00:03 PDT 2014

On Wed, Aug 20, 2014 at 4:53 PM, Ian Hickson <ian at hixie.ch> wrote:

> On Wed, Aug 20, 2014 at 4:06 PM, John Barton <johnjbarton at google.com>
> wrote:
> > On Mon, Aug 18, 2014 at 10:43 AM, Ian Hickson <ian at hixie.ch> wrote:
> >>
> >> This just doesn't work.
> >>
> >> Suppose the dependency graph looks like this:
> >>
> >>      Feature A --> Dependency A1 \__\ Dependency    \
> >>      Feature B --> Dependency B1 /  /    AB          >--> Dependency D
> >>      Feature C --> Dependency C1 ---> Dependency C2 /
> >>
> >> All of A, B, and C are to be fetched on-demand-only, to avoid using up
> >> too much bandwidth. All the files here are non-trivial in size.
> >>
> >> How do you package this?
> >>
> >> If you make a package for A, a package for B, and a package for C, then
> >> you'll have redundant content in the packages, and when the client asks
> >> for B after already having asked for A, the amount of content sent back
> >> will be greater than necessary and therefore it'll be slower than
> >> necessary. If you create multiple packages such that you group as much
> >> as possible into each package without overlap, then you still end
> >> up with multiple resources to download when you need any of A, B, or C.
> >> Basically, it boils down to:
> >>
> >>      Package A \__\ Package    \
> >>      Package B /  /    AB       >--> Package D
> >>      Package C ------------->  /
> >>
> >> ...and then you're back to the problem I asked about. If you don't have
> >> server-side support, then to avoid round-trips the client needs to know
> >> about the dependencies before it makes the first request. It can't wait
> >> til it receives the packages to discover the dependencies because if you
> >> do that then you're serialising your RTTs instead of pipelining them.
> >
> > I assume you are imagining a densely connected graph with random access
> > to any of the roots.
> I'm not quite sure what that means.

It means that the graph of all possible modules for a page cannot be
partitioned into packages without sending much of the shared graph over the
wire multiple times.
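
To make that concrete, here is a rough sketch using the graph from your
example. It builds one self-contained package per feature and lists what
would be sent again on a second request. The names `graph`, `closure`, and
`resent` are made up for illustration; nothing here is a loader API.

```javascript
// Ian's example graph, written as module -> direct dependencies.
const graph = {
  A: ["A1"], B: ["B1"], C: ["C1"],
  A1: ["AB"], B1: ["AB"], C1: ["C2"],
  AB: ["D"], C2: ["D"], D: [],
};

// Everything reachable from `root`, including the root itself.
function closure(root, seen = new Set()) {
  if (seen.has(root)) return seen;
  seen.add(root);
  for (const dep of graph[root]) closure(dep, seen);
  return seen;
}

// One self-contained package per on-demand feature.
const packages = { A: closure("A"), B: closure("B"), C: closure("C") };

// Modules that cross the wire a second time if the client fetches
// package `second` after it already has package `first`.
function resent(first, second) {
  return [...packages[second]].filter((m) => packages[first].has(m)).sort();
}

console.log(resent("A", "B")); // [ 'AB', 'D' ]
console.log(resent("A", "C")); // [ 'D' ]
```

How much this costs depends on how large the shared part is compared with
the rest of each package.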

> I mean a world where different otherwise unrelated leaf modules or
> resources depend on common shared dependencies.

Leaf modules don't depend on anything. That's what a leaf means.

> > I expect that real life pages have well partitioned graphs
> > (widgets) that share some dense parts (utilities) and simple access
> > patterns -- main page, utilities, a set of widgets.
> I doubt that the Web is that convenient. On some well-designed sites
> it might work out that way.
> Consider a site like Google+, though. However well-designed it is, it
> fundamentally has a lot of common files used by lots of intermediate
> shared files used by lots of leaf modules (and a lot more depth while
> we're at it).

I think your graph is upside down from mine ;-) As I learned it, leaf nodes
were the ones at the ends of branches and hence were not dependent on any
other nodes; no node depended on a root node.

> What exactly is needed depends on what posts are
> displayed, the user's preferences with respect to features like
> Hangouts, etc. It's a complicated graph.

The only issue that matters for the efficiency of bundle loading is how
many nodes are shared between commonly used dynamically loaded root
modules. If the module is needed always it will be loaded always. If the
module is only used by a single dynamically loaded feature, then there is
no penalty for bundle loading. Even in the case where two or more dynamic
loads use the same large number of modules we can simply put those modules
in a shared bundle. So the case where bundles lose is very rare.

Here is an attempt at a graph:

  A   B  C  D   E
  | /  \ | / \ /
  R      X    Y

R is always loaded, so its bundle loads A and B.
Optional feature X loads C and D but not B, since it's already loaded.
Optional feature Y loads E, and possibly D if X has not already loaded it.

Only D is extra work and thus this scenario is relatively rare.  Typically
optional modules will share dependencies with the default loaded page
modules or have unique modules.
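
A rough sketch of the same graph in code, assuming each optional feature
gets its own bundle with the always-loaded modules left out. The names
`deps`, `alwaysLoaded`, and `bundleFor` are hypothetical, not part of any
build tool.

```javascript
// Root -> modules it needs, read off the graph above.
const deps = {
  R: ["A", "B"],
  X: ["B", "C", "D"],
  Y: ["D", "E"],
};

// R is always loaded, so its modules never go into an optional bundle.
const alwaysLoaded = new Set(deps.R);

// Bundle for an optional root: its modules minus the always-loaded ones.
function bundleFor(root) {
  return deps[root].filter((m) => !alwaysLoaded.has(m));
}

const bundleX = bundleFor("X");
const bundleY = bundleFor("Y");

// Modules present in both optional bundles, i.e. sent twice when a
// user ends up loading both X and Y.
const duplicated = bundleX.filter((m) => bundleY.includes(m));

console.log(bundleX);    // [ 'C', 'D' ]
console.log(bundleY);    // [ 'D', 'E' ]
console.log(duplicated); // [ 'D' ]
```

If D were large and X and Y were commonly loaded together, D could move
into a shared bundle of its own.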

> > But sure, it would be great to have a complete solution if it's not a lot
> > more complex.
> I don't think it should be particularly complex. It only requires some
> minor changes. One is that we need to be able to declare dependencies
> ahead of the "instantiate" hook.

By the way, I recently discovered that the deplist returned by the
instantiate hook does not enter the dependency graph analysis. It is not a
list of dependencies but rather a list of things that need to be loaded.

> Another (a subset, really) is that we
> need to be able to declare dependencies for ES6 modules as well as
> letting the ES6 infrastructure discover them automatically.

In theory this should be straightforward. In practice, well, good luck.

> Finally,
> it would be ideal if we could also adjust those dependencies on the
> fly, since if we're reflecting dependencies described in the mutable
> DOM structure, it might be mutated.

I think this one is technically difficult.

> > I guess you're proposing to send the dependency graph to the browser;
> > then, when a new root is needed, the stored graph is compared with the
> > currently-loaded modules. The additional modules needed are then
> > requested as a group. Up to this point we can just use build tools and
> > browsers.
> Actually, modulo the changes described above, the ES6 loader already
> does all this.

Huh? How do you plan to parse the modules to obtain dependencies without
sending them to the browser?
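
For reference, the scheme I described above could be sketched roughly as
follows. I am assuming a build tool emits the graph as a plain object and
the page receives it up front. The modules A, B, and C and the names
`graph`, `loaded`, `missingFor`, and `request` are invented for the
example, and the "fetch" is only simulated.

```javascript
// Hypothetical graph produced at build time: A and B both depend on C.
const graph = { A: ["C"], B: ["C"], C: [] };

// Modules the page already has.
const loaded = new Set();

// Walk the stored graph from `root` and collect whatever is not loaded yet.
function missingFor(root) {
  const needed = [];
  const visit = (name) => {
    if (loaded.has(name) || needed.includes(name)) return;
    needed.push(name);
    graph[name].forEach(visit);
  };
  visit(root);
  return needed;
}

// Stand-in for one grouped network request: mark the batch as loaded
// and return it so the caller can see what would have been fetched.
function request(root) {
  const batch = missingFor(root);
  batch.forEach((name) => loaded.add(name));
  return batch;
}

console.log(request("A")); // [ 'A', 'C' ]
console.log(request("B")); // [ 'B' ]
```

The point of the sketch is that the graph comes from the build step, so
the browser never has to parse a module to learn what it depends on.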

> It just doesn't quite handle it at the level of
> pre-emptive declaration of dependencies. But suppose you had three
> modules A, B, and C. A and B depend on C. With ES6 today, when A is
> loaded, it loads C. If later you load B, B doesn't reload C; it just
> links into it. So this is all already supported. All that's needed is
> a way to tell the ES6 system to get C before it has even received A.

You've really lost me now. I thought your goal was to avoid sending C over
the network. Now you want to send it without even seeing A?
