Remarks about module import

Kris Kowal kris.kowal at cixar.com
Sat Aug 23 17:41:36 PDT 2008


On Mon, Aug 18, 2008 at 1:44 PM,  <ihab.awad at gmail.com> wrote:
> Hi folks,
> The module system proposals, especially the one here --
>  http://wiki.ecmascript.org/doku.php?id=proposals:modules

Oh, a module party!  Sorry I'm late and thanks to Peter Michaux for
alerting me that I was missing out.  Ihab, if you recall, I met you
and "The Mikes" last December to talk about module systems.  I really
like the direction of this thread and thought I'd put in a couple
cents.  A lot of the things I want from a module system have already
been mentioned, so some of this is just a reiteration of some of the
great ideas that have been posed; some beg distinctions.

We do not need to preserve notions of eval, particularly that the last
statement evaluated be the module object.  I think that's a great idea
and vehicle for explicit exports, but consider this one: perhaps
modules can conceptually be constructors for capability objects that
are frozen and returned by some "require" function or, by extension,
some syntax that ultimately calls said "require" function.

Just to be explicit, I think that, if module A imports module B,
module B must have a special scope chain and context object.  Solely
by virtue of having been imported, we could distinguish it from a
legacy script, even if module A isn't a new-style module.  The context
object would be the module object itself, that you would add
attributes to in order to "provide" exports.  This would increase the
parallelism between objects and modules.  The scope chain from global
to local would be:

 * builtins object
 * module scope
 * the self module
 * an anonymous function block scope

I agree that the builtins object should not be the global window
object that we all know and love.  It should be a frozen capability
object containing JavaScript primitives that can be expected and
conveniently accessed in any programming environment: frozen versions
of String, Number, Object, (is a Map still in?) &c.  Perhaps the
browser can host a "window" module that we explicitly import.  I think
it is also reasonable for this scope to contain some additional
primitives including "log" or "print" (since we no longer have to
avoid colliding with window.print).

The module scope, essentially analogous to IMPORTS__ in Caja, should
contain the "require" function (assuming this function needs to be
unique for each module, as it would in a lambda-based implementation,
where it must implicitly know the URL on which the module resides in
order to handle module-relative "require" calls).  It could also
contain any names imported by a "from module import *" style import.
This would permit module code to retrieve these values and would also
prevent malicious modules from overwriting the client module's inner
workings.  However,
this might not be a good solution for two remaining concerns: for
security, it would not prevent a malicious module from overwriting
names imported from those modules imported before it; for
verifiability, it would make it more difficult to construct
compile-time checks for name errors.  In this respect, I recognize a
tension and am resigned to the final value judgement.

The module scope could also contain a "module" variable that refers to
the current module, plus "moduleScope", and "builtins" as deemed fit.
Also, a "moduleUrl", like "__FILE__", would be handy for
introspection.

The module itself could be in the scope chain.  This would permit
programmers to reference provided functions without explicating "this"
or "module".

And, naturally, there would need to be an empty anonymous scope at the
end of the chain for "private" closure variables for the module.
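
The lookup order described above can be approximated with a prototype
chain, which resolves names from the innermost object outward.  This
is only an emulation under stated assumptions: a real implementation
would use genuine scopes, and "lookup" and the example URL are
hypothetical.

```javascript
// Outermost: a frozen builtins object with no further prototype.
const builtins = Object.freeze(
  Object.assign(Object.create(null), {
    String: String,
    Number: Number,
    Object: Object,
  })
);

// Module scope: holds per-module bookkeeping such as the module's URL.
const moduleScope = Object.create(builtins);
moduleScope.moduleUrl = "http://example.com/widget.js"; // hypothetical URL

// The module object itself: what the module "provides".
const selfModule = Object.create(moduleScope);
selfModule.Widget = function Widget() {};

// Innermost: private variables that importers never see.
const localScope = Object.create(selfModule);
localScope.secret = 42;

// Resolve a name the way the proposed scope chain would.
function lookup(name) {
  if (!(name in localScope)) {
    throw new ReferenceError(name + " is not defined");
  }
  return localScope[name];
}
```

Note that "window" is not reachable through this chain unless a module
explicitly imports it.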

That's the kind of environment I believe JavaScript should wrap around
modules when they're loaded.  I think it's also important that
modules be singletons by virtue of memoizing the ultimate "require"
function.  I say "ultimate" because this would be the function that
requires a module from its fully qualified URL.  Which leads me to my
thoughts about the "require" function's calling conventions.
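
As a rough sketch of the memoization, assuming a hypothetical
"loadModule" that stands in for fetching and evaluating a module body:

```javascript
// Hypothetical loader; counts how many times a module body is evaluated.
let evaluations = 0;
function loadModule(url) {
  evaluations += 1;
  return Object.freeze({ url: url });
}

const moduleCache = new Map(); // fully qualified URL -> module object

// The "ultimate" require: keyed by fully qualified URL, so each module
// is evaluated at most once and every importer gets the same object.
function requireByUrl(url) {
  if (!moduleCache.has(url)) {
    moduleCache.set(url, loadModule(url));
  }
  return moduleCache.get(url);
}
```

Any module-relative "require" would first resolve its argument to a
fully qualified URL and then delegate to this function.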

I believe that the "require" function should be a continuation, either
implicit or explicit, that yields and blocks until the module has been
loaded, or accepts a continuation as an optional argument.  I also
think that modules should be identified and loaded with URLs.  There
should be a notion of a "module root", a base URL for script paths.  I
do not think that we should support anything like a lookup chain
of PATHs, since this would incur major performance problems as the
user agent looks and fails to find modules in each successive PATH.
There should be one module root, and it should be global, determined
by the user agent, perhaps deferring to a script path defined
somewhere in the HTML for browser agents.  There should also be
module-relative paths.  This would liberate modules from the names of
the directories and domains that contain them, increasing reuse.
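
The explicit-continuation variant might look like the following
sketch.  The names "requireWithContinuation" and "deliver" are
hypothetical; "deliver" merely simulates the moment at which the user
agent finishes loading a module.

```javascript
const loaded = new Map();  // url -> module object
const waiting = new Map(); // url -> continuations awaiting that module

// Explicit-continuation form: invoke the continuation as soon as the
// module is available, immediately if it already is.
function requireWithContinuation(url, continuation) {
  if (loaded.has(url)) {
    continuation(loaded.get(url));
    return;
  }
  if (!waiting.has(url)) {
    waiting.set(url, []);
  }
  waiting.get(url).push(continuation);
}

// Stand-in for the load completing; resumes every waiting continuation.
function deliver(url, moduleObject) {
  loaded.set(url, moduleObject);
  const continuations = waiting.get(url) || [];
  waiting.delete(url);
  continuations.forEach(function (resume) {
    resume(moduleObject);
  });
}
```

The implicit form would hide the same machinery behind syntax, so that
the importing code appears to block.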

The require function might also benefit from accepting a version
number, although I think it would suffice to explicate that in the URL
much like "/usr/lib/libc.so" usually symlinks to "/usr/lib/libc.so.5"
on Unices.

So, I recommend that modules be identified by URLs, although not
necessarily Strings since that might compromise static analysis again.
I also think that, borrowing a meme from python3k, if a URL begins
with a dot, it be module relative.  Consider (where "import" stands in
for some yet to be determined keyword):

   // in <http://my.com/site.html> where the moduleRoot is the same as the page URL by implication:
   import "window"; // moduleScope.window = require("window");
   import "http://jquery.com/jquery-2.6.js"; // moduleScope.jQuery = require(...);
   import "./widget.js" as widget; // moduleScope.widget = require("http://my.com/widget.js");
   from "./widget.js" import Widget; // moduleScope.Widget = require("http://my.com/widget.js").Widget;
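
The resolution rule can be sketched with the standard URL constructor.
The function name "resolveModuleId" and its parameters are assumptions
made for illustration; only the leading-dot rule comes from the text
above.

```javascript
// Resolve a module identifier to a fully qualified URL.
// Identifiers beginning with "." are relative to the importing module;
// everything else is resolved against the global module root.
// Absolute URLs pass through unchanged.
function resolveModuleId(id, moduleUrl, moduleRoot) {
  const base = id.startsWith(".") ? moduleUrl : moduleRoot;
  return new URL(id, base).href;
}

const pageUrl = "http://my.com/site.html";
const widgetUrl = resolveModuleId("./widget.js", pageUrl, pageUrl);
// widgetUrl is "http://my.com/widget.js"
```

A module deeper in the tree resolves its own relative imports against
its own URL, so it can be moved together with its siblings without
edits.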

On the topic of PATH, it occurs to me that a page could potentially
subscribe to a module root either hosted by the browser in chrome://
or potentially on a CDN like Google's AJAX modules.  That might answer
my performance concern about walking the PATH and hitting a 404,
extending page load times by a round-trip time for each missed module.

This leaves the issue of "bundling".  Web page authors will still need
to concatenate scripts and CSS to improve a page load's performance.
To that end, I recommend that modules have a "provide" or "register"
function, wherein they can, in a single module, provide a bundle of
module objects that they construct themselves, or declare in the same
way that they would in another file.

   provide("./widget.js", widgetModule.freeze());

   provide "./widget.js" {
   }
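
A minimal sketch of the function form, assuming a shared registry that
"require" consults before going to the network.  The names
"provideSketch" and "requireProvided" are hypothetical, the URLs are
already fully qualified for simplicity, and Object.freeze is used in
place of a "freeze" method on the module object.

```javascript
const provided = new Map(); // fully qualified URL -> frozen module object

// Register a module object under a URL without fetching anything.
function provideSketch(url, moduleObject) {
  if (provided.has(url)) {
    throw new Error("Module already provided: " + url);
  }
  provided.set(url, Object.freeze(moduleObject));
}

// A require that is satisfied from the bundle when possible.
function requireProvided(url) {
  if (!provided.has(url)) {
    // A real loader would fall back to fetching the URL here.
    throw new Error("Not bundled, would need to fetch: " + url);
  }
  return provided.get(url);
}

// One bundle file can provide several modules at once.
provideSketch("http://my.com/widget.js", { Widget: function Widget() {} });
provideSketch("http://my.com/util.js", { noop: function () {} });
```

Refusing a second "provide" for the same URL is one possible policy; it
at least keeps a later script from silently replacing a bundled module.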

I'll leave it to Ihab or Mike to comment on the security
implications of bundling; I suspect they are dire.  Perhaps only
modules in subordinate URLs can be provided by one module.  That's
another tension we should consider.

There was mention on the original wiki page of requiring module
dependencies to form a directed acyclic graph (ok, a tree).  I don't
believe this is any more necessary than in Python, where it's
desirable but not enforced.  Since module objects are singletons and
registered before a module is evaluated, modules have the option of
providing their partially completed module objects to cyclic
dependencies.

Not having a solid module system is my biggest pain-point in modern
JavaScript.  Without it, JavaScripters are relegated to using best
practices and design patterns to make their scripts more but not quite
portable, and more but not quite secure.  I've managed to make most of
these features possible in user-space JavaScript using a collection of
"naughty" practices like gratuitous use of "eval" and "with" and I
consider the sacrifice worthwhile, but something similar needs to just
be natively available for security and ubiquity.  I have great hope
for the fruit of this discussion.

Kris Kowal


More information about the Es-discuss mailing list