Module naming and declarations

Andreas Rossberg rossberg at google.com
Mon Apr 29 06:34:22 PDT 2013


On 26 April 2013 00:27, Sam Tobin-Hochstadt <samth at ccs.neu.edu> wrote:
> First, I appreciate you setting your thoughts down in detail.  I think
> this will help us move forward in the discussion.
>
> You write in a later message:
>
>> Having said that, interoperability with existing module systems was
>> not the main motivation for the change in the proposal, as far as I
>> can tell. It rather was simplicity (which is debatable, as I hope I've
>> made clear), and convenient support for common configuration and
>> concatenation use cases.
>
> I don't think this is right, and I think this is the heart of the
> issue. Let me back up a bit to explain this.

Thank you for your explanation. I apologize if I have misrepresented
your intention with the above. I seem to remember that it, roughly, is
the motivation one of you explained to me at some point, but my memory
may be unreliable.

Let me try to group my rebuttal to the technical content of your reply somewhat.


* The Package Manager Assumption

Frankly, I did not really understand from your reply how the design
hangs together until you clarified in a later answer to Kevin that you
are generally assuming the use of some package manager. This is a
fairly significant assumption, one that I wish had been made explicit
in the previous discussions. AFAICT, it is key, and without it, your
proposal cannot fly. Not in a world with tens of thousands of module
releases, anyway.

As for whether that is a good assumption to make, I have serious
doubts. As Kevin points out, it creates a lot of open questions and
potential liabilities. In particular, I strongly question whether it
is wise to make that the _exclusive_ mechanism for importing modules.

In other words, package managers are great, and the module system
should definitely support them. But it should not, de facto, _require_
them -- especially not on the web. Neither should the language
prescribe how such a manager chooses to address packages, which,
again, the current proposal does de facto.


* Naming

More specifically, everything should still play nice with standard web
mechanisms. For me, the web platform implies that I should be able to
address modules as remote resources (it's a choice _not_ to use that
mechanism). That requires that the system allows proper URLs as module
identifiers. And at that point, you absolutely want your logical names
to integrate nicely into the URL name space semantics, which they
currently do not do at all. (The proposal calls module ids
"URLs", and syntactically, they form a sublanguage. But semantically,
this sublanguage is divorced and interpreted in a completely
incompatible way.)

That brings me to your repeated assertion that URLs are not
appropriate for logical names, because they are, well, logical, not
physical. Of course, I should have said URI, and as usual, that's what
I really meant. :)  URIs generally _are_ logical names for resources.
Arguably, the inherent abstraction from logical to physical (or
virtual) is one main reason for their existence. So, from my
perspective, URIs are _exactly_ the appropriate tool.

That does not mean that logical names have to become unreadable or
awful to type. Just that they are slightly more explicit. A scheme
prefix -- say, jsp: for a random strawman -- is good enough and should
not offend anybody in terms of verbosity. (If it did, then I'd have
little hope for the evolution of the web platform. :) )
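
As a rough sketch of how a loader might dispatch on such a prefix
(not part of any proposal): the resolveModuleId function, the package
root, and the "<name>/index.js" layout below are all hypothetical, and
jsp: is only the strawman scheme mentioned above. Anything without
that scheme goes through ordinary URL resolution.

```javascript
// Hypothetical locations, invented for illustration only.
const baseUrl = "http://example.com/app/main.js";
const packageRoot = "http://example.com/packages/";

// Hypothetical resolver: logical names carry the strawman "jsp:" scheme,
// everything else is resolved as a normal (possibly relative) URL.
function resolveModuleId(id, base, root) {
  const url = new URL(id, base); // standard URL resolution first
  if (url.protocol === "jsp:") {
    // Logical name: the package manager decides where it lives.
    // The "<name>/index.js" layout is an assumption of this sketch.
    return new URL(url.pathname + "/index.js", root).href;
  }
  return url.href;
}

console.log(resolveModuleId("jsp:jquery", baseUrl, packageRoot));
// http://example.com/packages/jquery/index.js
console.log(resolveModuleId("scripts/jquery.js", baseUrl, packageRoot));
// http://example.com/app/scripts/jquery.js
```

The point of the sketch is only that the scheme makes the authority
for a name explicit, so logical and physical names can share one name
space without ambiguity.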


* Module declarations

There are a few issues you haven't touched at all in your reply. One
is (id-named) module declarations. What's their role in all this?

You say that lexical declarations are not precluded by your proposal.
While technically true, much of my point was that id-named module
declarations cannot be the only form of declaration without creating
serious problems. They shouldn't even be the primary form. Internal
naming, and therefore lexical declarations, are _necessary_ to make the
design complete. And once you have them, id-named declarations become
fairly moot, except that they create extra confusion.

As far as I can see, they are not relevant for anything but
concatenation. And given the assumption of a package management
system, concatenation becomes even less relevant a concern -- you'll
use a tool for that anyway (such as AMD's optimizer), and that works
perfectly well (presumably, better) on top of lexical declarations.
(And a tool handling the simple, manager-less cases for which a naive
'cat' would currently be enough is really straightforward.)
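
To give a feel for how small such a tool could be, here is a toy
sketch. It is purely illustrative: the bundle function is
hypothetical, and the "module Name { ... }" output is an assumed
spelling of a lexical module declaration, not settled syntax. Module
bodies are treated as opaque text.

```javascript
// Hypothetical bundler sketch: wraps each source text in an assumed
// lexical module declaration instead of concatenating files verbatim.
function bundle(files) {
  return files
    .map(({ name, source }) => {
      // Indent the body so the generated output stays readable.
      const body = source
        .split("\n")
        .map((line) => "  " + line)
        .join("\n");
      return "module " + name + " {\n" + body + "\n}";
    })
    .join("\n\n");
}

// Made-up inputs; the module bodies are not inspected by the tool.
const bundled = bundle([
  { name: "A", source: "export let x = 1;" },
  { name: "B", source: "export let y = 2;" },
]);
console.log(bundled);
```

A real tool would also have to rewrite imports between the bundled
files, which this sketch deliberately leaves out.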

From my POV, the discussion about package managers also reveals
another conflation in your design: namely, between _modules_ (as
individual, language-level objects), and _packages_ (a mechanism for
deploying one, or a collection of, modules). A logical naming
mechanism like the one you envision is for addressing packages, not
modules. That makes it all the more dubious to bake in external names,
which are really package names, with module definitions.


* Summary

So I stand by my proposal. We should:

1. add lexical module declarations to provide internal naming;
2. use proper URIs for external naming (including logical names);
3. get rid of id-named module declarations, which don't serve much purpose;
4. prototype a simple bundling tool to replace naive 'cat';
5. prototype a simple package manager, to verify that it actually
integrates properly.

I don't think we can ship the design without taking these steps. It
would be neither a complete nor a future-proof system.


* Some specific replies

> Module names play a role in three processes, in general:
>
> 1. As a way to identify local components.
> 2. As a way to find the physical resource that is the source code (or
> object code) of the module.
> 3. As a way for two separately developed components to coordinate
> about which module they mean.
>
> In the current design, the internal names (eg, "jquery") serve role 1,
> and URLs (as generated by the loader hooks) serve role 2. The
> coordination role is played by internal names in a shared registry.

In the current design, an import from a name like "jquery", _by
default_, is supposed to fall back to a file/URL, just like in AMD and
friends, doesn't it? For all practical purposes, that makes it a direct
reference to a resource. Given that, I don't see any conceivable way
in which you can argue that "jquery" is an internal name in your
system. From my perspective, there are no actual internal names in the
system, which is part of the problem I'm trying to address.

Re URL, see above. Once you consider general URIs, they are also the
appropriate tool for (3).


> To pick another example (at random :), take SML with CM [1].  Here,
> lexically-bound module names serve role 1, and CM serves role 2.
> Coordination is managed by the ML toplevel, and files are typically
> open in the sense that they refer to modules that will ultimately be
> bound at the top level during the compilation process.

Interesting you'd mention CM, since that actually uses fully scoped,
lexical name spacing, with no global name space at all. And it
strictly separates module names from file names. ;)

Having said that, I would not recommend CM for comparison in the
context of this discussion, since it only cares about batch
compilation of a set of statically known source files. If you are
inclined to compare to some ML equivalent, then I can surely suggest
one which happens to have a dynamic module system almost exactly like
the one we are discussing here (but substantially more powerful). You
know which one. ;)


> Or you could settle on a local URL to use in the same way, again as
> basically a registry key.
>
>   // in ember.js
>   module jquery = "scripts/jquery.js";
>   import jquery as $;

Of course, that would simply be

  import $ from "scripts/jquery";

(and similarly for your other examples).

And yes, as I said above, something close to this is the only
web-compatible solution. More specifically, I assume you will have to
work with some package manager if you want logical names, and the
packages installed by this manager should be accessed in a manner that
_adheres to standard web practice_. That is, using a URI that properly
denotes either an absolute path, or even nicer, a custom scheme:

  import $ from "jsp:jquery"

That makes clear who the authority for the name is, and it is
something entirely different from

  import $ from "jquery"

which refers to a relative path -- at least everywhere else on the web!
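
The difference can be checked with the standard URL class available
in browsers and Node.js. This is only a sketch: the base address is a
made-up example, and jsp: is the strawman scheme used above. Under
standard resolution rules, the bare name is a relative reference,
while the name with a scheme stays absolute.

```javascript
// Made-up location of the importing script.
const base = "http://example.com/app/main.js";

// A bare name is a relative reference: it resolves against the base.
const bare = new URL("jquery", base);

// A name with a scheme is absolute: the base is ignored.
const logical = new URL("jsp:jquery", base);

console.log(bare.href);        // http://example.com/app/jquery
console.log(logical.href);     // jsp:jquery
console.log(logical.protocol); // jsp:
```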


> #### The key takeaway
>
> The key takeaway here is this: in the current design, different
> modules from different developers can coordinate based on the
> *internal* names of their dependencies. In your lexical proposal, it's
> not possible to coordinate globally on internal names, because they're
> lexical.  So instead developers would have to coordinate on *external*
> names.  This is fundamentally flawed, because external names are about
> where to get bits, not which abstraction a name represents.

We seem to have some misunderstanding about the nature of internal
names. As far as I'm concerned, in both approaches, coordination is
going through external names. The difference, as far as I can tell, is
that you seem to suggest abusing the URI mechanism for short logical
names that violate URI semantics, whereas I am saying that conformant
integration into the syntactic _and_ semantic structure of URIs is
vital.


> Supporting this use case properly is what led us to realize that the
> earlier lexically-named proposal was flawed.
>
> Note that none of this is about concatenation.  We've made some
> particular design decisions where concatenation played a role, but it
> wasn't a part of the reason we moved away from lexical modules.

Concatenation is the only reason I can see for the non-lexical module
declarations in the current proposal, and if I remember correctly,
that was the main motivation you or Dave gave at the last meeting. Is
that correct?


> #### Some smaller points
>
>> * As an external naming mechanism, it violates standard relative
>> path/URL semantics.
>
> This isn't done particularly to be compatible, and node or AMD could
> have required "/" in front of absolute paths.  But since that's an
> important use case, we don't think it's a good idea to tax it with
> extra typing.  This convention has become popular in JS for a reason.

>> * The shared name space between internal and external modules can lead
>> to incoherent programs.
>
> I've already pointed out above how these namespaces aren't shared, and
> in fact their separation is an important reason for the current
> design.

I'm not following you here. How are they not shared?


> Also, the incoherent programs you refer to are ruled out by a
> suggestion (of yours!) that we adopted at the last meeting:
> declarative module forms that define a module that already exists are
> a static error.

The suggestion you mention rather deals with the next case (clashes
between different internally defined modules). It does not generally
help when embedded resources clash with externally defined resources.


>> * Likewise, a single global name space for all internally defined
>> modules can lead to incoherent programs.
>
> I would be more worried about this if (a) we didn't provide convenient
> ways to structure this name space and (b) it wasn't already an
> existing successful approach in real-world JS systems.
>
>> * "Local" references are globally overridable.
>
> This is only true in the sense that executing first lets you define
> the meaning of particular modules.  But this is a *feature*.  This is
> just another way to describing configuration.  It's not reasonable to
> think that we can decide for everyone what will need configuration and
> what won't.

Er, I'm not saying that _we_ should decide it. I'm saying that the
implementer of a module should be able to make that choice. In the
current proposal, he doesn't have a choice at all: he is _forced_ to
make each and every module definition essentially "mutable" by the
rest of the world.

So I'm arguing for _more_ choice, not less.
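
A toy model may help show the difference. This is not the proposed
loader API: the registry, the define function, and the module
contents are all invented for illustration. With a single global
table, whoever runs first owns the name; with a lexical binding, the
author's own reference cannot be redirected from outside.

```javascript
// Toy global registry keyed by string id (hypothetical, for illustration).
const registry = new Map();
function define(id, mod) {
  if (registry.has(id)) throw new Error("module already defined: " + id);
  registry.set(id, mod);
}

// Some other script that happens to run first claims the name.
define("util", { greet: () => "hello from someone else" });

// The library author's own definition now fails. Without the duplicate
// check it would instead silently replace the earlier one.
let clash = null;
try {
  define("util", { greet: () => "hello from the library" });
} catch (e) {
  clash = e.message;
}

// With a lexical binding, the author's reference stays under his control.
function library() {
  const util = { greet: () => "hello from the library" };
  return util.greet();
}

console.log(clash);                         // module already defined: util
console.log(registry.get("util").greet());  // hello from someone else
console.log(library());                     // hello from the library
```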


>> * Internal module definition is coupled with external module
>> registration, and modules cannot be renamed or re-registered.
>
> These are not "external" names, but it is true that declaring a module
> registers it.  It's possible to take it out of the table afterward, or
> give it a new name, or have it share multiple names, all by using the
> loader API in very easy ways.  I don't think these will be common
> operations, and this seems like a reasonable tradeoff.

Again you seem to be using "internal" in a different sense than I did.
A name that can be referenced from, or shared with, the outside of a
script's scope is an external name in my book. That goes for all names
in the system you proposed.

I think that defining a module without wanting to register it is the
common case, at least in real code. Explicit registration should only
be needed for more ad-hoc configuration scenarios, or generated by
tools creating bundles. In all other cases, what you refer to with
external names are actual external resources (including logical
package names).


>> * Language-level naming semantics interferes with file system/URL
>> naming semantics.
>
> While it's sad that everyone doesn't use the same filesystem ;), this
> is inevitable unless we force everyone to write *both* internal and
> external names explicitly for every module; that's clearly more
> user-hostile than the alternative.

But you do not want to do that for every module! In fact, you rarely
need to explicitly define a module with an external name at all, at
least as I envision it. You only want to do that for a module that you
want to "export" from your current script, so to speak. Such "export"
(i.e., globally registering a module you defined textually) should be
a very rare operation to write manually (as mentioned, you don't
usually do the equivalent in AMD, for example).


>> * Bundling ("concatenation") generally would require embedding
>> arbitrary string resources, not syntactic modules.
>
> The reason to use concatenation is to avoid consuming excessive client
> resources -- in this setting of course you won't want to run
> translation on the client side.  Translation hooks are important (a)
> in less perf-sensitive settings like development and (b) for isolating
> remotely-loaded code, neither of which require concatenation.

I don't believe it's as clear-cut, and I can imagine a number of cool
use cases for translation that do not necessarily fall into this
simple pattern.


>> * Module "declarations" are not a declarative mechanism, but an operational one.
>
> This comes back to my original point.  Registration of module names is
> about coordination, and thus this is an important feature, not a problem.

I agree it's an important feature -- but one already provided by the
loader API. I am not convinced that there is a need to also provide it
in declarative disguise (without really being declarative) -- and
certainly not to conflate it with the (actually declarative) notion of
module definition.

/Andreas

