Module naming and declarations

Brian Di Palma offler at gmail.com
Thu Apr 25 04:00:21 PDT 2013


I've been following es-discuss for a short amount of time.
I'm a JS dev working on a significant code base, which biases how I
perceive ES6 issues.

From my viewpoint, by far the most important advancements provided by
ES6, eclipsing all others, are modules and classes. This feeling is
widely shared among the developers I work with.

So I'm somewhat surprised at the lack of response to Andreas' email.
Firstly, I agree with Andreas' point about there being an issue with the
naming and declaring of modules.

|  In the original module proposal, modules were declared via ordinary
|  lexical identifiers, but could also be imported from external sources
|  denoted by strings. In November, the proposal was changed to use
|  string-valued "module ids" for both. The motivation was to simplify
|  the language, and to provide better support for configuration and
|  concatenation.

This seems an odd change to make, a backward one in fact.

I presume that the aim of modules is to provide clean
scopes/environments, to prevent global state pollution and to aid in
structuring/separating code. Therefore you would want a module to be
identified by an abstract identifier rather than by a concrete file
path string.

module topLevelNamespace.subNamespace {
   export class MyClass {
   }
}

|  Nobody would suggest to use file paths in place of variable
|  identifiers internally. Yet, that is almost exactly what the proposal
|  does for modules!

Indeed. Working on a large code base containing hundreds of JS classes,
I think it's cleaner to deal with abstract identifiers which correspond
to namespaces as opposed to file locations on disk.

|  As various discussions show, people want and expect scope chain
|  behaviour, and for very good reasons: e.g. nesting modules, confining
|  modules to local scope, convenient local names, etc.

Yes. If we create private packages in our frameworks/libraries I see
no reason for any end consumer to have access to these internal
artifacts. This is all about working with large code bases; privacy
and integrity are very helpful in those situations.
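
To make the integrity point concrete, here is a small sketch in plain
current JavaScript (not ES6 module syntax). It assumes a hypothetical
string-keyed registry, standing in for a shared global table of module
ids, and contrasts it with a binding held in lexical scope. The names
registry, define, lookup and createCounter are invented for this
illustration and are not part of any proposal.

```javascript
// Hypothetical global registry keyed by strings; a stand-in for a shared
// table of module ids, not the actual ES6 loader API.
const registry = new Map();
function define(id, value) { registry.set(id, value); }
function lookup(id) { return registry.get(id); }

// A library registers an internal helper under a string id...
define("lib/internal", { step: 1 });
// ...and any unrelated script can silently replace it later.
define("lib/internal", { step: 100 });

// The same helper held in lexical scope cannot be reached from outside.
const createCounter = (function () {
  const internal = { step: 1 }; // visible only inside this function
  return function () {
    let count = 0;
    return function next() {
      count += internal.step;
      return count;
    };
  };
})();

const next = createCounter();
console.log(lookup("lib/internal").step); // 100: the registry entry was clobbered
console.log(next(), next());              // 1 2: the lexical binding was not
```

The registry entry can be overwritten by anyone who knows the string,
while the lexically scoped helper is simply not nameable from outside.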

|  (The path semantics is inherited from legacy module frameworks
|  for JS, such as AMD. It is a fine solution under the constraints that
|  these frameworks have to operate in -- in particular, the inability to
|  extend syntax or add new primitives. However, for ES6 most of these
|  constraints don't apply, and there is no particular reason to limit
|  the design to making the same compromises.)

To produce such a design purely to serve the needs of old module
systems seems a poor choice to me.
Over time the standard mechanism will eclipse all other module systems,
even if it provides no upgrade path for the old systems, purely because
it is the standard system, as long as it is not a totally broken
design. The only time I would be willing to invest in learning about
the old module systems would be when I want to convert an old module to
the new system so I can dump the old system.

There is far more code that is not using modules than code that is.

The focus should be on creating the best possible module system, not
the best possible system that smoothly accommodates AMD modules!

|  A custom loader can, in principle, perform arbitrary interpretation or
|  rewriting of URLs. In particular, this could be used to implement
|  interop to absolute repository paths a la AMD or Node, e.g. by
|  interpreting an "amd:" scheme for importing AMD modules that are
|  relative to a separately configured base URL. In other words, you'd
|  write
|
|    import M1 from "a/b";  // native ES6 import, relative path
|    import M2 from "amd:c/d";  // import of AMD module, relative to AMD base URL

Scheme handlers for non-standard resources seem like an excellent idea.
Neat, clean and easy to identify. I would support that; it could also
be used to load non-JS resources.
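
As a rough illustration of the idea (a sketch only, not anything taken
from the proposal or the loader API): the hypothetical function
resolveModuleName below treats an "amd:" prefix as a request to resolve
against an assumed, separately configured AMD base URL, and resolves
everything else relative to the importing file's absolute URL using the
standard URL constructor. The function name, the AMD_BASE_URL constant
and the example.com addresses are all made up for the example.

```javascript
// Assumed configuration: where AMD-style modules live. Hypothetical value.
const AMD_BASE_URL = "https://example.com/amd-modules/";

// Hypothetical resolver: turns the string in an import into an absolute URL.
function resolveModuleName(name, importerUrl) {
  const amdPrefix = "amd:";
  if (name.startsWith(amdPrefix)) {
    // "amd:c/d" is resolved against the AMD base URL, not the importer.
    return new URL(name.slice(amdPrefix.length), AMD_BASE_URL).href;
  }
  // Everything else follows ordinary relative-URL rules, like an HTML link.
  return new URL(name, importerUrl).href;
}

const importer = "https://example.com/app/main.js";
console.log(resolveModuleName("a/b", importer));     // https://example.com/app/a/b
console.log(resolveModuleName("./a/b", importer));   // https://example.com/app/a/b
console.log(resolveModuleName("amd:c/d", importer)); // https://example.com/amd-modules/c/d
```

Under these assumptions "a/b" and "./a/b" name the same resource, which
is the standard relative-path behaviour Andreas argues for.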

Thank you, Andreas, for highlighting these niggles. I do hope there is
more interest in them than the lack of response indicates.

- Brian Di Palma.

On Wed, Apr 24, 2013 at 1:28 PM, Andreas Rossberg <rossberg at google.com> wrote:
> The module proposal has made good progress, thanks to the hard work by
> Dave & Sam. I'm glad to see it close to the home stretch for the ES6
> race (some healthy minor controversies on the side notwithstanding :)
> ).
>
> However, there is one central part of the design over which we still
> do not have agreement: the naming and declaration mechanism for
> modules. We did not yet have a serious discussion about it -- and I
> fully admit this being my fault as well, since despite my repeated
> criticism, I hadn't found the time to put that in a coherent form. I
> try to make up for that with this post. :) Summary is that I still
> (strongly) believe that adopting what's currently on the table would
> be a severe mistake, and that it's best to take a step back and
> discuss the motivation as well as alternatives.
>
> My sincere apologies for the excessively long post...
>
>
> ** The problem **
>
> In the original module proposal, modules were declared via ordinary
> lexical identifiers, but could also be imported from external sources
> denoted by strings. In November, the proposal was changed to use
> string-valued "module ids" for both. The motivation was to simplify
> the language, and to provide better support for configuration and
> concatenation.
>
> Unfortunately, the new scheme leads to a significant conflation of
> concerns. In particular, it tries to obscure the fact that there are
> fairly fundamental differences between _internal_ names and _external_ ones:
>
> - internal names reference _language entities_, external names
> reference _resources_;
> - internal references are under control of the language, while
> external references are delegated to a platform mechanism (with
> potentially platform-dependent interpretations and behaviour);
> - internal references are definite, stable, and high-integrity, while
> external references are indefinite, late-bound, and can fail or be
> clobbered.
>
> As an analogy, take strings. A string may exist inside the language,
> as a value bound to a variable. Or it may be stored in some file that
> you can access via a file name. Both the variable and the file name
> ultimately reference "a string", but in completely different worlds.
> Nobody would suggest to use file paths in place of variable
> identifiers internally. Yet, that is almost exactly what the proposal
> does for modules!
>
> Conflating the two notions may seem tempting and convenient at first,
> but it's bad. In the case of the current ES module proposal in
> particular, the attempt to do so with the "module id" scheme has all
> kinds of awkward consequences:
>
> * As an internal naming mechanism, it lacks standard scoping semantics.
>
>   As various discussions show, people want and expect scope chain
> behaviour, and for very good reasons: e.g. nesting modules, confining
> modules to local scope, convenient local names, etc. The module id
> approach cannot sanely support that (which is why e.g. nested modules
> got pulled from the proposal).
>
> * As an external naming mechanism, it violates standard relative
> path/URL semantics.
>
>   When using paths to actually reference external names, one might
> likewise expect certain semantics, e.g., that "a" and "./a" refer to
> the same thing, like they usually do on the web or in a file system.
> The current mechanism intentionally breaks with this via a
> non-standard interpretation of "paths". It implies that a set of
> module files, by default, is not relocatable within a project tree, or
> across project trees, as the standard idiom for module names actually
> denotes semi-absolute paths within a project tree. The main reason for
> it is that paths are overloaded to serve both internal and external
> naming. (The path semantics is inherited from legacy module frameworks
> for JS, such as AMD. It is a fine solution under the constraints that
> these frameworks have to operate in -- in particular, the inability to
> extend syntax or add new primitives. However, for ES6 most of these
> constraints don't apply, and there is no particular reason to limit
> the design to making the same compromises.)
>
> * The shared name space between internal and external modules can lead
> to incoherent programs.
>
>   Internal and external module definitions can resolve to the same
> path. For example, there might be a definition, somewhere, for module
> "a/b", but also a file "a/b.js". Which one takes precedence generally
> depends on the execution order of a (staged) program (e.g., when
> _other_ imports are performed), and can, in fact, differ at different
> points in time. It is worth noting that, presumably for this reason,
> AMD strongly discourages the use of named module declarations, except
> by optimization tools (and Node does not support it at all, AFAICT).
> With the ES proposal, however, nifty syntax strongly conveys the
> impression that named module declarations are a good and recommended
> feature to use manually, instead of discouraging their day-to-day use.
>
> * Likewise, a single global name space for all internally defined
> modules can lead to incoherent programs.
>
>   Several internally defined modules can clash arbitrarily, there is
> nothing preventing two completely unrelated modules in completely
> unrelated files from clobbering the same name and stepping on each
> other's feet. Worse, there is no way to confine a module to a local
> scope, all definitions _have to be_ globally visible and compete
> within the same global name space. (And unlike URLs, this name space
> is not particularly structured.) Of course, conventions can help to
> work around the problem in practice, but clearly, a well-designed
> language mechanism is preferable and more reliable.
>
> * "Local" references are globally overridable.
>
>   Since every reference goes through the loader, there is no way to
> have a definite (i.e., static, stable) module reference, even within a
> single script or scope. Any other script can come by and _modify the
> meaning_ of any seemingly local module reference, either accidentally
> or intentionally (unless laboriously sandboxed). This clearly is bad
> for abstraction, encapsulation, integrity, security, and all related
> notions. And to add insult to injury, the loader also induces
> non-trivial runtime cost for a mechanism that isn't even desirable in
> these cases. Plain and simple, we repeat the mistake of the JS global
> object, but worse, because there is no other scope for modules to
> escape to if you care about integrity.
>
> * Internal module definition is coupled with external module
> registration, and modules cannot be renamed or re-registered.
>
>   It is not possible to define a module without registering it
> globally, and dually, it is not possible to register a module without
> defining a new one. In particular, that prevents renaming a module,
> and more importantly, registering a module under a different
> (external) name than the one under which it was defined/imported. The
> latter, however, is needed for some configuration use cases. (In the
> current proposal, it has to be simulated awkwardly by "eta-expanding"
> the renamed module, i.e., declaring "module 'A' { export * from 'B';
> }", which creates a separate module.)
>
> * Language-level naming semantics interferes with file system/URL
> naming semantics.
>
>   As other communities have rediscovered many times before, it is a
> problem to naively map between internal and external names, because
> the meaning of internal definitions or references may then depend on
> idiosyncrasies of the hosting environment. For example, it may affect
> the program behaviour whether names are case-sensitive in the OS (not
> all JS hosts are browsers), or what makes a well-formed file name.
> What if you define a module "M" and also have a file "m.js"? Something
> else in node.js on Windows than on Linux, potentially. It is best to
> minimise the problem by limiting the use of external names to actual
> external references.
>
> * Bundling ("concatenation") generally would require embedding
> arbitrary string resources, not syntactic modules.
>
>   One powerful feature of loaders is the translation hook. When
> importing a module from an external source, there is no restriction on
> its syntactic content, since a translation hook can transform it
> freely. But if one were to bundle ("concatenate") an application that
> actually makes use of this liberty, then the current proposal could
> not actually support that consistently. Consequently, if one primary
> goal of the current proposal is to make module declarations a mechanism
> for bundling modules, then they are an incomplete solution. In
> general, you'd need to be able to embed arbitrary _resources_ (as
> strings), and be able to run translate hooks on those.
>
> * Module "declarations" are not a declarative mechanism, but an operational one.
>
>   Because module declaration is coupled with loader registration, it
> has a non-trivial operational effect, and its semantics is in turn
> subject to interference from other operational effects from inside and
> outside the program (as described above). Yet, it is disguised in a
> seemingly innocent declarative syntax. That is likely to create
> pitfalls and wrong assumptions (again, like with the global object).
>
>
> In summary, not only do path-named module declarations lack desirable
> expressiveness, regularity, and integrity, they also do not support
> more interesting configuration and concatenation use cases, which is
> what they were intended for.
>
> Most of the above problems cannot be fixed without adding lexically
> scoped modules to the language. It seems very clear that we need
> those, rather sooner than later. Also, I think we want a more
> structured approach to the global name space for external modules. At
> that point, rethinking the proposed approach may be the best idea.
>
>
> ** Proposal **
>
> I think it is highly advisable to follow a simple strategy for the
> whole naming business: avoid ad-hoc inventions, stick to
> well-established, standard mechanisms.
>
> Specifically:
>
> 1. Have a clean separation of concerns between internal and external names.
> 2. For internal names, use the standard language mechanism: lexical scope.
> 3. For external names, use the standard web mechanism: URLs.
> 4. Have a clean separation of concerns between the declarative
> _definition_ of a module, and the operational notion of _registering_
> it with a loader.
>
> Let me point out again that both lexical scope and URLs are backed by
> decades of experience and have proved superior over and over again, in
> many, many different languages and environments. _We can only lose if
> we try to do "better"_.
>
> More concretely, I envision using lexical module declarations as the
> primary means for defining modules (you guessed that from the start :)
> ). Modules can nest. (Supporting local modules, like Mark suggested,
> is more difficult, because of unclear interactions with other
> constructs like classes or eval. Certainly not ES6.)
>
> There may (or may not, see below) be a (separate?) pseudo-declarative
> form for registering modules as resources with the loader -- however,
> if so, it should ideally scale to handle non-ES sources.
>
> Module resources are identified by URLs. Those can be absolute or
> relative; relative URLs are interpreted relative to the importing file
> (just like HTML links). The loader table only contains absolute URLs.
> Likewise, every script is associated with its absolute URL. Any
> relative import is first normalised to absolute using the absolute URL
> of the importer (plus obvious steps for normalising occurrences of "."
> and "..").
>
> A custom loader can, in principle, perform arbitrary interpretation or
> rewriting of URLs. In particular, this could be used to implement
> interop to absolute repository paths a la AMD or Node, e.g. by
> interpreting an "amd:" scheme for importing AMD modules that are
> relative to a separately configured base URL. In other words, you'd
> write
>
>   import M1 from "a/b";  // native ES6 import, relative path
>   import M2 from "amd:c/d";  // import of AMD module, relative to AMD base URL
>
> Should we have a declarative form for registering resources, then its
> URL would be resolved in the same manner, relative to the path of the
> containing file. However, the programmer is free to use an absolute
> URL.
>
> At that point, the only remaining purpose for path-named module
> declarations would be registering external references. However,
> registration is already possible through the loader API. Doing so
> requires additional staging (a script setting up the loader before
> executing the actual script). But staging is necessary anyway, for
> every slightly more interesting configuration case -- e.g. any
> scenario that involves loader.ondemand, translation, etc. It is not
> clear to me that the remaining cases justify an additional semantic
> short cut, and I think there is a case to be made that path-named
> module declarations are neither sufficient _nor_ necessary. But that
> point is mostly independent from the rest.
>
> Clearly, there are more details to be discussed and worked out. I also
> know that Dave & Sam have been over lots of it before. Still, I'm
> positive that it is a fairly solvable problem, and will yield a more
> well-behaved and more scalable solution. And yes, essentially it means
> reverting this part of the module proposal to its earlier stage -- but
> the good news is that not much else in the proposal is affected. :)
>
> /Andreas
