Module linking (Was: The global object should not be the "global scope instance object")

Andreas Rossberg rossberg at google.com
Mon Jan 30 12:09:45 PST 2012


(Just a quick reply, because I'm in a bit of a hurry.)

On 30 January 2012 19:17, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
>>> <script>
>>> module a {
>>>    import {x:y} from b;
>>
>> I think you wanted to say {y:x} here.
>
> no, I think  {x:y}  means creating a binding for x that is linked to b.y

It's supposed to be read as a destructuring. At least that was my
understanding so far.

>>>    module b{
>>>       export let y = x;  //essentially this is let y=y
>>>     }
>>> }
>>> </script>
>>>
>>>
>>> 1)The script is parsed, and static semantic checks are made.  There are no
>>> static errors.
>>
>> Just to be clear: the static semantics is where most of the
>> complication of modules lies. Depending on what exactly we want to
>> allow, it amounts to a non-trivial type inference problem.
>
> The static semantics I have in mind are all about disallowing duplicate declarations and hoisting. It's about names not about values. I don't see where any type-like analysis comes into play.

Modules have structural interfaces (their lists of exports, and
recursively, the interfaces of those) that you have to infer to do the
static analysis and name resolution. These interfaces can naturally be
viewed as types (not in the JS sense, of course), and the analysis can
naturally be stated as a type inference problem (you can also
formalise it differently, but it doesn't make a difference). That's
really the essence of being "static".

The analysis is not as straightforward as it might seem. In particular, this:

> The exported name list of a module can be generated while the module is parsed

unfortunately is not true, because of (1) alias bindings, and (2)
import *. For example, with

  module A = B.C

you won't know what A exports until you have analysed B and its
nested C, which might come later in the program. And those can
recursively refer to A, which might require you knowing what A
exports. Here is a more messed up, random example:

  module A = B.C
  module D { ... }
  module B {
    let x = B.E.x
    import * from A
    export module B {
      export module E = D
      export module C {
        export module D { ... }
      }
    }
  }

You couldn't tell which D is aliased by E before processing C and so
on. In general, it's all recursive, so a simple AST traversal alone,
in whatever order, won't do. Instead you need to collect and solve
constraints. In terms of type inference, you infer and unify
(principal) types.
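
To make the "collect and solve constraints" point concrete, here is a
minimal sketch in plain JavaScript. It is not the proposed algorithm, only
an illustration under loud simplifying assumptions: modules export only
other modules, alias paths are resolved from the top level (no lexical
scoping, no import *), and the names body, alias, lookup, collectAliases and
resolveAliases are hypothetical helpers invented for this example. Aliases
are retried until a pass makes no progress, which is a crude fixpoint
iteration rather than real unification.

```javascript
// Hypothetical model: a module is either a body with named exported
// submodules, or an alias to a dotted path resolved from the top level.
const body = (exports = {}) => ({ kind: "body", exports });
const alias = (...path) => ({ kind: "alias", path });

// Follow a path from the top-level scope. Returns the body module it
// denotes, or null if an alias on the way has not been resolved yet.
function lookup(top, path, resolved) {
  let scope = top;
  let mod = null;
  for (const name of path) {
    mod = scope[name];
    if (mod === undefined) throw new Error(`unbound module ${name}`);
    if (mod.kind === "alias") {
      mod = resolved.get(mod) ?? null;
      if (mod === null) return null; // constraint not solvable yet
    }
    scope = mod.exports;
  }
  return mod;
}

// Gather every alias declaration, at any nesting depth.
function collectAliases(scope, out = []) {
  for (const mod of Object.values(scope)) {
    if (mod.kind === "alias") out.push(mod);
    else collectAliases(mod.exports, out);
  }
  return out;
}

// Retry pending aliases until all are resolved or no pass makes progress.
function resolveAliases(top) {
  const resolved = new Map();
  let pending = collectAliases(top);
  while (pending.length > 0) {
    const still = pending.filter((a) => {
      const target = lookup(top, a.path, resolved);
      if (target === null) return true; // keep for the next pass
      resolved.set(a, target);
      return false;
    });
    if (still.length === pending.length) {
      throw new Error("cyclic module aliases");
    }
    pending = still;
  }
  return resolved;
}

// module A = B.C;  module B { export module C = D };  module D { export module E {} }
const top = {
  A: alias("B", "C"),
  B: body({ C: alias("D") }),
  D: body({ E: body() }),
};
const resolved = resolveAliases(top);
```

In this sketch A cannot be resolved in the first pass, because B.C is itself
an alias that has not been resolved yet. That is the same reason a single AST
traversal, in whatever order, is not enough.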


>>> 2) module instantiation is performed for the block.  This instantiates each
>>> module defined by the top level of the block, instantiating a module
>>> includes producing the list of identifiers exported by the module. Each
>>> identifier is associated with a new uninitialized binding. Instantiated
>>> modules are not initialized (their body is not executed) at this time.
>>
>> You need a new uninitialized binding for all identifiers in the
>> module's local scope, not just the exported ones.
> The non-exported declarations within a module don't need to be instantiated until module initialization. You probably could instantiate them during my module instantiation step but I'm trying to only identify what must happen at that step.

I'm not so sure. You could export parts of a local module:

  module A {
    module B { export module C { ... } }
    export module C = B.C
  }
  module C = A.C


>>> 3) An initialized binding for "a" is created in the top level environment
>>> for the script.  (All top level bindings are instantiated at this point, if
>>> there were any others.)  Note that the binding for a is initialized (it
>>> references a module instance object) but the module itself is not yet
>>> initialized
>>> 4) initialize module a
>>>      5) module instantiation is performed for the body of module a.  This
>>> instantiates a module instance for module b with exported identifier "y" and
>>> its binding.
>>
>> It's not quite that simple. Recursive initialization does not suffice,
>> because there can be arbitrary import/export/aliasing cycles (e.g., b
>> could import from a as well). Consequently, you generally need to
>> create the instance object for _all_ modules in the program first,
>> before you can start initializing _any_ of them (or the toplevel, for
>> that matter).
>
> I'm trying to demonstrate an algorithm that takes care of this.  Give me a specific example that you don't think works.

If I understand your description correctly, then consider this:

  module A { export module C = B.C; export module D { ... } }
  module B { export module C { ... }; export module D = A.D }

If you haven't created the instance object for B.C already, then how
do you initialise the binding for A.C? And because D works the other
way round, there is no way you can reorder the modules either, so that
one pass would work.
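
A minimal sketch of the two-phase approach argued for here, again in plain
JavaScript and under stated assumptions: the declaration format and the
functions allocate and link are hypothetical, instance objects are modelled
as plain objects, and every alias is assumed to target a module body
directly (chains of aliases would need the iteration discussed earlier).
Phase 1 creates the instance object for every module body and only records
the aliases. Phase 2 links the aliases once all targets exist.

```javascript
// module A { export module C = B.C; export module D { ... } }
// module B { export module C { ... }; export module D = A.D }
const program = {
  A: { body: { C: { alias: ["B", "C"] }, D: { body: {} } } },
  B: { body: { C: { body: {} }, D: { alias: ["A", "D"] } } },
};

// Phase 1: create an instance object for every module body, and record
// each alias binding as pending work instead of resolving it right away.
function allocate(decls, links) {
  const instance = {};
  for (const [name, decl] of Object.entries(decls)) {
    if (decl.alias) {
      links.push({ instance, name, path: decl.alias });
    } else {
      instance[name] = allocate(decl.body, links);
    }
  }
  return instance;
}

// Phase 2: every body instance exists now, so each alias can be bound
// regardless of the order in which the modules were declared.
function link(root, links) {
  for (const { instance, name, path } of links) {
    let target = root;
    for (const step of path) target = target?.[step];
    if (target === undefined) {
      throw new Error(`unresolved alias ${path.join(".")}`);
    }
    instance[name] = target;
  }
}

const links = [];
const root = allocate(program, links);
link(root, links);
```

After both phases, root.A.C and root.B.C are the same object, and so are
root.B.D and root.A.D. Initialising A completely before B, or B before A,
could not produce that result in one pass.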

> Note that my algorithm, at each lexical level, first looks at the exports of all the modules that are immediately defined (or referenced) within it.  It is only those exports that add names to the current level.  However, in general, the value that is accessed via the exported name is not yet relevant.

I think to initialise either module alias bindings or import bindings
you need the value already, in order to do the proper linking at this
step.

/Andreas

