Module linking (Was: The global object should not be the "global scope instance object")

Allen Wirfs-Brock allen at wirfs-brock.com
Mon Jan 30 10:17:51 PST 2012


On Jan 30, 2012, at 5:00 AM, Andreas Rossberg wrote:

> On 28 January 2012 02:08, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
>> I played around a bit to see if I could come up with a troublesome example
>> of the sort you may be thinking about.  What I came up with is the following:
>> 
>> <script>
>> module a {
>>    import {x:y} from b;
> 
> I think you wanted to say {y:x} here.

No, I think {x:y} means creating a binding for x that is linked to b.y.

> 
>>    module b{
>>       export let y = x;  //essentially this is let y=y
>>     }
>> }
>> </script>
>> 
>> 
>> 1)The script is parsed, and static semantic checks are made.  There are no
>> static errors.
> 
> Just to be clear: the static semantics is where most of the
> complication of modules lies. Depending on what exactly we want to
> allow, it amounts to a non-trivial type inference problem.

The static semantics I have in mind are all about disallowing duplicate declarations and hoisting. It's about names, not about values. I don't see where any type-like analysis comes into play.
>  
>> 2) module instantiation is performed for the block.  This instantiates each
>> module defined by the top level of the block, instantiating a module
>> includes producing the list of identifiers exported by the module. Each
>> identifier is associated with a new uninitialized binding. Instantiated
>> modules are not initialized (their body is not executed) at this time.
> 
> You need a new uninitialized binding for all identifiers in the
> module's local scope, not just the exported ones.
The non-exported declarations within a module don't need to be instantiated until module initialization. You probably could instantiate them during my module instantiation step, but I'm trying to identify only what must happen at that step.
> 
>> 3) An initialized binding for "a" is created in the top level environment
>> for the script.  (All top level bindings are instantiated at this point, if
>> there were any others.)  Note that the binding for a is initialized (it
>> references a module instance object) but the module itself is not yet
>> initialized.
>> 4) initialize module a
>>      5) module instantiation is performed for the body of module a.  This
>> instantiates a module instance for module b with exported identifier "y" and
>> its binding.
> 
> It's not quite that simple. Recursive initialization does not suffice,
> because there can be arbitrary import/export/aliasing cycles (e.g., b
> could import from a as well). Consequently, you generally need to
> create the instance object for _all_ modules in the program first,
> before you can start initializing _any_ of them (or the toplevel, for
> that matter).

I'm trying to demonstrate an algorithm that takes care of this.  Give me a specific example that you don't think works.  Note that my algorithm, at each lexical level, first looks at the exports of all the modules that are immediately defined (or referenced) within it.  It is only those exports that add names to the current level.  However, in general, the value that is accessed via the exported name is not yet relevant.

> 
>>      6) An initialized binding for "b" is created in module a's inner
>> environment (but module b is not yet initialized).
>>      7) A binding for "x" is created in module a's inner environment.
>>  The binding is linked to the binding of "y" exported from b.  Both bindings
>> share the same initialization state.  (currently uninitialized)
>>      8) initialize module b
>>             9) The binding for "y" that was created when module b was
>> instantiated is added to  module b's inner environment
>>             10) evaluate the LHS of the exported let;  the binding found for
>> "x" is uninitialized so we throw and the script terminates.
> 
> I suppose you mean RHS here and below.
Oops, yes.

> 
>> If evaluating LHS didn't have any dependencies upon uninitialized bindings
>> (say it was a constant or a function expression) we would continue as
>> follows:
>> 
>>             11) set the "y" binding to the value of the LHS and mark "y" as
>> initialized; this also marks the "x" binding in module a as initialized
>>      12) module b is now fully initialized
>> 13) module a is now fully initialized
>> 
>> I've only mentally walked through the steps but it looks to me like this
>> process will also work for circular dependencies such
>> as http://wiki.ecmascript.org/doku.php?id=harmony:modules_examples#cyclic_dependencies
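
As a rough illustration of steps 5 through 11 above, here is a minimal sketch of linked bindings that share one initialization state. The Binding class, bExports and aScope are hypothetical names invented for this sketch, not part of any proposal; an imported name is modeled simply as a second reference to the exporter's binding cell.

```javascript
"use strict";

// Hypothetical model of a binding: a value plus an "initialized" flag.
class Binding {
  constructor(name) {
    this.name = name;
    this.initialized = false;
    this.value = undefined;
  }
  // Reading an uninitialized binding throws, as in step 10.
  read() {
    if (!this.initialized) {
      throw new ReferenceError(this.name + " is not initialized");
    }
    return this.value;
  }
  initialize(value) {
    this.value = value;
    this.initialized = true;
  }
}

// Step 5: instantiating module b creates an uninitialized binding per export.
const bExports = { y: new Binding("y") };

// Step 7: "import {x:y} from b" links x in module a to b's binding for y.
// Both names refer to the same cell, so they share initialization state.
const aScope = { x: bExports.y };

// Step 10: evaluating "export let y = x" reads x while it is uninitialized.
let outcome;
try {
  bExports.y.initialize(aScope.x.read());
  outcome = "initialized";
} catch (e) {
  outcome = e.name;
}
console.log(outcome); // ReferenceError

// Step 11 (alternate path): with an initializer that has no such dependency,
// initializing y also makes x readable.
bExports.y.initialize(42);
console.log(aScope.x.read()); // 42
```

Sharing one cell is only one way to model "linked"; an implementation could equally forward reads through an accessor.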
> 
> Here is what my toy source-to-source translator generates for your example:
> 
> "use strict";
> {
> // Create:
> const _ = {};
> const __a = {};
> const __a_b = {};
> // Link:
> Object.defineProperty(__a_b, "y", {get: function() { return __a_b_y }});
> Object.freeze(__a_b);
> Object.freeze(__a);
> Object.freeze(_);
> // Run:
> let __a_x = __a_b_y;
> let __a_b_y = __a_x;
> }
> 
> Executing this snippet on V8 throws a reference error as expected,
> when evaluating the first let binding.  (The additional __a_x is due
> to my translator not supporting import renaming, so I expanded "import
> {y:x} from b" to "import y from b; let x = y".)
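
The reference error in the "Run" part of the translator output can be reproduced in isolation. The sketch below assumes an engine with ES2015-style let semantics; runPhase, a_x and b_y are shortened, hypothetical names standing in for the generated ones.

```javascript
"use strict";

// Returns the name of the error thrown by the two let declarations,
// or null if they complete normally.
function runPhase() {
  try {
    let a_x = b_y; // b_y is declared in this block but not yet initialized
    let b_y = a_x;
    return null;
  } catch (e) {
    return e.name;
  }
}

console.log(runPhase()); // ReferenceError
```

The first declaration fails because both let bindings exist from the start of the block, but neither has been initialized when the right-hand side of the first one is evaluated.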
> 
> More generally, in a program with modules, you basically have three
> phases of execution:
> 
> 1. Instantiation. Bindings to fresh instance objects are created for
> all modules in the program, and uninitialized bindings for all other
> (non-local) bindings in the program (i.e., hoisting the let's above).
> 2. Linking. The exported bindings of each module are installed on the
> respective instance objects. (For exported modules, these are data
> properties carrying the already created instance objects, for others,
> accessor properties forwarding to the yet uninitialized bindings.)
> 3. Execution. All non-module bindings' RHSs are evaluated in order of
> appearance, initializing the respective bindings.
> 
> Note that each phase is a separate recursion over the whole program.
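
A minimal sketch of the three phases, written by hand in the style of the translator output above. It assumes ES2015-style let semantics and uses a simplified program in which module b exports a constant instead of forming a cycle; runProgram, a, a_b and a_b_y are hypothetical names.

```javascript
"use strict";

function runProgram() {
  const trace = [];

  // 1. Instantiation: fresh instance objects for every module. The let
  //    binding declared in phase 3 already exists in this scope, but it
  //    is uninitialized until its declaration is evaluated.
  const a = {};
  const a_b = {};

  // 2. Linking: an exported module becomes a data property holding its
  //    instance object; any other export becomes an accessor that
  //    forwards to the (still uninitialized) binding.
  Object.defineProperty(a, "b", { value: a_b, enumerable: true });
  Object.defineProperty(a_b, "y", {
    get: function () { return a_b_y; },
    enumerable: true
  });
  Object.freeze(a_b);
  Object.freeze(a);

  // Reading through the accessor before execution reaches the
  // declaration hits the uninitialized binding.
  try {
    a.b.y;
  } catch (e) {
    trace.push(e.name);
  }

  // 3. Execution: right-hand sides are evaluated in order of appearance.
  let a_b_y = 42;
  trace.push(a.b.y);

  return trace;
}

console.log(runProgram()); // "ReferenceError", then 42
```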

I agree that there are logically three phases.  However, I'm not yet convinced that three separate traversals over the entire program are needed.  I still think they can be interwoven within a single breadth-first traversal that is driven off of the initialization (really, the evaluation) phase of a module body.  The "trick" is that each level peeks at the names (but not values) exported by the modules it immediately defines (or imports).  The exported name list of a module can be generated while the module is parsed, prior to any execution of the Program.
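
To make the idea concrete, here is a rough sketch of one possible reading of that single traversal. Everything in it is an assumption made for illustration: the shape of the parse result (name, exportNames, modules, body), the helper names, and the simplification that nested modules are initialized before the enclosing body runs rather than at the point of their declaration.

```javascript
"use strict";

// Hypothetical binding cell: value plus initialization state.
function makeBinding() {
  return { initialized: false, value: undefined };
}

// Instantiation needs only the export name list recorded at parse time.
function instantiate(decl) {
  const exports = {};
  for (const name of decl.exportNames) {
    exports[name] = makeBinding();
  }
  return { decl: decl, exports: exports, inner: {}, initialized: false };
}

// Initializing a module first instantiates the modules it immediately
// defines (peeking at names, not values), then descends into them.
function initialize(instance, order) {
  for (const [name, childDecl] of Object.entries(instance.decl.modules)) {
    instance.inner[name] = instantiate(childDecl);
  }
  for (const child of Object.values(instance.inner)) {
    initialize(child, order);
  }
  instance.decl.body(instance.exports, instance.inner);
  instance.initialized = true;
  order.push(instance.decl.name);
}

// Hypothetical parse result for: module a { module b { export let y = 1; } }
const program = {
  name: "a",
  exportNames: [],
  modules: {
    b: {
      name: "b",
      exportNames: ["y"],
      modules: {},
      body: function (exports) {
        exports.y.value = 1;
        exports.y.initialized = true;
      }
    }
  },
  body: function () {}
};

const order = [];
const top = instantiate(program);
initialize(top, order);
console.log(order.join(", ")); // b, a
```

The sketch does not model imports or cycles between sibling modules; it only shows that the names exported at each level can be bound before any nested module body has run.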

Allen

