Dataflow concurrency instead of generators

Graydon Hoare graydon at mozilla.com
Fri May 15 16:20:26 PDT 2009


On 15/05/09 11:15 AM, Jason Orendorff wrote:

> Even the simplest generator use cases can't be implemented using
> threads and dataflow variables in that kind of model.  To take a silly
> example:
>
>    function iter(arraylike) {
>        for (var i = 0; i < arraylike.length; i++)
>            yield arraylike[i];
>    }
>
>    for (elt in iter(document.getElementsByTagName("A")))
>        ...
>
> DOM NodeLists are mutable and so are the nodes being yielded.

Oh, this will *work* in the supposed "dataflow" system. It works in the 
libraries David-Sarah pointed to. It'll just be completely unsafe in the 
bolted-onto-the-language cases (i.e. "not Oz"), because in those cases the 
"dataflow" system is being used as little more than a voluntary 
half-duplex channel scheme. The key point is "voluntary": threads can still 
interfere via the shared graph, bypassing the channels. And everyone will.
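
To make the "voluntary" point concrete, here is a minimal single-threaded 
sketch. The `Channel` class is hypothetical (it is not taken from any of 
the libraries mentioned in this thread); it only illustrates that a 
channel which passes references does nothing to stop the sender from 
touching the object afterwards.

```javascript
// Hypothetical channel, for illustration only: send() passes a reference.
class Channel {
  constructor() { this.queue = []; }
  send(value) { this.queue.push(value); }   // no copy is made
  receive() { return this.queue.shift(); }
}

const chan = new Channel();
const node = { tag: "A", href: "/old" };

chan.send(node);          // "transmit" the node through the channel
node.href = "/changed";   // the sender still holds the reference and mutates it

const received = chan.receive();
console.log(received.href);      // "/changed"
console.log(received === node);  // true: same object on both sides
```

With real threads the mutation could land at any moment, which is the 
interference described above; nothing in the channel itself prevents it.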

Unfortunately ES, Ruby, Python, etc. are all cyclic-memory, 
universal-shared-ownership OO languages, so you are transmitting nodes 
in your shared graph between threads. Even if you lock each thread in a 
box, you can't reasonably make a full copy of the world on each 
transmission, and confinement isn't designed into the ownership 
structure, so you're forced to do something cheesy like "attempt deep 
copy and throw when someone tries to transmit any cyclic bits" as a 
heuristic for preventing transmission of something that might be 
shared-ownership.
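
A rough sketch of that heuristic, assuming only plain objects and arrays 
are being transmitted. `deepCopyOrThrow` is a made-up name, not an API 
from any spec or library, and a real scheme would also have to decide 
what to do with functions, DOM nodes, and so on.

```javascript
// Hypothetical helper: deep-copies plain objects/arrays, throws on a cycle.
function deepCopyOrThrow(value, onPath = new Set()) {
  // Primitives copy by value. Functions are returned as-is here; a real
  // transmission scheme would more likely reject them.
  if (value === null || typeof value !== "object") return value;
  if (onPath.has(value)) {
    throw new TypeError("cyclic structure, refusing to transmit");
  }
  onPath.add(value);                 // value is an ancestor of what follows
  const copy = Array.isArray(value) ? [] : {};
  for (const key of Object.keys(value)) {
    copy[key] = deepCopyOrThrow(value[key], onPath);
  }
  onPath.delete(value);              // done with this branch
  return copy;
}

const tree = { name: "root", kids: [{ name: "leaf" }] };
const sent = deepCopyOrThrow(tree);
sent.kids[0].name = "changed";
console.log(tree.kids[0].name);      // "leaf": the original is untouched

const loop = { name: "a" };
loop.self = loop;                    // cyclic
let threw = false;
try { deepCopyOrThrow(loop); } catch (e) { threw = e instanceof TypeError; }
console.log(threw);                  // true
```

Note that this only tracks objects on the current path, so an object 
reachable twice without a cycle is silently duplicated rather than 
rejected; that is one more way the heuristic is approximate.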

This is a design problem in the language. You want to be writing in 
Newsqueak or Erlang or something properly designed for channels (CoW 
acyclic ownership, or immutability, take your pick). ES isn't. This is 
why workers[1] live in isolated universes and communicate via JSON (or 
possibly this has been changed to an acyclic "structured clone"[2]). 
Anyway, both approaches do the "deep copy and throw on cyclic" thing. 
It's a kludge, but the best you can do here. The language wasn't 
designed to blur concurrency and iteration. If you try, you'll just get 
a mess (or systematic lack of safety).

Generators are comparatively much lighter (e.g. no OS thread) and can be 
used to structure walks through existing object graphs, by a single 
thread, without being locked in a box or marshalling yield values. 
They're just an iteration-modularity device.
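
As an illustration, here is a small sketch of a generator walking an 
existing, cyclic object graph on a single thread. It assumes the 
`function*` syntax of later ECMAScript editions rather than the plain 
`function` form quoted above, and the `walk` function and `children` 
property are invented for the example.

```javascript
// Depth-first walk over an existing (possibly cyclic) object graph.
// Assumes every node has a `children` array (hypothetical shape).
function* walk(root) {
  const seen = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (seen.has(node)) continue;    // cycles are fine, just skip repeats
    seen.add(node);
    yield node;                      // the very same object, not a copy
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
}

const leaf = { name: "leaf", children: [] };
const root = { name: "root", children: [leaf] };
leaf.children.push(root);            // make the graph cyclic

const visited = [];
for (const node of walk(root)) {
  node.visited = true;               // mutate the live node in place
  visited.push(node.name);
}
console.log(visited.join(", "));     // "root, leaf"
```

Nothing is copied or marshalled: the consumer gets the live nodes and can 
mutate them, which is exactly what the copy-and-throw approach rules out.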

-Graydon

[1] http://www.whatwg.org/specs/web-workers/current-work/
[2] http://www.w3.org/TR/html5/infrastructure.html#structured-clone

