Dataflow concurrency and promises

David-Sarah Hopwood david-sarah at jacaranda.org
Tue Sep 29 20:49:00 PDT 2009


Brendan Eich wrote:
> Beyond this, concurrency via workers is great for certain use-cases but
> not enough for others.
> 
> In TC39 we are talking about formalizing the run-to-completion execution
> model of JS, along with asynchronous message passing concurrency. In
> particular, we're looking at Promises (precedent from E) and Futures
> (differently, in MultiLisp and Alice-ML). At least one contributor on
> es-discuss has advocated lower-level components such as dataflow variables.

That was presumably me:
<https://mail.mozilla.org/pipermail/es5-discuss/2009-May/002557.html>.
However, I don't agree that dataflow variables are "lower level".

> It's too early to predict what we'll do but I hear strong consensus in
> favor of asynchronous messaging and shared-nothing, with higher-level
> abstractions such as Promises favored over lower-level concurrent
> programming features such as dataflow variables.

"Dataflow variable", "promise", and "future" are different kinds of delayed
reference; they are at about the same level of abstraction, and are very
similar to each other. The fact that different terminology is used for
them, obscuring the similarities, is something of a historical accident.

See <http://en.wikipedia.org/wiki/Futures_and_promises>. (Full disclosure:
I wrote quite a bit of this article.)

There are, however, some differences. Attempting to synchronously use
an unbound dataflow variable will block until it is bound. Attempting
to synchronously use an unbound (unresolved) promise will, at least in E,
throw an exception. Note that in both cases, you can use a 'when' construct
to wait for the delayed reference to become bound/resolved, and then the
observable semantics are the same.
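
Here is a hedged sketch of that difference. DataflowVar, spawnThread,
makePromise, 'when', and print are invented, illustrative names only,
not any existing or proposed JavaScript API:

    // Hypothetical dataflow variable: a synchronous read of an unbound
    // variable blocks the reading thread until some other thread or
    // vat binds it.
    var x = new DataflowVar();
    spawnThread(function () { x.bind(42); });  // bound elsewhere, later
    print(x.get() + 1);                        // blocks, then prints 43

    // Hypothetical E-style promise: synchronous use of an unresolved
    // promise throws instead of blocking.
    var pair = makePromise();                  // { promise, resolver }
    try {
      pair.promise.get();                      // throws: not yet resolved
    } catch (e) {
      // handle "not yet resolved"
    }
    pair.resolver.resolve(42);                 // resolved later by someone

    // In both models, a 'when' construct waits for the delayed
    // reference to become bound/resolved, and the observable semantics
    // are then the same:
    when(x, function (v) { print(v + 1); });
    when(pair.promise, function (v) { print(v + 1); });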


Because the dataflow model (as supported in Oz) does not treat
synchronously accessing an unbound delayed reference as an error,
it is more general: it allows *either* a "dataflow style" of programming
that depends on blocking, or an "event-loop style" that avoids blocking.
An event-loop model (as supported in E) enforces the event-loop style.

I am not arguing that a more general model is necessarily better.
A less general model can sometimes be preferable, because there is an
advantage in having all code written in the same style.

OTOH, the dataflow model does provide more flexibility: a programmer
can choose to use a pure dataflow style, or a pure event-loop style,
or to combine them. Combining them is not entirely without difficulty,
but that difficulty is restricted to liveness issues rather than safety
issues (that is, the combined programming style is still highly resistant
to low-level race conditions, although more care is needed to avoid
deadlocks). *If* a programmer chooses to use the combined style for
programs that need it, there is IMHO a significant payoff in
expressiveness.
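
Here is a hedged sketch of the liveness hazard in the combined style,
again using the invented DataflowVar and 'when' names from above, and
assuming a single thread of control per vat:

    var x = new DataflowVar();

    // Event-loop style: arrange for x to be bound in a later turn of
    // this same vat.
    setTimeout(function () { x.bind(42); }, 0);

    // A blocking read in the *current* turn would stall this vat's
    // only thread, so the timeout callback could never run and x would
    // never be bound -- a deadlock (a liveness failure), but not a
    // data race (a safety failure), since no state is ever observed
    // half-updated:
    //
    //     var v = x.get();          // never returns
    //
    // The non-blocking form of the same read stays live:
    when(x, function (v) { print(v + 1); });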

(This is the model that ideally I would like to be able to program in all
the time. Of course you don't have to pay attention to my personal
preference, but there is plenty of support for it among Oz programmers,
and in the book "Concepts, Techniques, and Models of Computer Programming".)

The dataflow model potentially allows easier interoperation with APIs
that are not based on event loops.


It also seems that there is some confusion about "asynchronous messaging
and shared-nothing". First, I agree completely that we want to support
asynchronous message passing in JavaScript. In fact, it combines just
as naturally with dataflow variables as it does with promises.
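
One hedged illustration of why the combination is natural (otherVat,
loggerVat, send, and 'when' are, again, purely invented names): the
delayed reference returned by an asynchronous send can itself be
passed around before it is ever bound.

    // The send returns immediately with a delayed reference to the
    // eventual result, whether that reference is a promise or a
    // dataflow variable.
    var sum = otherVat.send("add", [6, 7]);

    // The unbound reference can be forwarded in further messages
    // without waiting for it...
    loggerVat.send("log", [sum]);

    // ...and consumed in event-loop style once it becomes bound:
    when(sum, function (v) { print("6 + 7 = " + v); });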

"Shared-nothing" is a term used in the Erlang community to contrast
message-passing models with shared-memory models. The term is
inaccurate; it is not the case that processes "share nothing" in
Erlang or in other message-passing languages. The restriction the term
actually refers to is that processes do not have synchronous shared
access to the same mutable state.

The following forms of sharing between processes are consistent with
maintaining the advantages of message-passing models relative to
shared-memory models:

 - processes can share references to other processes.

   In the case of a vat-based model (where vats are the units of
   concurrency and each object belongs to a vat), an object can
   have a reference to an object in another vat, but it can only
   use the reference by *asynchronous* message passing. References
   within the same vat can *also* be used via synchronous message
   passing.

   The model I would suggest for JavaScript would be like E in
   this respect: it provides significant and useful additional
   expressiveness. The partial isolation between vats, which
   allows vats to fail or be destroyed independently, is also
   very useful in a web context. For example, the default behaviour
   (when no additional vats are created explicitly) could be that a
   vat is created for each JavaScript context. (There is a sketch of
   these rules after this list.)

 - processes can have shared access to a single copy of a deeply
   immutable structure. This is semantically equivalent to copying the
   structure, but with the important benefit of reduced memory usage.

 - processes can have shared access to declarative structures --
   that is, structures that can be extended but not mutated.
   This is in practice relatively easy to reason about, and does
   not introduce the same programming difficulties as a
   shared-memory model.

   (It does introduce a limited form of nondeterminism: if two
   processes attempt to make a conflicting extension, the program
   will fail. This is a programming error. Programs without such
   errors behave deterministically, and programs with such errors
   deterministically fail, but the side-effects that occur before
   they fail may be nondeterministic.)
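
Here is a hedged sketch of the vat rules in the first point above.
Vat.spawn, send, and 'when' are invented names for illustration, not
an actual or proposed API:

    // Each object belongs to exactly one vat, and each vat runs one
    // turn at a time.
    var other = Vat.spawn("worker.js");

    // A reference to an object in another vat may only be used via
    // *asynchronous* message passing: the send is queued and returns
    // a delayed reference to the eventual result.
    var greeting = other.send("makeGreeting", ["world"]);
    when(greeting, function (s) { print(s); });  // runs in a later turn

    // References within the same vat can also be used synchronously.
    var local = {
      makeGreeting: function (who) { return "hello, " + who; }
    };
    print(local.makeGreeting("world"));

    // If 'other' fails or is destroyed, only the delayed references it
    // handed out break; the vats holding them keep running.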

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com


