Function#fork
David Bruant
bruant.d at gmail.com
Mon Sep 24 07:12:05 PDT 2012
On 24/09/2012 14:43, Jussi Kalliokoski wrote:
> Hello everyone,
>
> I've been thinking a lot about parallel processing in the context of
> JavaScript, and this is really a hard problem. I'm very curious to
> hear what everyone's opinions are about its problems and so forth,
> but I don't think an open question like that will give very
> interesting results, so I have an example problem for discussion
> (while it seems like a bad idea to me, and unlikely to ever get to the
> language, what I want to know is everyone's reasoning behind their
> opinions whether it's for or against).
The concurrency strawman [1] defines a concurrency model as well as a
parallelism model. So far, it has been the favored model for
general-purpose parallelism.
Different use cases are efficiently solved by different forms of
parallelism. For instance, there is another strawman on data parallelism
[2] for the case of applying the same computation to a large amount of data.
> What if we introduce Function#fork(), which would call the function in
> another thread that shares state with the current one.
Shared state (no matter how much) always has the same story. Two
computation units want to access the shared state concurrently, but for
the sake of the shared state's integrity, they can't access it
simultaneously. So we need to define a form of mutex (for "MUTual
EXclusion") that lets a computation unit express its intention to use
state that only one computation unit should use at a time. With mutexes
as we know them, used at scale, you end up with deadlocks, which are
nasty bugs to find and debug.
This is all a consequence of the idea of shared state.
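To make the integrity problem concrete, here is a minimal sketch that
simulates two computation units on a single thread. It is only an
illustration: the generator-based units and the round-robin scheduler
(increment, runInterleaved) are hypothetical names, not part of any
strawman, and each `yield` merely stands in for a point where a real
thread could be preempted.

```javascript
// Two simulated "computation units" incrementing one shared counter.
// Each `yield` marks a point where a scheduler may switch units.
function* increment(shared, times) {
  for (let i = 0; i < times; i++) {
    const seen = shared.value; // read the shared state
    yield;                     // possible preemption between read and write
    shared.value = seen + 1;   // write based on a possibly stale read
  }
}

// Hypothetical round-robin scheduler: advance every unit by one step
// per pass until all of them are done.
function runInterleaved(units) {
  let pending = units.slice();
  while (pending.length > 0) {
    pending = pending.filter((unit) => !unit.next().done);
  }
}

const shared = { value: 0 };
runInterleaved([increment(shared, 1000), increment(shared, 1000)]);
console.log(shared.value); // fewer than 2000: updates were lost
```

Both units read the same value before either writes it back, so each
pair of increments counts only once. Preventing that interleaving is
what a mutex is for, and it is also where the deadlock risk comes in.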
In this story, two parts can be attacked to fix the problem: either
define something better than mutexes as we know them (I have no idea
what that would look like, but it's an interesting idea), or get rid of
shared state.
The current concurrency strawman is doing the latter.
One annoying thing about naive no-shared-state systems as we know them
is that everything has to be copied from one computation unit to another.
That's not exactly true though. It's always possible to implement a
copy-on-write mechanism.
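As a rough sketch of the copy-on-write idea, assuming a hypothetical
CowArray wrapper (not an existing API), two handles can share one
backing array and only pay for a copy when one of them writes. A real
implementation would live inside the VM and work across computation
units; this single-threaded version only shows the bookkeeping.

```javascript
// Hypothetical copy-on-write wrapper: handles share one backing array
// until one of them writes, at which point the writer gets a private copy.
class CowArray {
  constructor(data, owners = { count: 1 }) {
    this.data = data;
    this.owners = owners; // shared record of how many handles use `data`
  }

  // "Copy" for another computation unit: O(1), no data is duplicated.
  share() {
    this.owners.count++;
    return new CowArray(this.data, this.owners);
  }

  get(index) {
    return this.data[index];
  }

  set(index, value) {
    if (this.owners.count > 1) {
      // Another handle still sees this backing array: detach first.
      this.owners.count--;
      this.data = this.data.slice();
      this.owners = { count: 1 };
    }
    this.data[index] = value;
  }
}

const original = new CowArray([1, 2, 3]);
const copy = original.share();
const sharedBeforeWrite = copy.data === original.data; // no copy made yet
copy.set(0, 42); // first write triggers the actual copy
console.log(original.get(0), copy.get(0)); // 1 42
```

The receiver sees what looks like its own copy, but the cost of copying
is deferred until a write happens, and avoided entirely for read-only use.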
Another idea is to define ownership over data. HTML5 defines
"transferable" objects [3], which can be passed back and forth from
worker to worker but can only be used in one worker at a time. Rust
has a concept of "unique pointer", which is the same idea.
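The ownership transfer can be observed without spawning a worker. The
sketch below assumes a runtime that provides the structuredClone()
global (recent browsers and Node.js), which is newer than this thread
and is used here only because it accepts the same kind of transfer list
as a worker's postMessage().

```javascript
const buffer = new ArrayBuffer(8);
new Uint8Array(buffer)[0] = 7;

// Listing the buffer in `transfer` moves it instead of copying it.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(buffer.byteLength);        // 0: the original handle is detached
console.log(moved.byteLength);         // 8
console.log(new Uint8Array(moved)[0]); // 7
```

After the transfer, the sender's handle is detached and only the
receiving side can reach the bytes, so there is never a moment where
two parties can touch the same memory.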
Another idea would be to have data structures which live in two or more
computation units, exposing just an interface to each, and whose integrity
would be taken care of under the hood by the "VM" and not by client code.
This is what local storage does, for instance.
I will fight very hard against the idea of shared state, because its
benefits are very few compared to what it costs in large-scale programs.
David
[1] http://wiki.ecmascript.org/doku.php?id=strawman:concurrency
[2] http://wiki.ecmascript.org/doku.php?id=strawman:data_parallelism
[3] http://updates.html5rocks.com/2011/12/Transferable-Objects-Lightning-Fast