Concerns about weak refs and weak maps.
brendan at mozilla.com
Thu Oct 28 19:11:14 PDT 2010
On Oct 28, 2010, at 5:41 PM, Hudson, Rick wrote:
> Hi Brendan,
> I think we all agree that EcmaScript, the language, should not contain threads with shared memory constructs such as locks or transactions. We seem to disagree as to whether EcmaScript, the implementation, should be multithreaded. If bridge builders, compiler writers, GC implementations, need better (non-application visible) tools
[hit |Pause| here]
What's an application? It is hard to make a priori judgments such as "non-application visible" when we're talking about JS "user code". It could be part of the app, or in a library, but the point is, it's written in JS. Not C++.
> then running their tools concurrently seems the way forward.
See above -- the bridges, etc. are pure JS, and by the agreement, no threads in JS. Ergo no overt concurrency.
I didn't pick on it, but your closing-paragraph phrase "simple efficient concurrent GC" is, as far as I can tell, an unsolved research problem. The Go language talks about such a GC, but its implementation lacks one at present, AFAIK. Hans Boehm has worked for a long time on researching some pair-wise combinations of words in that phrase, but all four? Not yet, and not in sight from where I sit.
"Efficient, concurrent GC" is achievable, but it is far from "simple", and if you can avoid it, your runtime is ceteris paribus simpler, smaller, less buggy, and possibly quite concurrent enough for your workloads and hardware parallelism.
Shared nothing is a wonderful thing. Concurrent and (in the limit) distributed GC are hairy. Copying is relatively cheap.
There will be domain-specific problems where copying is too expensive (too much data), and some lightweight snapshotting system, some particular solution to the "view/update" problem, with shared immutable data, is needed. But the solutions tend to be quite optimized and specialized, and do not amount to efficient, concurrent, and general-purpose GC.
(This is certainly our position at Mozilla Research in building Rust -- http://blog.mozilla.com/graydon/2010/10/02/rust-progress/.)
> Web workers that share immutable heap objects behind a message passing façade could provide reasonable performance without changing the (HTML5) standard.
HTML5 has structured cloning -- http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#structured-clone -- and it is going into Firefox 4.
Again, this, like other parts of current browser codebases, is implemented in C++, not JS. Even with a better systems programming language than C++, which JS is not a candidate to become, JS doesn't obviously need efficient, concurrent GC. V8 has a great generational copying GC, but it is single-threaded and that is a significant part of its win.
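As a rough illustration of the copy-don't-share model that structured cloning gives you, here is a minimal sketch using the global structuredClone() function. That function postdates this message and is an assumption here (modern browsers, Node.js 17 or later); the object and property names are made up for the example.

```javascript
// Assumption: the runtime exposes a global structuredClone().
// All names below are hypothetical, chosen only for illustration.
const original = { name: "config", tags: ["a", "b"], created: new Date(0) };
original.self = original; // the structured clone algorithm handles cycles

// Deep copy: the clone shares no mutable state with the original.
const copy = structuredClone(original);

// Mutating the copy leaves the original untouched.
copy.tags.push("c");

const result = {
  sameObject: copy === original,
  originalTagCount: original.tags.length,
  copyTagCount: copy.tags.length,
  cyclePreserved: copy.self === copy,
};
console.log(result);
```

The same algorithm is what postMessage applies to data sent to a worker, so each side ends up with its own copy and no heap object is shared between threads.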
> Optimizers running in a separate thread also seem like a good idea.
Sure, but these are not going to be implemented in JS in high-performance ways that depend on shared memory, even if immutable. This does not make the case for concurrent GC trumping the need for weak maps in pure-JS-bridges, runtimes, virtualization layers, security membranes, etc.
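To make the pure-JS use case concrete, here is a minimal sketch of the kind of wrapper cache a bridge or membrane might keep in a WeakMap. It assumes an engine with the now-standard WeakMap, Proxy, and Reflect; the names wrap, wrappers, and accessLog are hypothetical and do not come from any particular library.

```javascript
// Hypothetical wrapper cache: maps each target object to its one wrapper.
// Keys are held weakly, so an entry does not keep its target alive.
const wrappers = new WeakMap();
const accessLog = [];

function wrap(target) {
  let proxy = wrappers.get(target);
  if (proxy === undefined) {
    proxy = new Proxy(target, {
      get(obj, key, receiver) {
        accessLog.push(String(key)); // record every property read
        return Reflect.get(obj, key, receiver);
      },
    });
    wrappers.set(target, proxy);
  }
  return proxy;
}

const point = { x: 1, y: 2 };
const a = wrap(point);
const b = wrap(point); // same target, so the cached wrapper is returned

console.log(a === b, a.x, accessLog);
```

With an ordinary Map, or an array of key/value pairs, the cache itself would keep every wrapped target reachable forever. With a WeakMap the entry becomes collectable once nothing else references the target, though exactly when that happens is up to the GC and cannot be observed from this code.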
> Moving the scans of weak maps into a stop-the-world GC only makes the GC more intrusive while removing the application writer's ability to optimize their applications.
I think we disagree on the inevitability of concurrent GC that trumps pure-JS weak map requirements.
> In the GC literature 50 milliseconds latency is considered the upper bound if one wishes to provide an acceptable client experience. 10 milliseconds is what one wants for a game experience. This is increasingly hard to achieve with a stop-the-world collector as the memory footprint grows while CPU speed does not.
The state of the art with JS today is a single-threaded, generational copying collector. We don't even have an existence proof of an efficient concurrent GC (leave aside simple). We don't have more than one WeakMap implementation, and the one we have is not (yet) in a generational GC setting.
So it's premature to throw trump cards, and the no-threads-ever-in-JS agreement casts doubt on the need for concurrent GC even under the hood. We're looking at more specialized optimizations to shared-nothing that are strictly simpler.
> In the long run I believe that EcmaScript will not be able to compete with other languages in providing highly responsive client side applications without a concurrent GC.
I'll take the other side of this bet.
> But to answer your direct question:
> >> If you remove the concurrent GC concern, then what is your sense?
> GC latency will increase as memory footprint and use of weak maps increases. Large applications using weak maps will have performance bugs that will end up on the GC implementer’s desk where they will be hard to diagnose and fix.
True, WeakMaps up the ante, but nowhere near to the complexity of efficient, concurrent GC -- which does not seem to be on anyone's drawing boards, yet we have more than a few competitive JS engines, and with nodejs we also have a high-performance server-side embedding (there are others coming along).
What's more, this apples-to-oranges argument does not trump the need for WeakMaps in pure-JS code. So if we can implement WeakMaps in single-threaded JS with good GC performance -- to be demonstrated, I agree, but not obviously out the window unless you stack the deck by requiring concurrency -- then why should we deny hungry developers their WeakMap apples, just in case some runtime needs a concurrent GC orange in the future?
> Perhaps if folks talk privately to the top half dozen Java GC implementers EcmaScript might be able to avoid some of their pain.
Java has threads. JS does not and will not. It would help me if you could re-answer, this time really without any concurrency being dragged in via Java (i.e., without "concurrent GC" being assumed).