What is the status of Weak References?
bruant.d at gmail.com
Fri Feb 1 05:43:27 PST 2013
Le 01/02/2013 12:21, Kevin Gadd a écrit :
> On Fri, Feb 1, 2013 at 2:06 AM, David Bruant <bruant.d at gmail.com> wrote:
>> I don't understand the connection between the lack of weak references and
>> emulating a heap in a typed array.
> For an algorithm that needs weak references to be correct, the only
> alternative is to opt out of the garbage collector entirely and write
> your own collector. This is basically the
> model used by Emscripten applications compiled from C++ to JS - you can use a
> C++ weak reference type like boost::weak_ptr, but only because the
> entire application heap is stored inside of a typed array and not
> exposed to the JS garbage collector. This is great from the
> perspective of wanting near-native performance, because there are JS
> runtimes that can turn this into incredibly fast native assembly, but
> it comes with obvious disadvantages, so that is why I bring it up -
> weakref support in JS
> would make it possible to express these algorithms in hand-written,
> readable, debuggable JS.
Sorry for repeating myself, but I still don't see the connection between
the lack of weak references and emulating a heap in a typed array.
Phrased as a question:
Would it be possible to compile a C++ program in JS with weakrefs
without emulating a heap in a typed array? Because of pointer
arithmetic, I doubt it, but I'm curious to learn if that's the case.
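For readers unfamiliar with the technique under discussion, here is a
minimal sketch (not Emscripten's actual memory layout or ABI) of a C-style
heap emulated in a typed array. "Pointers" are just integer offsets into
the buffer, which is why pointer arithmetic works and why the JS garbage
collector never sees individual objects:

```javascript
// Hypothetical, simplified heap emulation; names are illustrative.
const HEAP = new Int32Array(1024); // the entire application "heap"
let brk = 0;                       // naive bump allocator, no free()

function malloc(words) {           // returns a "pointer": an index
  const ptr = brk;
  brk += words;
  return ptr;
}

// struct Point { int x; int y; } — field access is pointer arithmetic
const p = malloc(2);
HEAP[p + 0] = 3; // p->x
HEAP[p + 1] = 4; // p->y

// Nothing in this heap is individually collectable by the JS GC; freeing
// memory (or weak semantics like boost::weak_ptr) must be implemented by
// the compiled program's own allocator.
```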
>> Garbage collectors have evolved and cycles aren't an issue any longer, weak
>> references or not.
> Cycles are absolutely an issue, specifically because JS applications
> can interact with systems that are not wholly managed by the garbage
> collector. The problem in this case is a cycle being broken *too
> early* because the application author has to manually break cycles. To
> present a couple simple examples:
> I have a top-level application object that manages lower-level 'mode'
> objects representing screens in the application. The screens, when
> constructed, attach event listeners to the application object. Because
> the application manages modes, it needs to have a list of all the
> active modes.
> * The event handler closures can accidentally (or intentionally)
Last I heard, it's very difficult to accidentally capture a reference in
a closure, because modern engines only retain the variables a closure
actually uses (determined by looking at variable names). So for an object
to be captured by a closure, it has to be referenced. Hence
"intentionally".
> capture the mode object, creating a real cycle involving a dead mode
> that will not be collected by even the most sophisticated GC.
The problem is not about cycles. It's about abusively holding references
to objects that are no longer needed.
> * If I am not extremely cautious, when a mode is destroyed I might
> forget (or fail) to remove its associated event handlers from the
> event handler list, causing the event handler lists to grow over time
> and eventually degrade the performance of the entire application.
> * I have to explicitly decide when a mode has become dead
Yes. I would say "understand" rather than "decide", but yes. And that's
a very important point that most developers ignore or forget. GC is an
undecidable problem, meaning that there will always be cases where a
human being needs to figure out at which point in the object lifecycle
it is no longer needed, and either free it in languages where that's
possible or
make it collectable in languages with a GC. There will be such cases
even in languages where there are weak references.
Nowadays, making an object collectable means cutting all references
(even if the object is not involved in a cycle!) through which the
mark-and-sweep algorithm (as far as I know, all modern engines use this
algorithm) could still reach it.
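The scenario described above can be sketched in a few lines (all names
are illustrative, not from any real application): the handler closure
retains the mode, so forgetting one `destroy()` call is enough to keep a
"dead" mode reachable and the handler list growing.

```javascript
class Application {
  constructor() { this.handlers = []; }
  on(fn) { this.handlers.push(fn); }
  off(fn) { this.handlers = this.handlers.filter(h => h !== fn); }
}

class Mode {
  constructor(app) {
    this.app = app;
    this.onTick = () => this.update(); // closure captures the mode:
    app.on(this.onTick);               // app -> handler -> mode -> app
  }
  update() {}
  destroy() { this.app.off(this.onTick); } // forget this call and the
}                                          // mode stays strongly reachable

const app = new Application();
const mode = new Mode(app);
const leakedBefore = app.handlers.length; // 1: the handler retains the mode
mode.destroy();
const leakedAfter = app.handlers.length;  // 0: now the mode is collectable
```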
> In this scenario, weak references are less essential but still
> tremendously valuable: An event handler list containing weak
> references would never form a cycle, and would continue to work
> correctly as long as the mode is alive. It is also trivial to prune
> 'dead' event handlers from a list of weak event handlers.
When does the GC decide to prune dead event handlers? Randomly? Or maybe
when you've performed some action meaning that the corresponding mode is
no longer needed?
> The need to
> explicitly tag a mode as dead and break cycles (potentially breaking
> ongoing async operations like an XHR) goes away because any ongoing
> async operations will keep the object itself alive (even if it has
> been removed from the mode list), allowing it to be eventually
> collected when it is safe (because the GC can prove that it is safe).
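JS did eventually gain weak references, long after this thread: `WeakRef`
landed in ES2021. The weak event-handler list Kevin describes could be
sketched as follows (a minimal illustration, not a production design):

```javascript
// A handler list that holds its targets weakly: it never forms a cycle,
// and entries whose target has been collected are trivially pruned.
class WeakListenerList {
  constructor() { this.refs = []; }
  add(target) { this.refs.push(new WeakRef(target)); }
  emit(event) {
    // prune entries whose target has been garbage-collected
    this.refs = this.refs.filter(ref => ref.deref() !== undefined);
    for (const ref of this.refs) ref.deref()?.handle(event);
  }
}

const list = new WeakListenerList();
const mode = { handled: 0, handle(e) { this.handled++; } };
list.add(mode);
list.emit("tick"); // mode is strongly reachable here, so it gets the event
```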
> I decide to build a simple pool allocator for some frequently used JS
> objects, because JS object construction is slow. This is what
> optimization guides recommend.
Are these guides aware of bump allocators? Or that keeping objects alive
longer than necessary puts pressure on generational garbage collectors?
> I pull an object instance out of the
> pool and use it for a while, and return it to the pool.
> * If I forget to return an object to the pool when I'm done with it,
> it gets collected and eventually the pool becomes empty.
> * If I mistakenly return an object to the pool when it actually
> escaped into a global variable, object attribute, or closure, now the
> state of the object may get trampled over if it leaves the pool again
> while it's still in use.
> * If I mess up my pool management code I might return the same object
> to the pool twice.
I'm sorry, but all your examples are "if I forget, if I make a
mistake...". I don't think the possibility of bugs is a good
justification for adding new features to a language. If you really care
about memory, make your
algorithms right, spend the necessary time to understand the lifecycle
of your own objects to understand when to release them.
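The pool hazards enumerated in the quoted bullets are easy to reproduce
in a sketch (a deliberately naive pool, names illustrative): nothing
stops a double release or a use-after-release, and the corruption is
silent.

```javascript
class Pool {
  constructor(create) { this.create = create; this.free = []; }
  acquire() { return this.free.pop() ?? this.create(); }
  release(obj) { this.free.push(obj); } // no check: double-releasing, or
}                                        // releasing an object that escaped,
                                         // silently corrupts later users

const pool = new Pool(() => ({ x: 0 }));
const a = pool.acquire();
pool.release(a);
pool.release(a);          // bug: double release goes unnoticed...
const b = pool.acquire();
const c = pool.acquire(); // ...so these "two" objects are the same one,
                          // and writes to b trample c
```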
> In this scenario, weak references would allow you to make the pool
> implementation wholly automatic (though that would require the ability
> to resurrect collected objects - I'm not necessarily arguing for that
> feature). I should point out that this scenario is complicated by JS's
> lack of an equivalent to RAII lifetime management in C++ and the
> 'using' block in C# (you can vaguely approximate it with try/finally
> but doing so has many serious downsides) - given RAII or a 'using'
> equivalent, you could manually ref-count pool entries instead of using
> weakrefs. But I hope you can see the general gist here of solving a
> problem the GC should be solving?
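The try/finally approximation of C#'s 'using' mentioned above looks
roughly like this (a sketch with an illustrative toy pool; as Kevin
notes, nothing enforces the pattern and it is awkward across async
boundaries):

```javascript
const pool = {
  free: [{ x: 0 }],
  acquire() { return this.free.pop() ?? { x: 0 }; },
  release(obj) { this.free.push(obj); },
};

// Scope-bound acquisition: the entry goes back to the pool when the
// callback exits, whether it returns normally or throws.
function withPooled(fn) {
  const obj = pool.acquire();
  try {
    return fn(obj);
  } finally {
    pool.release(obj); // runs even if fn throws
  }
}

const result = withPooled(o => { o.x = 42; return o.x; });
try { withPooled(() => { throw new Error("boom"); }); } catch (e) {}
// the entry is back in the pool even though the callback threw
```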
> These examples are simplified but are both based on real world
> applications I've personally worked on where the listed issues caused
> us real grief - crashes and leaks from buggy manual lifetime
> management, inferior performance, etc.
>> I'm not part of TC39, but I'm largely opposed to anything that makes GC
>> observable. It introduces a source of non-determinism; that is the kind of
>> things that brings bugs that you observe in production, but unfortunately
>> didn't notice and can't reproduce in development environment. Or if you
>> observe them when running the program, you don't observe it in debugging
> My argument here is not that non-determinism is good. My argument is
> that an application that runs non-deterministically in every web
> browser is better than an application that deterministically doesn't
> run in any web browser because it cannot be expressed accurately in JS.
:-) Interesting argument.
> It is
> possible that the set of these applications is a small set, but it
> certainly seems of considerable size to me because I encounter these
> problems on a regular basis. The developers that I speak to who are
> building these applications are being forced to choose Native Client
> or Emscripten because their applications are not expressible in JS.
I don't know enough languages to tell, but I wonder until which point
should JS import other language features for the sake of porting programs.
Where is the JS equivalent of Scala actors? There are probably some
very interesting Scala programs that could be ported to the web too.
> I'm personally developing a compiler that targets JS and the lack of
> weak references (or RAII/'using') dramatically limits the set of
> programs I can actually convert to JS because there are lots of
> applications out there that simply need this functionality.
ES6 introduces revocable proxies, which could be used to implement
"explicit weakrefs" (you need to say explicitly when you don't want to
use an object anymore).
One idea would be to add some source annotations to tell at a coarse
level when some object is guaranteed to be not needed anymore. It would
compile to revoking the proxy.
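A sketch of that "explicit weakref" idea with ES6 `Proxy.revocable`:
consumers hold only the proxy, and revoking it is the explicitly chosen
point at which the object is declared dead (the proxy stops retaining the
target, and any further access throws).

```javascript
const target = { data: "large buffer" };
const { proxy, revoke } = Proxy.revocable(target, {});

const alive = proxy.data; // "large buffer" while the proxy is live

revoke(); // the explicit "this object is no longer needed" point;
          // the revoked proxy drops its reference to the target

let threw = false;
try {
  proxy.data; // any operation on a revoked proxy throws a TypeError
} catch (e) {
  threw = true;
}
```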
> If this is
> something that can't be done in JS, or isn't possible until ES7/ES8, I
> understand, but I would be very disappointed if the only reasons for
> it are the hypothetical dread spectres of non-determinism and
> information leaks.
Each of these reasons seems to be valid to me.