WeakMap not the weak needed for zombie views

Katelyn Gadd kg at luminance.org
Mon Jul 7 01:41:35 PDT 2014


Jussi, one thing about your (totally correct) statements here is that
you're addressing this from the perspective of 'I want to observe GC
reliably from user code'. But that's not really what is desired in
most cases.

For example, forcing a GC whenever iterating the weakmap would ensure
you don't get a reference to an 'effectively dead' object, but that is
rarely what anyone actually wants. The point isn't that you don't
process objects that are *about* to die; the point is that weakrefs
ensure that the GC can collect these object graphs that otherwise form
uncollectable cycles.

Once the GC can collect them, the additional layer you want on top is
that collections like a weakmap don't expose *dead* (already
collected) objects to the user. It's fine if iteration yields an
object that is about to be collected; in fact, it is probably good if
it does. Anyone making use of weak references should, as a cost of
entry, expect nondeterminism. The point is *specifically* that
weakrefs ensure the GC can collect your object graphs, and they allow
you to respond correctly once a graph is collected.
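
To make the iteration behavior above concrete, here is a rough sketch.
It assumes the WeakRef API (WeakRef and deref()) that was standardized
later, in ES2021, and was not available when this was written.
IterableWeakSet, views, header and footer are hypothetical names used
only for illustration.

```javascript
// Hypothetical iterable weak collection built on WeakRef (ES2021).
class IterableWeakSet {
  constructor() {
    // A Set of WeakRef objects; holding a WeakRef does not keep its
    // target alive.
    this.refs = new Set();
  }

  add(obj) {
    this.refs.add(new WeakRef(obj));
    return this;
  }

  // Yields every target that has not been collected yet. A target that
  // is unreachable but not yet collected may still be yielded; that is
  // the nondeterminism callers accept as the cost of entry.
  *[Symbol.iterator]() {
    for (const ref of this.refs) {
      const target = ref.deref();
      if (target === undefined) {
        // Already collected: drop the entry and never expose it.
        this.refs.delete(ref);
      } else {
        yield target;
      }
    }
  }
}

const views = new IterableWeakSet();
const header = { name: "header" };
const footer = { name: "footer" };
views.add(header).add(footer);

// Both objects are still strongly referenced here, so both are yielded.
const liveNames = [...views].map((view) => view.name);
```

Nothing here forces a GC. The collection only promises that an entry
whose target is already gone is skipped, which is the property asked
for above.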

This is also why the 'use lifecycle hooks and/or manual reference
clearing instead' solution isn't an alternative to weakrefs. It solves
some use cases that you might otherwise solve with weakrefs, but it
does so at the cost of considerable manual effort (and bugs/leaks when
manual lifetime management is done incorrectly). For the use cases
that can't be reliably solved by manual lifetime management, you still
need weakrefs.

Similarly, it's important to realize that while some use cases for
weakrefs are about managing native resources or doing other 'automatic
cleanup' behaviors, many use cases are simply about ensuring that the
GC can free up large graphs of dead objects *as soon as memory
pressure strikes* instead of waiting until the (likely fragile or
slow) user-space collector gets around to running and collecting
user-space objects. Memory pressure is something the browser and JS VM
know about and user space does not. If a graph is effectively dead but
can't easily have its references cleaned up automatically (as can
happen in complex object layouts, where you would normally use
refcounting or some other mechanism), then without weakrefs it can
remain 'alive' for a long time, eating up valuable heap, moving
between GC generations, and slowing GCs.

Weakrefs also enable safe interaction with third-party JS that isn't
generally possible otherwise. This occurred to me the other day after
I suggested the idea of a user-space JS collector that walks the
visible JS heap from roots - you can't walk closures, so any JS object
held in a closure would escape the sight of your collector (there are
other problems, but this is the most obvious one).

Closures are used heavily in modern JS, and have the ability to retain
references to a JS object. It becomes non-trivial to figure out the
lifetime of a given closure and know when you need to manually release
any resources it relies on, whether a graphics context or a big buffer
in an asm.js heap. For a simple use case like a setInterval handler,
you can manually clean up when removing the setInterval handler, but
what if you have 3 different event listeners that all hold a reference
to that resource in their closure? How do you clean those up at the
appropriate time? The only vaguely reliable answer here is 'every
consumer of my library has to painstakingly increment/decrement
reference counts any time they retain a reference to my objects',
which is not just tedious but extremely easy to mess up. This is
further complicated by the fact that currently V8 and SpiderMonkey
closures have the ability to capture references to values that are
never actually used within the function, so JS that seems like it
shouldn't retain an object actually retains it.
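
The reference-counting burden described above could look roughly like
the following sketch. RefCounted, makeListener and detach are
hypothetical names, not part of any real library; the point is only
that every consumer has to pair each retain() with a release().

```javascript
// Hypothetical reference-counted wrapper around an expensive resource.
class RefCounted {
  constructor(resource, dispose) {
    this.resource = resource;
    this.dispose = dispose;
    this.count = 0;
    this.disposed = false;
  }

  retain() {
    if (this.disposed) throw new Error("resource already disposed");
    this.count += 1;
    return this.resource;
  }

  release() {
    if (this.count === 0) throw new Error("release without matching retain");
    this.count -= 1;
    if (this.count === 0) {
      this.disposed = true;
      this.dispose(this.resource);
    }
  }
}

let freed = false;
const buffer = new RefCounted(new Uint8Array(1024), () => {
  freed = true;
});

// Each listener captures the buffer in its closure, so each one must
// retain it on creation and release it when it is detached.
function makeListener() {
  const bytes = buffer.retain();
  const handler = () => bytes.length;
  handler.detach = () => buffer.release();
  return handler;
}

const listeners = [makeListener(), makeListener(), makeListener()];

listeners[0].detach();
listeners[1].detach();
const freedAfterTwo = freed; // false: the third listener still holds it

listeners[2].detach();
const freedAfterThree = freed; // true: the last release ran dispose
```

If any one of the three consumers forgets to call detach(), the count
never reaches zero and the buffer is never freed, which is the failure
mode described above.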

As before, manual lifetime management - where possible - is king, but
there are far too many scenarios where it's either near-impossible or
far too difficult to make it your only option. A combination of manual
lifetime management + weakrefs for corner cases is the ideal approach
here (and is in fact the approach used in some desktop scenarios), in
my opinion. If we want to have robust, widespread manual lifetime
management, people will either need to adopt non-JS languages that
compile to JS (ensuring that all the elaborate lifetime management
rules are followed), or JS needs to expose construct(s) to simplify
lifetime management (C#-style using, Python-style 'with resource'
blocks, C++ scoped RAII). Even then, doing it in user space still
requires all the JS running in your application to conform to these
rules - once you pull in third party code, or run user scripts, your
lifetime management is vulnerable to leaks if that outsider doesn't
carefully follow the rules.
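
As a rough illustration of the scoped constructs mentioned above, here
is what a user-space approximation might look like. withResource and
makeContext are hypothetical helpers and dispose() is an assumed
convention, not an existing JS API.

```javascript
// Hypothetical helper: runs `body` with a resource and always disposes
// the resource afterwards, even if `body` throws.
function withResource(resource, body) {
  try {
    return body(resource);
  } finally {
    resource.dispose();
  }
}

// Stand-in for something expensive such as a graphics context.
function makeContext() {
  return {
    disposed: false,
    dispose() {
      this.disposed = true;
    },
  };
}

// Normal path: the resource is live inside the block, disposed after.
const ok = makeContext();
const result = withResource(ok, (ctx) => (ctx.disposed ? "dead" : "alive"));

// Error path: the resource is still disposed.
const failing = makeContext();
let caught = null;
try {
  withResource(failing, () => {
    throw new Error("boom");
  });
} catch (err) {
  caught = err.message;
}

// Weak spot: a closure that escapes the block keeps a reference to a
// resource that has already been disposed.
let escaped = null;
const leaked = makeContext();
withResource(leaked, (ctx) => {
  escaped = () => ctx.disposed;
});
```

The last case is the limitation described above: the helper only works
if every piece of code, including third-party code, avoids holding on
to the resource past the end of the block.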

On Mon, Jul 7, 2014 at 1:19 AM, Jussi Kalliokoski
<jussi.kalliokoski at gmail.com> wrote:
> To first address the particular case of using weak maps for custom event
> listeners via iteration:
>
> I think the only relatively sane approach to iterating a WeakMap would be to
> force GC whenever the WeakMap is being iterated. This would make sure that
> you couldn't get references to items that are about to be garbage-collected
> (and thus don't also introduce non-deterministic errors and memory leaks for
> event listeners firing on disposed views). However, this would make
> iterating a WeakMap potentially unbearably slow and thus not worth using for
> this case. The performance hit may be tuned down by traversing the reference
> tree only from the items contained in the WeakMap, but I'm not sure if
> that's feasible and it would probably also make the performance worse if the
> WeakMap is large enough and has a lot of resources that are alive. Another
> drawback is that this would potentially lead to abuse where for example all
> views would be stored in a WeakMap and then the WeakMap would be iterated
> through just to force GC on the views.
>
> On the discussion thread linked, it's also discussed that weakrefs would be
> used for DOM event listeners, but I'm not exactly sure if that's a very
> workable solution either. You'll basically get a weak reference locally, but
> the DOM event listener will still hold a strong reference to the function.
> You could of course add a weak addEventListener variant, but soon you'd
> notice that you also need a weak setTimeout, setInterval,
> requestAnimationFrame, Object.observe and maybe even weak promises. :/
>
> All in all, I'm doubtful that weak references can solve the use cases
> presented very well. They would basically encourage people to start building
> frameworks that use weakrefs instead of lifecycle hooks only to notice that
> there's some part of the platform where they need manual reference clearing
> anyway. The solution, I think, is to just use frameworks and libraries like
> angular and react that provide these lifecycle hooks and take care that
> these hooks are triggered for you, instead of having to manually call a
> destroy method.
>
> Cheers,
> Jussi
>
>
>
> On Mon, Jul 7, 2014 at 4:49 AM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
>>
>> On 7/6/14, 4:11 PM, Filip Pizlo wrote:
>>>
>>> My reading of the linked Mozilla discussions seems to be that some GC
>>> implementors think it's hard to get the feature right
>>
>>
>> I'm not sure how you can possibly read
>> https://groups.google.com/forum/#!msg/mozilla.dev.tech.js-engine.internals/V__5zqll3zc/hLJiNqd8Xq8J
>> that way.  That post isn't even from a GC implementor and says nothing about
>> implementation issues!
>>
>> I think that post presents the strongest argument I know against the "use
>> GC to reclaim your non-memory resources" argument, and the summary is that
>> while that approach looks promising at first glance in practice it leads to
>> resources not being reclaimed when they should be because the GC is not
>> aiming for whatever sort of resource management those particular resources
>> want.
>>
>> -Boris
>>
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss at mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>
>
>
>

