Removal of WeakMap/WeakSet clear
rossberg at google.com
Fri Nov 28 04:21:05 PST 2014
On 27 November 2014 at 19:40, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
> On Nov 27, 2014, at 1:17 AM, Andreas Rossberg wrote:
>> The discussion here still makes various untested assumptions about the
>> transposed representation. With respect to .clear in particular, it
>> seems to assume that this representation does not normally need the
>> ability to iterate (internally) over the keys of a weak map. AFAICT,
>> that is incorrect. Both the implementation we discussed in the V8
>> team, and from what I heard, the implementation in IE, maintain a list
>> of all its keys for each weak map. Otherwise, when a map dies, you
>> couldn't generally kill the entries and the values associated with
>> them easily and in a timely manner. (And of course, this list means
>> almost twice the space usage for a map, because you duplicate part of
>> the information.)
>> So from that perspective at least, .clear is not an issue.
> Since I made this assertion that 'clear' interfere with with an inverted implementation of WeakMap/Set I'll explain what I'm thinking.
> First, here is my design for an inverted implementation:
> Every object internally maintains a table (probably a hash table if it contains more than a few elements) that is used to implement WeakMap/Set. Entry in the table is a key/value pair where the key is a WeakMap/Set instance. Values are arbitrary ES values. Let's call such a table an "inverted map" and generically refer to such WeakMaps/Sets as WCs.
> A WC object has no instance specific state (other than the state required of all objects, such as [[Prototype]] etc.
> The WeakMap 'set' method applied to a WeakMap M with arguments k and v works by accessing the inverted map of k and creating an entry in the inverted map with key: M and value: k (or updating the value field with v with an entry for M already exists).
> WeakMap get/set/has/delete work similarly by accessing keys' inverted map using M as the key.
> Note that since such a WC instance does not contain any references to its keys (or values) it does not prevent an object that is used as one of its keys from being garbage collected. When an object is GC'd its inverted map is, of course, also discarded.
> There is no change in the GC latency of an object uses as a WeakMap key. If the only reference to an object used as a value in a Weakmap is from a single inverted map entry, then that value object should be GC's when its key value is GC'ed.
> So far great, none of this requires anything special from the GC. But what if there are no references to a WC instance other than from the keys of inverted maps? In that case the WC instance should be considered garbage, but if the inverted map keys are normal strong object references such should-be-garbage WC instance won't be recognized as such. (This might not be a very big deal if we were only talking about the WC instances as they are small and presumably not very numerous, but retaining those instances also cause any corresponding inverted map value references to be retained).
With a normal ephemeral weak map implementation, like the one in V8,
the value in a (map,key,value) triple can be reclaimed _immediately_
(in the same cycle) when _either_ the map _or_ the key dies. In the
transposed implementation you describe that is not the case. If the
map dies, the value will survive another (major GC) cycle. Under
memory pressure that can increase the rate of major GCs.
To achieve timely reclamation in the former implementation, the GC
maintains a list of all live weak maps, and traverses it during the
atomic pause in the final marking phase (in an incremental GC). If you
wanted to do the analog in the transposed representation, then the GC
would dually need to maintain a list of all _keys_, and always
traverse all of that. That would clearly be very expensive (there are
typically far more keys than maps). The more practical solution is to
have one such list per weak map, so that you only iterate over
relevant keys of maps that have died. Alas, every map gets a list of
If you don't want to do that, then you'll have inferior GC behaviour
with the transposed representation. In addition, you'll have to
implement the whole weak property mechanism, and make every access to
a key/value pair require an indirection through such a weak reference.
> We can fix this by having the GC treat all key entries of inverted maps as weak references. Note that this requires weak reference support in the GC (definitely a complication) but does not require that the GC implement ephemeron semantics (a bigger complication).
At least in V8, we have various internal uses of ephemeral data
structures anyway. In contrast, there is only one use of weak
references, which is for persistent handles in the API, and we have
plans to get rid of that. ;)
But more to the point, weak properties are far more complicated than
just weak references. At least if you want to avoid substantial
secondary costs for object accesses (see my mail from a while ago).
> Note that nothing above requires (or allows) enumerating the entries of a WC. Explicit enumeration operations or a 'clear' method would require some way to enumerate the objects that are used as keys of a WC.
> This is the end of my assumed inverted WC design and why I assert that a clear method is incompatible with it.
> Lets explore how we can extend this design to support iteration (of the sort need to implement 'clear'). To do iteration we need to have some way to find all of the objects that are used as keys of a WC. Since were are using an inverted representation (and assuming we don't want to examine all objects to see if the WC is a key in its inverted map). We need a way to efficiently find all objects that are used as key of a particular WC. The obvious way to do that would be to add to the internal state of a WC a list of all active key values. This would need to be a list represented using weak references. Otherwise, it would prevent key objected from being properly GC'ed. So, at the very least, clear or other iteration requires an additional weak reference list for each WC (or perhaps a likely less efficient single global weak list) that would not otherwise be needed.
More information about the es-discuss