Removal of WeakMap/WeakSet clear

Katelyn Gadd kg at luminance.org
Thu Dec 4 06:25:47 PST 2014


There are scenarios where both security and performance matter. I
think that is self-evident at this point in the thread, since
examples of both have been provided repeatedly. 'Can you demonstrate
there are no such scenarios?' isn't really a necessary question,
because we already know the answer: no.

That's not the relevant issue here, though. As discussed before, the
basic choice at hand here is whether to optimize the common case for
security or for reasonable performance & usability.

The security use case can be addressed by wrapping the weakmap in an
opaque object that revokes the ability to clear (along with, for
example, the ability to write - an ability I think you would want to
deny in most security-sensitive scenarios). It is true, as you state,
that wrapping the weakmap imposes a performance cost; however, that
cost seems likely to be negligible, given that the wrapper object
would literally consist of one property containing an object
reference, and its methods could be trivially inlined by every JS
runtime I'm aware of.
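
For concreteness, a minimal sketch of such a wrapper (the shape and
names are illustrative, not from any proposal):

    function makeReadOnlyWeakMap(map) {
      // Expose only the query methods; set, delete, and clear are
      // simply not reachable through the wrapper.
      return {
        get: function (key) { return map.get(key); },
        has: function (key) { return map.has(key); }
      };
    }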

The performance use cases can theoretically be addressed by a
sufficiently smart runtime, even if a .clear() method is not present.
However, I would argue that the difficulty of *actually* providing
good performance for these use cases without a .clear() method is
extremely high. For example:

Without a clear() method you have to wait for a sufficiently
exhaustive collection to be triggered, likely by memory pressure. If
the values being held alive in the interim do not merely use JS heap -
for example, if they are WebGL textures or bitmap images - the
memory-pressure feedback may not be sufficient to trigger an
exhaustive collection soon enough. I have seen this exact issue in
the past (while using WeakMap, actually!)
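
To illustrate with a hypothetical cache (not my actual code; clear()
here is the method whose removal is under discussion):

    var cache = new WeakMap();

    // Without clear(), the only option is to drop the map and
    // allocate a new one; the old map and its values stay in memory
    // until an exhaustive collection actually runs.
    function resetCache() {
      cache = new WeakMap();
    }

    // With clear(), the entries are dropped eagerly, so the values
    // (e.g. large textures or bitmaps) become collectible right away.
    function resetCacheEagerly() {
      cache.clear();
    }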

It was previously stated that 'secure by default' is a noble goal,
and I agree. However, in this case secure-by-default is not something
a user will expect from JS containers, because no other JS data
structure offers it. WeakMap as currently specced - with or without
clear() - is also not secure by default, since entries can still be
written. You would need to disable writes by default as well,
somehow.
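
For example (illustrative):

    // With or without clear(), a consumer holding the map can still
    // clobber entries for any key it can obtain:
    function untrusted(wm, someKey) {
      wm.set(someKey, "overwritten"); // replaces the owner's value
      wm.delete(someKey);             // or removes it entirely
    }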

On a performance note, I would also argue that it seems profoundly
questionable that a transposed weak map implementation can provide
consistently good performance out in the real world for typical use
cases. I am certain that there *are* use cases where it is optimal,
and it clearly has its advantages, but as someone who spends absurd
amounts of time tuning the performance of software - both JS and
native - I see many red flags in the design of a transposed weakmap
that suggest bad performance. I will speculate, based on my
understanding of transposed weak maps and my (incomplete) knowledge of
modern JS runtimes - please correct any errors:

A transposed weak map must store values as hidden properties on the
keys used. This means that any object used as a key - any object
reference, that is - must be able to accept hidden properties. As a
result, it is effectively impossible to allocate object instances
with fixed-size, fixed-layout storage unless you reserve space for
the weakmap values. The only way I can imagine to solve this is to
make really aggressive use of type-information gathering and/or
bailouts in the runtime to identify every type used as a weakmap key
- at which point I suppose you would have to convert their memory
layout on the heap in order to ensure consistency, or support two
different memory layouts for the same type.
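
A rough model of the transposed representation as I understand it
(purely illustrative - no engine is literally written this way):

    function TransposedWeakMap() {
      // The map stores nothing itself; it is reduced to a unique
      // hidden property name, and entries live on the key objects.
      this.slot = Symbol("weakmap-slot");
    }
    TransposedWeakMap.prototype.set = function (key, value) {
      key[this.slot] = value; // the key's layout must be able to grow
      return this;
    };
    TransposedWeakMap.prototype.get = function (key) {
      return key[this.slot];
    };
    // Note that clear() is awkward under this layout: the map cannot
    // enumerate its keys, so there is nothing to iterate and wipe.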

I don't consider the above an academic concern, either: dense memory
layout is essential if you want good locality (and thus good cache
efficiency) and the ability to cheaply do things like copy your
instances into a typed array and upload them to the GPU. The cheap
copying use case will matter a lot once typed objects are introduced,
since they are all about fixed, dense memory layout and cheap
copying.
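
This is the kind of copy I mean (hypothetical particle system;
'particles' and 'gl' are assumed to exist):

    // Pack fixed-layout instances straight into a typed array and
    // hand the buffer to WebGL:
    var positions = new Float32Array(particles.length * 3);
    for (var i = 0; i < particles.length; i++) {
      var p = particles[i];
      positions[i * 3 + 0] = p.x;
      positions[i * 3 + 1] = p.y;
      positions[i * 3 + 2] = p.z;
    }
    gl.bufferData(gl.ARRAY_BUFFER, positions, gl.DYNAMIC_DRAW);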

A transposed weakmap generally implies poor memory locality, extensive
pointer-chasing, and higher memory overhead for each key/value pair
stored in the map.

If I'm not mistaken, a transposed weakmap may also increase the cost
of GC tracing overall, or at least for any object type that can be
used as a key - the requirement to allocate space for weakmap values
on those types means that the GC must now trace those weakmap value
slots regardless of whether they actually contain a value.

A transposed weakmap probably also implies worse memory fragmentation
or more wasted heap, because you either have to lazily allocate the
space for the weakmap values (which means a separate heap allocation)
or reserve empty space in all instances for the values. Neither of
these feels particularly ideal.

A transposed weakmap may also hinder a VM's ability to elide heap
allocations or to store JS object instances on the stack or in
registers, by increasing the minimum size of instances and making
them variable-size where they were previously fixed-size. It is
unclear to me how often these optimizations currently happen, but
they are certainly something that will be desired once typed objects
are in use, and SIMD use cases absolutely demand them.

While transposed weakmaps are an interesting representation with
clear advantages, I think the risks suggest it is unwise to assume
that all implementations will use transposed weakmaps, and unwise to
optimize the design for that layout. Andreas's stated POV in this
thread makes me more confident in my opinion on this subject.

So, to summarize: It's necessary to pick which of the two concerns to
optimize for with this design. They are both legitimate concerns. I
would fervently argue that the risks/complications involved in
optimizing for security dramatically outweigh the risks involved in
optimizing for performance.

I should also note that while much of the above is speculative and
based on intuition/experience, I *have* been shipping a use of
WeakMap in performance-critical code for over a year now (though
naturally, a fallback is required in browsers without WeakMap -
basically everything other than Firefox). WeakMap offers real
advantages there, and my use case actually does benefit from an
efficient, built-in clear primitive. I care about WeakMap being
performant, and I suspect that once it is widely available in
browsers, more applications will use it and their authors will also
care about performance.
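
The rough shape of that fallback (illustrative, not my shipping
code): use the native WeakMap where it exists, otherwise store
entries as an expando property on the keys themselves - which is,
incidentally, a transposed layout.

    var cache;
    if (typeof WeakMap === "function") {
      cache = new WeakMap();
    } else {
      cache = {
        slot: "__cache" + Math.random().toString(36).slice(2),
        get: function (k) { return k[this.slot]; },
        set: function (k, v) { k[this.slot] = v; return this; },
        has: function (k) { return this.slot in k; }
      };
    }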

Some of the performance penalties I describe above should make people
especially wary since they will be difficult to debug. Subtle changes
to memory layout and GC overhead are virtually impossible to debug
with existing JS performance analysis tools - identifying those issues
would require extensive knowledge of VM internals.

-kg

On 4 December 2014 at 04:58, David Bruant <bruant.d at gmail.com> wrote:
> On 04/12/2014 09:55, Andreas Rossberg wrote:
>>
>> On 4 December 2014 at 00:54, David Bruant <bruant.d at gmail.com> wrote:
>>>
>>> The way I see it, data structures are a tool to efficiently query
>>> data. They don't *have* to be arbitrarily mutable at any time for
>>> this purpose.
>>> It's a matter of perspective, but in my opinion, mutability is the
>>> problem, not sharing the same object. Being able to create and
>>> share structured data should not have to mean it can be modified
>>> by anyone at any time. Hence Object.freeze, hence the recent
>>> popularity of React.js.
>>
>> I agree, but that is all irrelevant to the question of weak maps,
>> because you cannot freeze their content.
>
> The heart of the problem is mutability, and .clear is a mutation
> capability, so it's relevant. WeakMaps are effectively frozen for
> some bindings if you don't have the keys.
>
>> So my question stands: What would be a plausible scenario where
>> handing a weak map to an untrusted third party is not utterly crazy to
>> start with?
>
> Sometimes you call functions you didn't write and pass arguments to
> them. WeakMaps are new, but APIs will have functions that take
> WeakMaps as arguments. I don't see what's crazy about that. It'd be
> nice not to have to review all the npm packages I use to make sure
> they don't call .clear when I pass them a weakmap.
> If you don't want to pass the WeakMap directly, you have to create a
> new object "just in case" (cloning or wrapping), which carries its
> own obvious inefficiency. Security then comes at the cost of
> performance, while both could be achieved if the same
> safe-by-default weakmap could be shared.
>
>> In particular, when can giving them the ability to clear be
>> harmful, while the ability to add random entries, or to attempt to
>> remove entries by guessing, is not?
>
> I don't have an answer to this case right now.
> That said, I'm uncomfortable with the idea of a decision that will
> affect the language of the web for the rest of its life being made
> based on the inability of a few people, within a limited timeframe,
> to find a scenario that a few other people deem plausible. It's
> almost calling for an "I told you so" one day.
> I would return the question: can you demonstrate there are no such
> scenarios?
>
> We know ambient authority is a bad thing; examples are endless in
> JS. The ability to modify global variables has been a source of
> bugs and vulnerabilities.
> JSON.parse implementations were modified by browsers because they
> used potentially malicious versions of Array as a constructor,
> which led to data leakage.
> WeakMap.prototype.clear is ambient authority. Admittedly, its
> effects are less broad and its malicious usage is certainly more
> subtle.
>
> David
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss

