Non-extensibility of Typed Arrays

Filip Pizlo fpizlo at apple.com
Wed Sep 4 08:11:14 PDT 2013


On Sep 4, 2013, at 7:55 AM, Andreas Rossberg <rossberg at google.com> wrote:

> On 4 September 2013 16:44, Filip Pizlo <fpizlo at apple.com> wrote:
>>> On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote:
>>> As part of binary data, typed arrays are implicitly constructed "on
>>> the fly" as views on a backing store. Any notion of identity -- which
>>> is the prerequisite for state -- is not particularly meaningful in
>>> this setting.
>> 
>> Are you proposing changing how == and === work for typed arrays?  If not then this whole argument is moot.
> 
> No, they are just rather useless operations on data views. That
> doesn't make the argument moot.

The point is that as soon as you're using the copy '=' on binary data fields, you're already losing an observable notion of object identity.  The '=' here is already unlike the '=' operator for languages that have true value types - in those languages you wouldn't be able to observe if you got the *same* typed array or a different one but with the same underlying data.  In JS you will be able to observe this with '==' and '==='.  Hence, being able to also observe that you got a different one because you lost some meta-data (i.e. custom named properties) doesn't change the fact that the quirky semantics were already observable to the user.
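To make the observability point concrete, here is a minimal sketch using two views over one buffer. It assumes typed arrays are extensible (the behavior argued for here), and the variable names and the `label` property are purely illustrative.

```javascript
// Two distinct view objects over the same backing store.
const sharedStore = new ArrayBuffer(16);
const viewA = new Int32Array(sharedStore);
const viewB = new Int32Array(sharedStore);

// Writes through one view are visible through the other: same underlying data.
viewA[0] = 42;
const sharesData = viewB[0] === 42;   // true

// Yet identity is already observable: these are different objects.
const sameObject = viewA === viewB;   // false

// A custom named property (illustrative) lives on one view object only,
// which is just one more way to observe what === already reveals.
viewA.label = "first view";
const labelOnB = viewB.label;         // undefined
```

Nothing here is observable through the named property that was not already observable through `===`.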

> 
>>> Also, it is preferable to make them as lightweight as
>>> possible.
>> 
>> See my previous mail. You gain zero space and zero performance from making typed arrays non extensible.
> 
> I think you are jumping to conclusions. You can very well optimize the
> representation of typed arrays if they don't have user-defined
> properties. Whether that's worth it I can't tell without experiments.

I don't think this is a matter of opinion.  There is state that typed arrays are required to store but that is not accessed on the most critical of hot paths, which naturally allows us to play displaced pointer tricks.

It would also be useful, if you want to argue this point, if you replied to my previous discussion of why there is no performance difference between expandable and non-expandable typed arrays.  I'll copy that here in case you missed it:

A typed array *must* know about the following bits of information:

T: Its own type.
B: A base pointer (not the buffer but the thing you index off of).
L: Its length.

But that only works if it owns its buffer - that is, it was allocated using, for example, "new Int8Array(100)" and you never used the .buffer property.  So in practice you also need:

R: Reserved space for a pointer to a buffer.

Now observe that 'R' can be reused for either a buffer pointer or a pointer to overflow storage for named properties.  If you have both a buffer and overflow storage, you can save room in the overflow storage for the buffer pointer (i.e. displace the buffer pointer into the property storage).  We play a slightly less ambitious trick, where R either points to overflow storage or NULL.  Most typed arrays don't have a .buffer, but once they get one, we allocate overflow storage and reserve a slot in there for the buffer pointer.  So you pay *one more* word of overhead for typed arrays with buffers even if they don't have named properties.  I think that's probably good enough - I mean, in that case, you have a freaking buffer object as well so you're not exactly conserving memory.

But, using R as a direct pointer to the buffer would be a simple hack if we really felt like saving one word when you also already have a separate buffer object.

I could sort of imagine going further and using T as a displaced pointer and saving an extra word, but that might make type checks more expensive, sometimes.
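The "slightly less ambitious" variant described above (R is either NULL or a pointer to overflow storage that reserves a slot for the buffer pointer) can be modeled roughly as follows. This is a toy model in JavaScript, not engine code: the class `ModelTypedArray`, its methods, and the `bufferSlot` and `named` fields are hypothetical names chosen for illustration.

```javascript
// Toy model only. Field names T, B, L, R mirror the text; everything else is hypothetical.
class ModelTypedArray {
  constructor(type, length) {
    this.T = type;                    // T: the array's own type
    this.B = new Uint8Array(length);  // B: stands in for the base pointer you index off of
    this.L = length;                  // L: length
    this.R = null;                    // R: NULL until overflow storage is needed
  }

  // Allocate overflow storage lazily, reserving one slot for the buffer pointer.
  ensureOverflow() {
    if (this.R === null) {
      this.R = { bufferSlot: null, named: new Map() };
    }
    return this.R;
  }

  // Models the first use of .buffer: the buffer pointer is stored in overflow storage.
  getBuffer() {
    const overflow = this.ensureOverflow();
    if (overflow.bufferSlot === null) {
      overflow.bufferSlot = this.B.buffer;
    }
    return overflow.bufferSlot;
  }

  // Models adding a custom named property: it reuses the same overflow storage.
  setNamed(key, value) {
    this.ensureOverflow().named.set(key, value);
  }
}

const plainModel = new ModelTypedArray("Int8", 100);      // common case: R stays NULL
const bufferedModel = new ModelTypedArray("Int8", 100);
const modelBuffer = bufferedModel.getBuffer();            // allocates overflow storage
bufferedModel.setNamed("label", "hello");                 // no extra header field needed
```

In this model the header always has the same four fields, whether or not the array has a buffer or named properties; only the overflow storage grows.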

So let's do the math, on both 32-bit and 64-bit (where 64-bit implies 64-bit pointers), to see how big this would be.

32-bit:

T = 4 bytes, B = 4 bytes, L = 4 bytes, R = 4 bytes.  So, you get 16 bytes of overhead for most typed arrays, and 20 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.

64-bit:

T = 8 bytes, B = 8 bytes, L = 4 bytes, R = 8 bytes.  This implies you have 4 bytes to spare if you want objects 8-byte aligned (we do); we use this for some extra bookkeeping.  So you get 32 bytes of overhead for most typed arrays, and 40 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.
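The arithmetic for both pointer sizes can be checked with a short sketch. The function names are illustrative, and applying 8-byte alignment on 32-bit as well is an assumption (it does not change the 32-bit result, since 16 is already a multiple of 8).

```javascript
// Round a size up to the next multiple of the alignment.
function alignUp(bytes, alignment) {
  return Math.ceil(bytes / alignment) * alignment;
}

// Overhead of the T/B/L/R layout for a given pointer size in bytes.
function typedArrayOverhead(pointerBytes) {
  const T = pointerBytes;  // type
  const B = pointerBytes;  // base pointer
  const L = 4;             // length is 4 bytes on both targets
  const R = pointerBytes;  // reserved: NULL or overflow storage pointer
  const common = alignUp(T + B + L + R, 8);
  // One extra pointer-sized word in overflow storage for the displaced buffer pointer.
  const withOverflowAndBuffer = common + pointerBytes;
  return { common, withOverflowAndBuffer };
}

const overhead32 = typedArrayOverhead(4); // { common: 16, withOverflowAndBuffer: 20 }
const overhead64 = typedArrayOverhead(8); // { common: 32, withOverflowAndBuffer: 40 }
```

On 64-bit the raw fields sum to 28 bytes, and the alignment padding accounts for the 4 spare bytes mentioned above.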

As far as I can tell, this object model compresses typed arrays about as much as they could be compressed while also allowing them to be extensible.  The downside is that you pay a small penalty for typed arrays that have an "active" buffer, in the case that you either accessed the .buffer property or you constructed the typed array using a constructor that takes a buffer as an argument.
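For reference, the two ways a typed array ends up with an "active" buffer look like this in user code. This is only a sketch of the script-visible side, with illustrative variable names; whether an engine actually allocates anything extra at these points is an implementation detail that script cannot observe.

```javascript
// Common case: the array owns its storage and .buffer is never touched.
const ownsStorage = new Int8Array(100);

// Case 1: reading .buffer hands out an ArrayBuffer object for the storage.
const exposed = ownsStorage.buffer;

// Case 2: the array is constructed over an existing buffer.
const backing = new ArrayBuffer(100);
const overBacking = new Int8Array(backing);

// In both cases the array must remember which buffer object it belongs to.
const sameBufferEachTime = ownsStorage.buffer === exposed;   // true
const viewReportsBacking = overBacking.buffer === backing;   // true
```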

So, how big are your non-expandable typed arrays, and what do they look like?  If they're not smaller than 16 bytes in the common case with 32-bit pointers, or 32 bytes in the common case with 64-bit pointers, then there is no performance argument in favor of getting rid of expandable properties.

-Filip


> Admittedly, it's a minor point.
> 
> /Andreas
