Non-extensibility of Typed Arrays

Filip Pizlo fpizlo at apple.com
Wed Sep 4 12:38:39 PDT 2013


On Sep 4, 2013, at 12:17 PM, Niko Matsakis <niko at alum.mit.edu> wrote:

> I think Filip is right that given sufficient cleverness extensible
> properties for typed objects can be implemented efficiently. The real
> question is what the behavior SHOULD be. As others have pointed out,
> we are not forced to support extensible properties for web compat
> reasons.
> 
> I also think it is very important and useful to have typed objects be
> a generalization of typed arrays. I suspect nobody wants an "almost
> but not quite the same" set of array types. It'd be my preference that
> (eventually) the "specification" for typed arrays can just be "var
> Uint16Array = new ArrayType(uint16)", which I believe is currently
> plausible.
> 
> In light of this consideration, that means that adding extensible
> properties to typed arrays means adding extensible properties to all
> "typed objects that are arrays" (that is, instances of some type
> defined by `new ArrayType()`).
> 
> As Dmitry pointed out, extensible properties are only possible for
> "top-level" objects. I think this results in a surprising and
> non-composable spec.
> 
> The surprising behavior isn't limited to the copying example that
> Dmitry gave. Another problem is that instances of array types that are
> found embedded in other structures don't have the full capabilities of
> "top-level" instances. Without extensible properties, it is true that
> if I have a function that is given a typed object (of any kind, array
> or struct) and uses it, I can also provide it with an instance of that
> same type that is a part of a bigger structure.
> 
> For example:
> 
>    function doSomething(anArray) {
>        anArray[0] = anArray[1];
>    }
> 
>    // Invoke doSomething with top-level array
>    var TwoUint8s = new ArrayType(uint8, 2);
>    doSomething(new TwoUint8s());
> 
>    // Invoke doSomething with array that is
>    // embedded within a struct:
>    var MyStruct = new StructType({a: TwoUint8s});
>    var instance = new MyStruct();
>    doSomething(instance.a);
> 
> But this no longer works if `doSomething` makes use of extensible
> properties:
> 
>    function doSomething(anArray) {
>        anArray[0] = anArray[1];
>        anArray.foo = anArray.bar;
>    }
> 
> Now the second use case doesn't work.
> 
> To me, it seems a shame to trade a simple story ("typed objects let
> you define the layout and fields of an object, full stop") for a more
> complex, non-composable one ("...except for extra fields on arrays,
> which only work some of the time").

Hi Niko,

The reason why I'm OK with the more complex story is that we already have that story for '=='.  To me, named object properties are analogous to being able to identify whether you have the same object or a different object: both are mechanisms that reveal aliasing to the user.  Having typed objects that are embedded in other ones already breaks ==.
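
A minimal sketch of what I mean. The typed objects API from your example can't be run as-is, so this assumes plain typed-array views over one ArrayBuffer as a stand-in for "embedded" typed objects, in an engine that allows named properties on typed arrays. The names viewA, viewB and foo are made up for illustration:

```javascript
// Two views constructed "on the fly" over the same backing store.
const buffer = new ArrayBuffer(4);
const viewA = new Uint8Array(buffer, 0, 2);
const viewB = new Uint8Array(buffer, 0, 2);

// Writes through one view are visible through the other: same data.
viewA[0] = 7;
const sharesData = viewB[0] === 7;

// Yet identity already tells the two views apart.
const sameObject = viewA === viewB;

// A named property lives on the view object, not in the buffer,
// so it is lost in exactly the cases where identity is lost.
viewA.foo = "tag";
const expandoOnB = viewB.foo;

console.log(sharesData, sameObject, expandoOnB);
```

Both '===' and the named property reveal the same aliasing fact: you are holding a different view object onto the same bytes.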

-Filip


> 
> 
> 
> Niko
> 
> 
> On Wed, Sep 04, 2013 at 08:11:14AM -0700, Filip Pizlo wrote:
>> 
>> On Sep 4, 2013, at 7:55 AM, Andreas Rossberg <rossberg at google.com> wrote:
>> 
>>> On 4 September 2013 16:44, Filip Pizlo <fpizlo at apple.com> wrote:
>>>>> On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote:
>>>>> As part of binary data, typed arrays are implicitly constructed "on
>>>>> the fly" as views on a backing store. Any notion of identity -- which
>>>>> is the prerequisite for state -- is not particularly meaningful in
>>>>> this setting.
>>>> 
>>>> Are you proposing changing how == and === work for typed arrays?  If not then this whole argument is moot.
>>> 
>>> No, they are just rather useless operations on data views. That
>>> doesn't make the argument moot.
>> 
>> The point is that as soon as you're using the copy '=' on binary data fields, you're already losing an observable notion of object identity.  The '=' here is already unlike the '=' operator in languages that have true value types: in those languages you wouldn't be able to observe whether you got the *same* typed array or a different one with the same underlying data.  In JS you will be able to observe this with '==' and '==='.  Hence, being able to also observe that you got a different one because you lost some metadata (i.e. custom named properties) doesn't change the fact that the quirky semantics were already observable to the user.
>> 
>>> 
>>>>> Also, it is preferable to make them as lightweight as
>>>>> possible.
>>>> 
>>>> See my previous mail. You gain zero space and zero performance from making typed arrays non extensible.
>>> 
>>> I think you are jumping to conclusions. You can very well optimize the
>>> representation of typed arrays if they don't have user-defined
>>> properties. Whether that's worth it I can't tell without experiments.
>> 
>> I don't think this is a matter of opinion.  There is state that typed arrays are required to store but that is not accessed on the most critical of hot paths, which naturally allows us to play displaced pointer tricks.
>> 
>> It would also be useful, if you want to argue this point, if you replied to my previous discussion of why there is no performance difference between expandable and non-expandable typed arrays.  I'll copy that here in case you missed it:
>> 
>> A typed array *must* know about the following bits of information:
>> 
>> T: Its own type.
>> B: A base pointer (not the buffer but the thing you index off of).
>> L: Its length.
>> 
>> But that only works if it owns its buffer, that is, if it was allocated using, for example, "new Int8Array(100)" and you never used the .buffer property.  So in practice you also need:
>> 
>> R: Reserved space for a pointer to a buffer.
>> 
>> Now observe that 'R' can be reused for either a buffer pointer or a pointer to overflow storage for named properties.  If you have both a buffer and overflow storage, you can save room in the overflow storage for the buffer pointer (i.e. displace the buffer pointer into the property storage).  We play a slightly less ambitious trick, where R either points to overflow storage or NULL.  Most typed arrays don't have a .buffer, but once they get one, we allocate overflow storage and reserve a slot in there for the buffer pointer.  So you pay *one more* word of overhead for typed arrays with buffers even if they don't have named properties.  I think that's probably good enough - I mean, in that case, you have a freaking buffer object as well so you're not exactly conserving memory.
>> 
>> But, using R as a direct pointer to the buffer would be a simple hack if we really felt like saving one word when you also already have a separate buffer object.
>> 
>> I could sort of imagine going further and using T as a displaced pointer and saving an extra word, but that might make type checks more expensive, sometimes.
>> 
>> So let's do the math, on both 32-bit and 64-bit (where 64-bit implies 64-bit pointers), to see how big this would be.
>> 
>> 32-bit:
>> 
>> T = 4 bytes, B = 4 bytes, L = 4 bytes, R = 4 bytes.  So, you get 16 bytes of overhead for most typed arrays, and 20 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.
>> 
>> 64-bit:
>> 
>> T = 8 bytes, B = 8 bytes, L = 4 bytes, R = 8 bytes.  This implies you have 4 bytes to spare if you want objects 8-byte aligned (we do); we use this for some extra bookkeeping.  So you get 32 bytes of overhead for most typed arrays, and 40 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.
>> 
>> As far as I can tell, this object model compresses typed arrays about as much as they could be compressed while also allowing them to be extensible.  The downside is that you pay a small penalty for typed arrays that have an "active" buffer, in the case that you either accessed the .buffer property or you constructed the typed array using a constructor that takes a buffer as an argument.
>> 
>> So, how big are your non-expandable typed arrays, and what do they look like?  If they're not smaller than 16 bytes in the common case with 32-bit pointers, or 32 bytes in the common case with 64-bit pointers, then there is no performance argument in favor of getting rid of expandable properties.
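
For anyone who wants to check the arithmetic above, here is a rough sketch. It only adds up the T/B/L/R sizes quoted in this mail; the function name typedArrayOverhead and its result fields are hypothetical, and it assumes 8-byte object alignment plus one pointer-sized overflow slot for the displaced buffer pointer:

```javascript
// Sum the per-object fields described above for a given pointer size.
function typedArrayOverhead(pointerBytes) {
  const fields = {
    T: pointerBytes, // type
    B: pointerBytes, // base pointer used for indexing
    L: 4,            // length
    R: pointerBytes, // reserved: overflow storage pointer or NULL
  };
  const raw = fields.T + fields.B + fields.L + fields.R;

  // Round up to 8-byte alignment; on 64-bit this is where the
  // 4 spare bookkeeping bytes come from.
  const common = Math.ceil(raw / 8) * 8;

  // With an active buffer, one extra pointer-sized slot is used
  // in the overflow storage for the displaced buffer pointer.
  const withBuffer = common + pointerBytes;

  return { common, withBuffer };
}

console.log(typedArrayOverhead(4)); // 32-bit pointers
console.log(typedArrayOverhead(8)); // 64-bit pointers
```

With 4-byte pointers this gives 16 and 20 bytes; with 8-byte pointers, 32 and 40, matching the numbers above.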
>> 
>> -Filip
>> 
>> 
>>> Admittedly, it's a minor point.
>>> 
>>> /Andreas
>> 
> 
