Non-extensibility of Typed Arrays

Filip Pizlo fpizlo at apple.com
Wed Sep 4 19:29:14 PDT 2013


On Sep 4, 2013, at 5:25 PM, Brendan Eich <brendan at mozilla.com> wrote:

> Filip Pizlo wrote:
>>>> Typed arrays have both of these properties right now and so expandability is a free lunch.
>>> 
>>> The last sentence makes a "for-all" assertion I don't think implementations must be constrained by. 
>> 
>> How so? It is true that some VM implementations will be better than others. But ultimately every VM can implement every optimization that every other VM has; in fact my impression is that this is exactly what is happening as we speak.
> 
> My "for-all" referred to all typed arrays across all VMs, not just all VMs.
> 
> Also just as a point of fact (something "done", the Latin root means "deed"), I do not see the same optimizations being used in all VMs. For example, SpiderMonkey's TI (written up here: http://rfrn.org/~shu/drafts/ti.pdf [PLDI 2012]) is not being used elsewhere AFAIK -- please correct me if I'm mistaken.

Interesting point.  Equivalent optimizations are being done.  Other VMs also infer types one way or another.  And I'd argue that my way of inferring types is the best - it incurs smaller overheads at start-up while achieving more precise results.  (Of course I must say that - I stand by my stuff, heh.)  That being said, I do think that FF's TI is really cool, and I loved reading that paper.

It's kind of like JVMs: all of the big-league ones do speculative inlining, but they do it in radically different ways, rely on different kinds of feedback, and if you go to a conference where JVM hackers show up, they will argue about which is best.  I have fond memories of Sun vs. IBM vs. Oracle shouting matches about how you do deoptimization, whether you do deoptimization at all, and what you need to analyze and prove about the class hierarchy.  That doesn't change the basics: they all do speculative inlining and it performs sort of the same in the end.

I suspect that the same thing is becoming true of typed arrays, regardless of whether they are extensible or not.  I guess that when I said "every optimization that every other VM has" I didn't mean literally using the same exact algorithm - just performing optimizations that achieve equivalent results.

> 
>> So, it doesn't make much sense to make language design decisions because it might make some implementor's life easier right now. If you could argue that something will /never/ be efficient if we add feature X, then that might be an interesting argument. But as soon as we identify one sensible optimization strategy for making something free, I would tend to think that this is sufficient to conclude that the feature is free and there is no need to constrain it. If we don't do this then we risk adding cargo-cult performance features that rapidly become obsolete.
> 
> I agree that's a risk. I'm also with Niko in wanting to argue about what the semantics should be without appealing to performance arguments.

Right!  I guess my first order argument is that performance *isn't* an argument in favor of non-expandability.

> 
> However, I still think you are verging on promising a free lunch. All methods in C++ cannot affordably be virtual. Expandos in JS cost. At fine enough grain, even pretty-well-predicted branches cost. Nothing is free-enough to discount forever in my bitter and long experience :-P.

I am promising a free lunch!  Virtual methods in C++ are only expensive because C++ still doesn't have feedback-driven optimization.  JVMs make them free in Java.  And they are free.  Period.  There is no upside to marking a method final in Java.  I am arguing that expandos are similar.
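
To be concrete about what I mean by an expando on a typed array - this is just a usage sketch, and the property name is something I made up:

    var vertices = new Float32Array(1024);   // plain old typed array
    vertices.semantics = "xyzw";             // an expando: only this one object ever
                                             // grows property storage for it
    vertices[0] = 1.5;                       // indexed accesses stay on the fast path

The claim is that a VM can make the second line cost something only for the objects that actually do it, and make the first and third lines cost nothing extra for everyone else.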

> 
>>> The lack of static types in JS does not mean exactly one implementation representation must serve for all instances of a given JS-level abstraction. We already have strings optimized variously in the top VMs, including Chords or Ropes, dependent strings, different character sets, etc.
>>>> 
>>>> Still find this discussion amusing? Here's the long story: It is these things that I list above that lead to a 16-byte overhead on 32-bit, and a 32-byte overhead on 64-bit, in the best "sane" case. Giving typed array objects expandability doesn't add to this overhead, because two of the fields necessary to implement the above (the type, and the buffer) can be displaced to point to property storage. Any imaginable attempt to reduce the overhead incurred by this information - using BBOP (big bag of pages) for the type, using an out-of-line weak map for the buffer or the type, encoding some of the bits inside the pointer to the typed array, etc. - can also be used to eradicate any space overhead you'd need for custom properties, so long as you're on board with the "free if unused, sub-optimal if you use them" philosophy.
>>> 
>>> For something like decimal, it matters whether there's an empty side table and large-N decimal instances of total size N*S, vs. N*(S+K) for some constant K we could eliminate by specializing harder. Even better if we agree that decimal instances should be non-extensible (and have value not reference semantics -- more below).
>> 
>> With a side table, the constant K = 0 even if you have custom properties. The table will only have an entry for those instances that had custom properties.
> 
> I know, that's why I was attacking the non-side-table approach.
> 
> But the side table has its own down-side trade-offs: GC complexity, even costlier indirection, and strictly greater implementation complexity. If one could implement without having to mess with this K ?= 0 design decision and hassle with packing or else using a side-table, one's VM would be smaller, simpler, less buggy -- all else equal.

Meh, I'm just reusing the GC complexity that the DOM already introduces.
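
If it helps, here's a user-level analogy for the side table - purely illustrative, using WeakMap, whereas the engine would of course do this inside the object model:

    // Expando storage lives in a weak side table.  Instances that never get an
    // expando have no entry at all, which is the K = 0 case from above.
    var expandos = new WeakMap();

    function setExpando(obj, key, value) {
      var bag = expandos.get(obj);
      if (!bag) {
        bag = Object.create(null);
        expandos.set(obj, bag);   // an entry appears only on first use
      }
      bag[key] = value;
    }

    function getExpando(obj, key) {
      var bag = expandos.get(obj);
      return bag ? bag[key] : undefined;
    }

The GC complexity Brendan mentions is real - the entries have to die with their keys - but as I said, the DOM already forces us to solve that problem.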

> 
> Now you may say that I'm betraying my hero Mr. Spock, whom I have invoked to argue that implementors should sacrifice so the mass of JS users can live long and prosper.

Yes, you are. ;-)

> 
> And you'd have me dead to rights -- if I thought JS users wanted expandos on binary data, that the lack of expandos there was a problem akin to the whole starship being blown up. But I do not believe that's the case.
> 
> If users don't care, then implementors should get a break and VMs should be simpler, ceteris paribus.

Fair enough.

> 
>>>> - If the VM wants to go further and create immediate representations of some or all Int64's, similarly to what VMs do for JS small integers today, then the main problem you run into is object identity: does Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation of an Int64 class would say that this is false, since it's likely to allocate a new Int64 each time. But an immediate representation would have no choice but to say true. You can work around this if you say that the VM's implementation of Int64 operations behaves /as if/ the add()/sub()/whatever() methods used a singleton cache. You can still then have custom properties; i.e. you could do Int64(2).foo = 42 and then Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an immediate-int64-to-customproperties map on the side. That's kind of analogous to how you could put a setter on field '2' of Array.prototype and do some really hilarious things.
>>> 
>>> The value objects proposal for ES7 is live, I'm championing it. It does not use (double-dispatch for dyadic) operators as methods. It does not use extensible objects.
>>> 
>>> http://wiki.ecmascript.org/doku.php?id=strawman:value_objects
>>> http://www.slideshare.net/BrendanEich/value-objects
>>> 
>>> Warning: both are slightly out of date, I'll be updating the strawman over the next week.
>> 
>> Thanks for the links! To clarify, I'm not trying to make a counterproposal - the above was nothing more than a fun thought experiment and I shared it to motivate why I think that custom properties are free.
>> 
>> My understanding is that you are still arguing that custom properties are not free, and that they incur some tangible cost in terms of space and/or time. I'm just trying to show you why they don't if you do the same optimizations for them that have become acceptable for a lot of other JS corners. Unless you think that ES should have an "ease of implementation" bar for features. I wouldn't necessarily mind that, but my impression is that this is not the case.
> 
> I do think implementor ease, or really implementation simplicity, should be a concern. It's secondary, per Spock's Kobayashi Maru solution, to the needs of the many JS users. But it's not nothing. Part of the impetus for Dart, I'm told, is the complexity of V8 required by JS-as-it-is. Whatever the case there, standardized JS extensions should not add too much complexity if we can help it.
> 
> I'll lay off performance concerns but you'll still see me, like Ahab lashed to the white whale, beckoning against free lunch arguments or anything near them :-P.

My job is to give people a free lunch in the performance department.  So I live by free lunch arguments.

> 
>>> With value objects, TC39 has definitely favored something that I think you oppose, namely extending JS to have (more) objects with value not reference semantics, which requires non-extensibility.
>> 
>> Indeed.
>> 
>>> 
>>> If I have followed your messages correctly, this is because you think non-extensibility is a rare case that should not proliferate. 
>> 
>> I have two points here:
>> 
>> - Typed arrays already have so much observable objectyness that making them non-extensible feels arbitrary; this is true regardless of the prevalence, or lack thereof, of non-extensibility.
> 
> Ok, I acknowledge this point.
> 
> And yet SpiderMonkey had native typed arrays from the get-go, non-extensible -- we didn't use WebIDL. So the interoperable intersection semantics developers can count on does not include extensibility. As Mark says, this allows us to standardize either way, so we need arguments that don't appeal to "feelings".

This is a good point.

> 
>> - At the same time, I do think that non-extensibility is a rare case and I don't like it.
> 
> I can tell ;-). Feelings are important but to decide on a spec we will need stronger reasons.

I agree.  I'm assuming that in the grand scheme of things, specs improve when people articulate gut feelings and we reach something in the middle.

> 
>>> But with ES5 Object.preventExtensions, etc., the horse is out of the barn.
>> 
>> It's there and we have to support it, and the fact that you can do preventExtensions() to an object is a good thing. That doesn't mean it should become the cornerstone for every new feature. If a user wants to preventExtensions() on their object, then that's totally cool - and I'm not arguing that it isn't.
>> 
>> The argument I'm making is a different one: should an object be non-expandable by default?
>> 
>> I keep hearing arguments that this somehow makes typed arrays more efficient. That's like arguing that there exists a C compiler, somewhere, that becomes more efficient if you label your variables as 'register'.
> 
> I remember when that indeed mattered.
> 
>> It's true that if you're missing the well-known optimization of register allocation then yes, 'register' is an optimization. Likewise, if you're missing the well-known object model optimizations like pointer displacement, BBOP's, or other kinds of side tables, then forcing objects to be non-extensible is also an optimization. That doesn't mean that we should bake it into the language. VM hackers can just implement these well-known optimizations and just deal with it.
> 
> Ok, let's let the performance argument rest. You can be Ishmael and live. I'm Ahab and I still stab at such nearly-free-lunch, "sufficiently smart compiler" claims :-).
> 
>>> At a deeper level, the primitives wired into the language, boolean number string -- in particular number when considering int64, bignum, etc. -- can be rationalized as value objects provided we make typeof work as people want (and work so as to uphold a == b && typeof a == typeof b <=> a === b).
>> 
>> I think making int64/bignum be primitives is fine. My only point is that whether or not you make them expandable has got nothing to do with how much memory they use.
>> 
>>> 
>>> This seems more winning in how it unifies concepts and empowers users to make more value objects, than the alternative of saying "the primitives are legacy, everything else has reference semantics" and turning a blind eye, or directing harsh and probably ineffective deprecating words, to Object.preventExtensions.
>> 
>> Well this is all subjective. Objects being expandable by default is a unifying concept.
> 
> It does not unify number, boolean, string.

True.

> 
> What's not subjective is that we have two concepts in JS today, one (ignoring null and undefined) for primitive AKA value types, the other for reference types (objects). I see a way to extend object as a concept to subsume value types, although of course unity comes at the price of complexity for object. But non-extensibility is a piece of complexity already added to object as a concept by ES5.
> 
> Irreducible complexity here, and perhaps "subjective" or (I prefer) "aesthetic" judgment is the only way to pick.

Is it clear that we *can't* have a better story for value types?  I just don't think that non-extensibility is sufficient.

OK, so let's back up.  Do you believe that making an object non-extensible is sufficient to make it a "value type"?  I don't.  You probably need some other stuff.
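
Here's a quick sketch of why I don't think non-extensibility alone gets you there:

    var a = Object.preventExtensions({ x: 1 });
    var b = Object.preventExtensions({ x: 1 });
    a === b;   // false - still reference identity, not value identity
    a.x = 2;   // still allowed - non-extensible is not frozen, so state is mutable
    a.y = 3;   // the only thing you lose: adding new properties
               // (ignored here, a TypeError in strict mode)

Two non-extensible objects with the same contents are still distinct, mutable references.  So non-extensibility by itself doesn't buy you value semantics - you need the "other stuff" too.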

This is where I return to the objectyness point: typed arrays are already spec'd to have a bunch of heavy reference-to-object behavior.  So making them expandable is no big deal.  And making them non-expandable means that we'll now live in a weirdo world where we have four different concepts of what it means to be a value:

A) Full blown reference objects that you can do weird things to, like add properties and change __proto__, etc.  You can also make one non-extensible at your discretion, which fits into the bat-poop crazy "you can do anything" philosophy of full blown objects.  And that's great - that's the heart of the language, and I happen to enjoy it.

B) Object types that are always non-extensible but otherwise still objecty - they have a prototype that is observable, they reveal their identity via ==, and you can actually inject stuff into them by modifying the appropriate prototype.

C) Values with whatever value type semantics we come up with in the future.

D) Primitives.

Now, I hope that we could get C and D to be as close as possible to each other.  But that still leaves three different behaviors.  This introduces a learning curve.  That's why (B) offends me.  It's subtly different from (A) and clearly different from either (C) or (D).
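
To make (B) concrete, here's roughly what a non-extensible-but-otherwise-objecty typed array looks like to a program (the sum method is just something I'm making up for illustration):

    var a = new Float32Array(4);
    var b = new Float32Array(4);
    a === b;                                              // false: reference identity
    Object.getPrototypeOf(a) === Float32Array.prototype;  // true: the prototype is observable
    Float32Array.prototype.sum = function () {            // ...and you can inject behavior
      var s = 0;                                          // into every instance through it
      for (var i = 0; i < this.length; i++) s += this[i];
      return s;
    };
    a.sum();     // 0 - the injected method shows up on existing instances
    a.note = 1;  // ...but under (B) this fails (silently, or a TypeError in strict mode)

That's an awful lot of reference-object behavior to still be carrying around if the goal was to make these things feel like values.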

Now, we actually also have a totally alternate behavior, used by binary data.  And my argument there is that I wouldn't get too offended by binary data acting weird, because the very notion of exposing binary data is weird to begin with.  I expect it to be used only for special graphicsy stuff and not for general-purpose "value types" in normal JS programs.  So it's OK to me if binary data is both weird and inconsistent with everything else.  And no, I still don't view "typed arrays" as being part of binary data - it already appears to be the case that typed arrays have different buffer behavior from the struct types.  So they're just different.  And that's fine.

> 
>> The only thing that expandability of typed arrays appears to change is the interaction with binary data - but that isn't exactly a value object system as much as it is a playing-with-bits system. I'm not sure that having oddities there changes much.
> 
> Sure, let's get back to binary data (I brought up value objects because you brought up int64).
> 
> Interior binary data objects will be cons'ed up upon extraction, so distinguishable by == returning false and by lack of expando preservation. Niko, Dmitry, and others take this as a sign that expandos should not be allowed, leaving only == returning false among same-named extractions as an oddity. And they further conclude that expandos should not be allowed on any binary data object (whether interior extracted, or not).
> 
> You argue on the contrary that JS objects in general can be extended with expandos, so why restrict binary data objects, even interior ones that are extracted? Let each such extracted interior object be != with all other same-named extractions, and let each have expandos assigned that (vacuously) won't be preserved on next extraction.
> 
> I hope I have stated positions accurately.

Yup!

> If so I'll tag out of the ring, in hopes of someone else bringing new arguments to bear.
> 
> /be
