Non-extensibility of Typed Arrays

K. Gadd kg at luminance.org
Wed Sep 4 22:11:34 PDT 2013


Did anyone address what should be done in the use case where it's necessary
for information to 'tag along' with an array or typed array, for interop
purposes? The existence of interior binary data objects seems to complicate
this further. I had previously suggested that a WeakMap allows attaching
information to a typed array even when it isn't extensible. If interior
objects lose identity, though, it becomes *literally impossible* for data
to follow an instance of Uint32Array (or whatever) around the runtime,
which is kind of troubling. Obviously I understand *why* this is the case
for interior objects.
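
To make the identity problem concrete, here's a sketch of the WeakMap
approach; the `metadata` map is mine, and `point.coords` stands in for a
hypothetical interior array inside a binary data struct:

    var metadata = new WeakMap();

    var view = new Uint32Array(16);
    metadata.set(view, { elementKind: 'rgba' }); // works even if 'view' is non-extensible
    metadata.get(view);                          // { elementKind: 'rgba' }

    // But if every access to an interior array conses up a fresh wrapper:
    //   metadata.set(point.coords, info);
    //   metadata.get(point.coords);  // undefined - new identity on each access

The trick only works while the key object keeps a stable identity.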

Is the meaning of an assignment to an interior object well specified? I
assume the data is copied from the source typed array into the interior
object.

I'm going to describe how I understand things today, and from that, how it
seems they could work:
At present, when you construct a typed array it is a view over a particular
buffer. You can construct an array with a size, `new Uint32Array(32)`, in
which case a buffer is allocated for you behind the scenes; or you can
construct an array from a buffer + offset/size pair in order to create a
view over a subregion of the buffer. In both cases the 'array' does not
actually represent or contain the data; it is merely a proxy of sorts
through which you can access elements of a particular type.
It is my understanding that this is the same for binary data types: you can
construct a heap instance of one, in which case it has an invisible backing
buffer, or you can 'construct' one from an existing buffer + offset, in
which case it is more like a proxy that represents the given data type at
the given offset in the buffer; when you manipulate the proxy, you are
manipulating the contents of the buffer.
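
Concretely, for typed arrays (this is just the existing API):

    var fresh = new Uint32Array(32);        // a 128-byte buffer is allocated behind the scenes
    var buf = new ArrayBuffer(256);
    var sub = new Uint32Array(buf, 64, 16); // a view over bytes 64..127 of buf

    sub[0] = 0xdeadbeef;                    // writes through to buf; 'sub' holds no data itself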

In both cases, I believe it is consistent that these objects are all
'views' or 'proxies', not actual data. The fact that you can create an
instance directly creates the *illusion* of them being actual data, but in
every case it is possible for multiple instances to share the same backing
store without sharing referential identity (via ===).
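
For example:

    var buf = new ArrayBuffer(16);
    var a = new Uint32Array(buf);
    var b = new Uint32Array(buf);

    a === b;   // false - two distinct view objects
    a[0] = 42;
    b[0];      // 42 - but they share the same backing store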

In both cases, I don't believe a user should expect that attaching an
expando to one object instance will modify the expandos on another object
instance. Given this, it seems perfectly reasonable to be able to attach
expandos to a typed array, and I've previously described why this use case
is relevant (interop between compilers targeting JS and hand-written JS,
for one). For example:
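
Here's the kind of tagging I mean (a sketch - the property names are made
up, and it assumes typed arrays are extensible, which today varies by
engine):

    var w = 640, h = 480;
    var pixels = new Uint32Array(w * h);
    pixels.format = 'RGBA8888'; // hand-written JS consuming this array can
    pixels.stride = w;          // discover how the compiler laid it out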

In the same sense, if typed arrays must be constructed to act as proxies
for the 'interior' arrays in a binary data type, being able to attach
expandos to them does not cause much harm, other than the fact that the
lifetime of the expando does not match the lifetime of the underlying
binary data. But this is already true for typed arrays, in a sense.

I think the best way to address the confusion of expandos on interior
arrays is simply non-extensibility, as has been discussed. I don't see why
non-extensibility for interior arrays requires crippling the functionality
of typed arrays in general, since JS already exposes 2-3 concepts in this
field (seal, freeze, preventExtensions) along with query methods to find
out whether those concepts apply to a given object (isSealed, isFrozen,
isExtensible).
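
That machinery already behaves sensibly for ordinary objects:

    var obj = {};
    Object.isExtensible(obj);      // true
    Object.preventExtensions(obj);
    Object.isExtensible(obj);      // false
    obj.expando = 1;               // silently ignored (TypeError in strict mode)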

If interior arrays are not extensible, I should hope that
Object.isExtensible on them returns false. If it were to return true when
they have no expando support, that would be incredibly confusing.

Anyway, given all this I would propose that the optimal solution (in terms
of usability, at least - can't speak for the performance consequences) is
for typed arrays to be extensible by default, as they are Objects that
point to underlying sequences of elements, just like Array. This gives good
symmetry and lets you cleanly substitute a typed array for an Array in more
cases (resizability and mixed types being the big remaining differences).
In cases where extensibility is a trap for the unwary or actively
undesirable, like interior objects, the instance should be made
non-extensible. This allows all end user code to handle cases where it is
passed an interior array or object without reducing the usefulness of typed
arrays.

FWIW, I would also argue that a free-standing instance of any Binary Data
type (one you construct with new, not over an existing buffer) should
maybe be extensible by default as well, even if 'interior' instances are
not. However, making binary data types always non-extensible wouldn't
exactly break any compatibility or use cases, since they're a new feature -
but it does mean we now have to add checks for extensibility/typeof in more
cases, which is awful... Generic tagging code ends up looking like the
sketch below.
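
A sketch of what I mean, assuming a WeakMap fallback for non-extensible
objects:

    var sideTable = new WeakMap();

    function attach(obj, key, value) {
      if (Object.isExtensible(obj)) {
        obj[key] = value;
      } else {
        // fall back to a side table - and hope 'obj' keeps a stable identity
        var bag = sideTable.get(obj);
        if (!bag) sideTable.set(obj, bag = {});
        bag[key] = value;
      }
    }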

(A related area where this is a big problem for me and authors of similar
packages is emulating the Java/C# 'getHashCode' pattern, where every
object has an associated stable hash code. Implementing this often requires
attaching the computed hash to the object as an expando, or via some other
association like a WeakMap. I think interior objects in binary data break
this fundamentally, which is painful.)
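
For reference, a sketch of the usual hash-code caching pattern:

    var hashCodes = new WeakMap();
    var nextHash = 1;

    function getHashCode(obj) {
      var hash = hashCodes.get(obj);
      if (hash === undefined) {
        hash = nextHash++;
        hashCodes.set(obj, hash); // relies on 'obj' having a stable identity
      }
      return hash;
    }

Whether the hash lives on the object as an expando or in a WeakMap, it only
works if the object's identity is stable - which is exactly what interior
objects lack.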


On Wed, Sep 4, 2013 at 7:29 PM, Filip Pizlo <fpizlo at apple.com> wrote:

>
> On Sep 4, 2013, at 5:25 PM, Brendan Eich <brendan at mozilla.com> wrote:
>
> Filip Pizlo wrote:
>
> Typed arrays have both of these properties right now and so expandability
> is a free lunch.
>
>
> The last sentence makes a "for-all" assertion I don't think
> implementations must be constrained by.
>
>
> How so? It is true that some VM implementations will be better than
> others. But ultimately every VM can implement every optimization that every
> other VM has; in fact my impression is that this is exactly what is
> happening as we speak.
>
>
> My "for-all" referred to all typed arrays across all VMs, not just all VMs.
>
> Also just as a point of fact (something "done", the Latin root means
> "deed"), I do not see the same optimizations being used in all VMs. For
> example, SpiderMonkey's TI (written up here:
> http://rfrn.org/~shu/drafts/ti.pdf [PLDI 2012]) is not being used
> elsewhere AFAIK -- please correct me if I'm mistaken.
>
>
> Interesting point.  Equivalent optimizations are being done.  Other VMs
> also infer types one way or another.  And I'd argue that my way of
> inferring types is the best - it incurs smaller overheads for start-up
> while achieving more precise results.  (Of course I must say that - I stand
> by my stuff, heh.)  That being said, I do think that FF's TI is really cool
> and loved reading that paper.
>
> It's kind of like in JVMs, all of the big-league ones did speculative
> inlining - but they do it in radically different ways and rely on different
> kinds of feedback and if you go to a conference where JVM hackers show up,
> they will argue about which is best.  I have fond memories of Sun vs. IBM
> vs. Oracle shouting matches about how you do deoptimization, whether you do
> deoptimization at all, and what you need to analyze and prove things about
> the class hierarchy.  That doesn't change the basics: they all do
> speculative inlining and it performs sort of the same in the end.
>
> I suspect that the same thing is becoming true of typed arrays, regardless
> of whether they are extensible or not.  I guess that when I said "every
> optimization that every other VM has" I didn't mean literally using the
> same exact algorithm - just performing optimizations that achieve
> equivalent results.
>
>
> So, it doesn't make much sense to make language design decisions because
> it might make some implementor's life easier right now. If you could argue
> that something will /never/ be efficient if we add feature X, then that
> might be an interesting argument. But as soon as we identify one sensible
> optimization strategy for making something free, I would tend to think that
> this is sufficient to conclude that the feature is free and there is no
> need to constrain it. If we don't do this then we risk adding cargo-cult
> performance features that rapidly become obsolete.
>
>
> I agree that's a risk. I'm also with Niko in wanting to argue about what
> the semantics should be without appealing to performance arguments.
>
>
> Right!  I guess my first order argument is that performance *isn't* an
> argument in favor of non-expandability.
>
>
> However, I still think you are verging on promising a free lunch. All
> methods in C++ cannot affordably be virtual. Expandos in JS cost. At fine
> enough grain, even pretty-well-predicted branches cost. Nothing is
> free-enough to discount forever in my bitter and long experience :-P.
>
>
> I am promising a free lunch!  Virtual methods in C++ are only expensive
> because C++ still doesn't have feedback-driven optimization.  JVMs make
> them free in Java.  And they are free.  Period.  There is no upside to
> marking a method final in Java.  I am arguing that expandos are similar.
>
>
> The lack of static types in JS does not mean exactly one implementation
> representation must serve for all instances of a given JS-level
> abstraction. We already have strings optimized variously in the top VMs,
> including Chords or Ropes, dependent strings, different character sets, etc.
>
>
> Still find this discussion amusing? Here's the long story: it is these
> things that I list above that lead to a 16-byte overhead on 32-bit, and a
> 32-byte overhead on 64-bit, in the best "sane" case. Giving typed array
> objects expandability doesn't add to this overhead, because two of the
> fields necessary to implement the above (the type, and the buffer) can be
> displaced to point to property storage. Any imaginable attempt to
> reduce the overhead incurred by the information - using BBOP (big bag of
> pages) for the type, using an out-of-line weak map for the buffer or the
> type, encoding some of the bits inside the pointer to the typed array, etc.
> - can also be used to eradicate any space overhead you'd need for custom
> properties, so long as you're on board with the "free if unused,
> sub-optimal if you use them" philosophy.
>
>
> For something like decimal, it matters whether there's an empty side table
> and large-N decimal instances of total size N*S, vs. N*(S+K) for some
> constant K we could eliminate by specializing harder. Even better if we
> agree that decimal instances should be non-extensible (and have value not
> reference semantics -- more below).
>
>
> With a side table, the constant K = 0 even if you have custom properties.
> The table will only have an entry for those instances that had custom
> properties.
>
>
> I know, that's why I was attacking the non-side-table approach.
>
> But the side table has its own down-side trade-offs: GC complexity, even
> costlier indirection, and strictly greater implementation complexity. If
> one could implement without having to mess with this K ?= 0 design decision
> and hassle with packing or else using a side-table, one's VM would be
> smaller, simpler, less buggy -- all else equal.
>
>
> Meh, I'm just reusing the GC complexity that the DOM already introduces.
>
>
> Now you may say that I'm betraying my hero Mr. Spock, whom I have invoked
> to argue that implementors should sacrifice so the mass of JS users can
> live long and prosper.
>
>
> Yes, you are. ;-)
>
>
> And you'd have me dead to rights -- if I thought JS users wanted expandos
> on binary data, that the lack of expandos there was a problem akin to the
> whole starship being blown up. But I do not believe that's the case.
>
> If users don't care, then implementors should get a break and VMs should
> be simpler, ceteris paribus.
>
>
> Fair enough.
>
>
> - If the VM wants to go further and create immediate representations of
> some or all Int64's, similarly to what VMs do for JS small integers today,
> then the main problem you run into is object identity: does
> Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation
> of an Int64 class would say that this is false, since it's likely to
> allocate a new Int64 each time. But an immediate representation would have
> no choice but to say true. You can work around this if you say that the
> VM's implementation of Int64 operations behaves /as if/ the
> add()/sub()/whatever() methods used a singleton cache. You can still then
> have custom properties; i.e. you could do Int64(2).foo = 42 and then
> Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an
> immediate-int64-to-customproperties map on the side. That's kind of
> analogous to how you could put a setter on field '2' of Array.prototype and
> do some really hilarious things.
>
>
> The value objects proposal for ES7 is live, I'm championing it. It does
> not use (double-dispatch for dyadic) operators as methods. It does not use
> extensible objects.
>
> http://wiki.ecmascript.org/doku.php?id=strawman:value_objects
> http://www.slideshare.net/BrendanEich/value-objects
>
> Warning: both are slightly out of date, I'll be updating the strawman over
> the next week.
>
>
> Thanks for the links! To clarify, I'm not trying to make a counterproposal
> - the above was nothing more than a fun thought experiment and I shared it
> to motivate why I think that custom properties are free.
>
> My understanding is that you are still arguing that custom properties are
> not free, and that they incur some tangible cost in terms of space and/or
> time. I'm just trying to show you why they don't if you do the same
> optimizations for them that have become acceptable for a lot of other JS
> corners. Unless you think that ES should have an "ease of implementation"
> bar for features. I wouldn't necessarily mind that, but my impression is
> that this is not the case.
>
>
> I do think implementor ease, or really implementation simplicity, should
> be a concern. It's secondary, per Spock's Kobayashi Maru solution, to the
> needs of the many JS users. But it's not nothing. Part of the impetus for
> Dart, I'm told, is the complexity of V8 required by JS-as-it-is. Whatever
> the case there, standardized JS extensions should not add too much
> complexity if we can help it.
>
> I'll lay off performance concerns but you'll still see me, like Ahab
> lashed to the white whale, beckoning against free lunch arguments or
> anything near them :-P.
>
>
> My job is to give people a free lunch in the performance department.  So I
> live by free lunch arguments.
>
>
> With value objects, TC39 has definitely favored something that I think you
> oppose, namely extending JS to have (more) objects with value not reference
> semantics, which requires non-extensibility.
>
>
> Indeed.
>
>
> If I have followed your messages correctly, this is because you think
> non-extensibility is a rare case that should not proliferate.
>
>
> I have two points here:
>
> - Typed arrays already have so much observable objectyness that making
> them non-extensible feels arbitrary; this is true regardless of the
> prevalence, or lack thereof, of non-extensibility.
>
>
> Ok, I acknowledge this point.
>
> And yet SpiderMonkey had native typed arrays from the get-go,
> non-extensible -- we didn't use WebIDL. So the interoperable intersection
> semantics developers can count on does not include extensibility. As Mark
> says, this allows us to standardize either way, so we need arguments that
> don't appeal to "feelings".
>
>
> This is a good point.
>
>
> - At the same time, I do think that non-extensibility is a rare case and I
> don't like it.
>
>
> I can tell ;-). Feelings are important but to decide on a spec we will
> need stronger reasons.
>
>
> I agree.  I'm assuming that in the grand scheme of things, specs improve
> when people articulate gut feelings and we reach something in the middle.
>
>
> But with ES5 Object.preventExtensions, etc., the horse is out of the barn.
>
>
> It's there and we have to support it, and the fact that you can do
> preventExtensions() to an object is a good thing. That doesn't mean it
> should become the cornerstone for every new feature. If a user wants to
> preventExtensions() on their object, then that's totally cool - and I'm not
> arguing that it isn't.
>
> The argument I'm making is a different one: should an object be
> non-expandable by default?
>
> I keep hearing arguments that this somehow makes typed arrays more
> efficient. That's like arguing that there exists a C compiler, somewhere,
> that becomes more efficient if you label your variables as 'register'.
>
>
> I remember when that indeed mattered.
>
> It's true that if you're missing the well-known optimization of register
> allocation then yes, 'register' is an optimization. Likewise, if you're
> missing the well-known object model optimizations like pointer
> displacement, BBOP's, or other kinds of side tables, then forcing objects
> to be non-extensible is also an optimization. That doesn't mean that we
> should bake it into the language. VM hackers can just implement these
> well-known optimizations and just deal with it.
>
>
> Ok, let's let the performance argument rest. You can be Ishmael and live.
> I'm Ahab and I still stab at such nearly-free-lunch, "sufficiently smart
> compiler" claims :-).
>
> At a deeper level, the primitives wired into the language -- boolean,
> number, string; in particular number, when considering int64, bignum, etc.
> -- can be rationalized as value objects provided we make typeof work as
> people want (and work so as to uphold a == b && typeof a == typeof b <=>
> a === b).
>
>
> I think making int64/bignum be primitives is fine. My only point is that
> whether or not you make them expandable has got nothing to do with how much
> memory they use.
>
>
> This seems more winning in how it unifies concepts and empowers users to
> make more value objects, than the alternative of saying "the primitives are
> legacy, everything else has reference semantics" and turning a blind eye,
> or directing harsh and probably ineffective deprecating words, to
> Object.preventExtensions.
>
>
> Well this is all subjective. Objects being expandable by default is a
> unifying concept.
>
>
> It does not unify number, boolean, string.
>
>
> True.
>
>
> What's not subjective is that we have two concepts in JS today, one
> (ignoring null and undefined) for primitive AKA value types, the other for
> reference types (objects). I see a way to extend object as a concept to
> subsume value types, although of course unity comes at the price of
> complexity for object. But non-extensibility is a piece of complexity
> already added to object as a concept by ES5.
>
> Irreducible complexity here, and perhaps "subjective" or (I prefer)
> "aesthetic" judgment is the only way to pick.
>
>
> Is it clear that we *can't* have a better story for value types?  I just
> don't think that non-extensibility is sufficient.
>
> OK, so let's back up.  Do you believe that making an object non-extensible
> is sufficient to make it a "value type"?  I don't.  You probably need some
> other stuff.
>
> This is where I return to the objectyness point: typed arrays are already
> spec'd to have a bunch of heavy reference-to-object behavior.  So making
> them expandable is no big deal.  And making them non-expandable means that
> we'll now live in a weirdo world where we have four different concepts of
> what it means to be a value:
>
> A) Full blown reference objects that you can do weird things to, like add
> properties and change __proto__, etc.  You can also make one non-extensible
> at your discretion, which fits into the bat-poop crazy "you can do
> anything" philosophy of full blown objects.  And that's great - that's the
> heart of the language, and I happen to enjoy it.
>
> B) Object types that are always non-extensible but otherwise still objecty
> - they have a prototype that is observable, they reveal their identity via
> ==, and you can actually inject stuff into them by modifying the
> appropriate Object.prototype.
>
> C) Values with whatever value type semantics we come up with in the future.
>
> D) Primitives.
>
> Now, I hope that we could get C and D to be as close as possible to each
> other.  But that still leaves three different behaviors.  This introduces a
> learning curve.  That's why (B) offends me.  It's subtly different from (A)
> and clearly different from either (C) or (D).
>
> Now, we actually also have a totally alternate behavior, used by binary
> data.  And my argument there is that I wouldn't get too offended by binary
> data acting weird, because the very notion of exposing binary data is weird
> to begin with.  I expect it to be used only for special graphicsy stuff and
> not for general-purpose "value types" for normal JS programs.  So it's OK
> to me if binary data is both weird and inconsistent with everything else.
>  And no, I still don't view "typed arrays" as being part of binary data -
> it already appears to be the case that typed arrays have different buffer
> behavior to the struct types.  So they're just different.  And that's fine.
>
>
> The only thing that expandability of typed arrays appears to change is the
> interaction with binary data - but that isn't exactly a value object system
> as much as it is a playing-with-bits system. I'm not sure that having
> oddities there changes much.
>
>
> Sure, let's get back to binary data (I brought up value objects because
> you brought up int64).
>
> Interior binary data objects will be cons'ed up upon extraction, so
> distinguishable by == returning false and by lack of expando preservation.
> Niko, Dmitry, and others take this as a sign that expandos should not be
> allowed, leaving only == returning false among same-named extractions as an
> oddity. And they further conclude that expandos should not be allowed on
> any binary data object (whether interior extracted, or not).
>
> You argue on the contrary that JS objects in general can be extended with
> expandos, so why restrict binary data objects, even interior ones that are
> extracted? Let each such extracted interior object be != with all other
> same-named extractions, and let each have expandos assigned that
> (vacuously) won't be preserved on next extraction.
>
>
> I hope I have stated positions accurately.
>
>
> Yup!
>
> If so I'll tag out of the ring, in hopes of someone else bringing new
> arguments to bear.
>
> /be