What is an array?

David Bruant david.bruant at labri.fr
Thu Mar 31 09:04:21 PDT 2011


On 31/03/2011 01:37, Allen Wirfs-Brock wrote:
> David,
>
> That wiki page and the working document it links to were intended to
> stir up exactly these sorts of questions.  There are recurring
> discussions about various situations where it may be useful to
> categorize ECMAScript objects and how to go about creating objects
> that fit into various categories.  Proxies exacerbate these issues
> because they permit anyone to fairly easily create new kinds of
> objects with extraordinary behaviors.  Rather than just adding new ad
> hoc categorization tests or additional operations to manufacture
> specific "types" of objects I think we need to see if we can first
> establish some sort of principled foundation for doing so.  I also
> think it is important that we try to do this before Proxies are
> completely frozen as we need to make sure that they can support what
> we come up with.
I could not agree more. This is partially the reason why I experiment
with proxies and try to push the boundaries as far as I can.
Actually, one next step I was planning was to start a thread "what is an
object?", bring proxies to the table and ask what we didn't need to ask
ourselves before they were mentioned.
Is there already a deadline for when proxies have to/should/may be frozen?

> On Mar 30, 2011, at 3:30 AM, David Bruant wrote:
>
> (...)
> The ES definition of Array.isArray is an example of this.  It uses a
> [[Class]] test because it wanted array objects imported from different
> top level global contexts to pass the test, even if their
> [[Prototype]] value was different from that used in the current
> context.  The ES5 spec. doesn't even have the concept of multiple top
> level contexts, so there was no way to just say that.
That's very interesting. Maybe it should be noted somewhere on
strawman:multiple_globals ?
Since the point has been raised several times (and multiple globals have
actually been in web browsers since the beginning), it might be worth
starting to talk about the matter in the spec, maybe as notes. These notes
could be gathered in an Annex, as they are for strict mode.
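
For what it's worth, the cross-context situation Allen describes can be
sketched in a few lines. This is only an illustration under assumptions: it
uses Node.js's "vm" module to stand in for a second top level global (in a
browser, an iframe would play that role), and [[Class]] is observed
indirectly through Object.prototype.toString.

```javascript
// Assumption: Node.js, whose "vm" module can evaluate code in a fresh global.
const vm = require("vm");

// An array created by the Array constructor of another global context.
const foreignArray = vm.runInNewContext("[1, 2, 3]");

// Its [[Prototype]] is the other context's Array.prototype, so the
// prototype-based test fails in the current context...
const passesInstanceof = foreignArray instanceof Array;

// ...while the [[Class]]-based tests are context independent.
const passesIsArray = Array.isArray(foreignArray);
const classString = Object.prototype.toString.call(foreignArray);
```

Here the prototype-based test rejects the foreign array while Array.isArray
accepts it, which is the behavior the [[Class]] test was chosen to get.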


> [[Class]]=="Array" was picked as a testable characteristic  that
> should be context independent but the [[DefineOwnProperty]] definition
> would have also been just as good a test and arguably the test should
> be for all internal characteristics that define Array-ness. 
>
> Using a single characteristic test (such as [[Class]]=="Array") to
> also imply other uniquely associated characteristics is fine in a
> closed, self-contained specification like ES5 pretends to be.  But in
> reality, the ES5 spec. is just the foundation spec. of an extension
> system (for example, via host objects) and as soon as you start
> defining  extensions you can't any longer assume unique associations
> of disjoint characteristics.
Actually, there are some "protective" mechanisms enforced by the spec
that allow you to assume such associations. For instance, host objects
aren't allowed to set [[Class]] to "Array". Relying on this property,
the spec can decide to only rely on [[Class]] to define what an Array
instance is or is not and make associations between [[Class]] and other
characteristics.
[[Class]] doesn't really seem like the right property for this, but the
same "exclusion" mechanism could be reused to guarantee unique
associations of disjoint characteristics.



>> One thing that characterizes arrays is the internal connection between
>> "length" and numerical properties. This link can be found in the
>> specific [[DefineOwnProperty]]. But I think it may be too quick to
>> conclude that Arrays are the objects with this particular internal
>> function. One could imagine that one day, other objects have this
>> particular [[DefineOwnProperty]] internal method, but other [[Get]]
>> or [[Set]] or [[HasProperty]] internal methods. In my opinion, one
>> thing that makes an Array what it is is all internal methods and the
>> internal consistency they create altogether. It turns out they are
>> all the same as the Object ones and only [[DefineOwnProperty]] is
>> different, but still, the internal consistency we rely on when using
>> Arrays exists because of the conjunction of all internal properties.
>
> I agree.  It is the complete set of internal methods supported by an
> object that characterize its language-level behavior.
> In https://spreadsheets.google.com/ccc?key=0Ak51JfLL8QLYdFRCOXBRczJfRzNJSEk2eXptQ3BzalE&hl=en
> I've tried to categorize all objects specified by ES5 in this manner. I
> identified 14 "classes" of ES5 objects.
>
> One direction I'm thinking might be appropriate is to formalize these
> internal method groups, name each one, and specify a contract for
> each.   Since the internal methods (and their mapping to Proxy traps)
> are essentially the ES internal meta-object protocol, these named
> contracts would essentially classify all built-in object primitive
> behavioral variations and might be useful to reflect to the user level
> to identify such primitive behavioral variations.  Since Proxies have
> the power to define new variations you would expect a Proxy writer to
> either re-implement an existing contract or to define a new contract.
>  In either case, the Proxy instances could get tagged according to
> their contract.
The notion of "new contract" you mention is interesting. Following a
different path, I have reached that same notion when discussing the
proxy open issue ("How to deal with inconsistent data returned by
handler traps?"
https://mail.mozilla.org/pipermail/es-discuss/2011-March/013117.html).
Tell me if I misunderstand you, but the way I understand what you said
above ("Rather than just adding new ad hoc categorization tests or
additional operations to manufacture specific 'types' of objects I
think we need to see if we can first establish some sort of principled
foundation for doing so.") is that we should work out what capabilities
and what limitations proxies should have under these "new contracts".

For instance, one such issue is "should we require proxy property
descriptors to look like current property descriptors ('enumerable',
'configurable'...) in the defineProperty and get{Own}PropertyDescriptor
traps?". Is this the kind of issue you'd consider part of the "principled
foundation" you mentioned?
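
To make concrete what "look like current property descriptors" means, here
is a small sketch of the descriptor shape that the ES5 reflection functions
produce and consume on a built-in array. It only shows existing built-in
behavior; whether proxy traps must return this same shape is exactly the
open question. The variable names are mine.

```javascript
var list = [];

// Defining an index past the end goes through the array-specific
// [[DefineOwnProperty]], which also updates "length".
Object.defineProperty(list, "3", {
  value: "x",
  writable: true,
  enumerable: true,
  configurable: true
});

// Descriptors come back as plain objects with the four standard attributes.
var indexDesc = Object.getOwnPropertyDescriptor(list, "3");
var lengthDesc = Object.getOwnPropertyDescriptor(list, "length");
// "length" is now 4, writable, non-enumerable and non-configurable.
```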



>> But recently, I have created the ProxyArray library
>> (https://github.com/DavidBruant/ProxyArray) where, on a proxy, I have
>> recreated the internal array behavior and special relationship
>> between 'length' and numerical indices. Leaving aside the fact that
>> no ES engine could determine whether a proxy handler has the exact
>> same semantics as actual array internal methods, my library doesn't
>> create Arrays anyway. One of the reasons is that, for instance, even
>> if the engine could tell that the semantics are the same, it would
>> certainly not apply Array-specific optimizations.
>
> I don't think such optimizations should be brought into this
> discussion.  Not all implementations optimize Arrays.  Some
> implementations optimize array-index properties on all objects, not
> just Array objects.  Regardless, such optimizations are supposed to be
> transparent (other than for performance impacts) and you can't depend
> upon optimizations being the same when you move between
> implementations or even versions of the same implementation.
I didn't mean to talk about optimizations for their own sake, but rather
to illustrate the fact that even though some proxies may have
semantically equivalent internal methods, an engine may not consider them
as such anyway (even if it could figure out the equivalence, which it
can't). One of the consequences would be the inability to provide
array-like optimizations. The point of the illustration was to say that
there is something more than just the internal semantics that
defines what an Array is. I may be wrong.

> Other than that, your ProxyArray looks like a fine implementation of
> what the spec. expects except for two things 1) you can't set
> [[Class]]==Array which isn't your fault, and 2) you should be setting
> the [[Prototype]] of your objects via Object.getPrototypeOf([ ])
> rather than Array.prototype because Array may not be its original
> built-in value (of course you would also need to cache
> Object.getPrototypeOf so it amounts to about the same thing.)
You're perfectly right on both points. There are also a couple of other
issues, like the fact that I have not implemented ToUint32, and other
minor issues like that. The main point of the experiment was to prove
that you can build something that is equivalent to Array instances from
the semantics point of view but that the engine (or even the spec, if it
were specified as such) would not consider an Array instance.
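
A minimal sketch of the idea, with loud caveats: this is not the ProxyArray
code, it uses the `new Proxy(target, handler)` form that was standardized
later rather than the proxy API under discussion here, and it only handles
plain assignment (no defineProperty trap). The names makeArrayLike and
isArrayIndex are hypothetical.

```javascript
// Hypothetical helper: true if key is an array index in the ES5 15.4 sense.
function isArrayIndex(key) {
  if (typeof key !== "string") return false;
  var n = key >>> 0; // ToUint32
  return String(n) === key && n !== 0xFFFFFFFF;
}

// Hypothetical factory: a plain object behind a proxy that mimics the
// "length" / index relationship for assignments only.
function makeArrayLike() {
  // Allen's suggestion: take the prototype from an array literal.
  var target = Object.create(Object.getPrototypeOf([]));
  Object.defineProperty(target, "length", { value: 0, writable: true });

  return new Proxy(target, {
    set: function (t, key, value) {
      if (key === "length") {
        var newLen = value >>> 0;
        if (newLen !== Number(value)) throw new RangeError("Invalid length");
        // Shrinking deletes every index at or beyond the new length.
        Object.keys(t).forEach(function (k) {
          if (isArrayIndex(k) && (k >>> 0) >= newLen) delete t[k];
        });
        t.length = newLen;
      } else {
        t[key] = value;
        // Writing at or beyond the end grows "length".
        if (isArrayIndex(key) && (key >>> 0) >= t.length) {
          t.length = (key >>> 0) + 1;
        }
      }
      return true;
    }
  });
}

var fake = makeArrayLike();
fake[0] = "a";
fake[4] = "e";
var grown = fake.length; // "length" followed the highest index
fake.length = 2;         // truncation removes index 4
var stillThere = 4 in fake;
var recognized = Array.isArray(fake);
```

The object behaves like an array for these operations and even passes
`instanceof Array`, yet Array.isArray does not accept it, which is the gap
between "same semantics" and "is an Array" that the experiment was after.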


>> As a consequence, it encompasses things created after a call to
>> "Array()" (ES5 15.4.1.1 - Create and return a new Array object
>> exactly as if the standard built-in constructor Array was used in a
>> new expression with the same arguments (15.4.2).) or array literals
>> (ES5 11.1.4 Array Initialiser - semantics Step 1 of "ArrayLiteral : [
>> Elision_opt ]" and "ElementList : Elision_opt AssignmentExpression").
>>
>> In my opinion, at creation (direct or indirect "new Array()"), arrays
>> should internally contain a marker saying they've been created with
>> the built-in Array constructor. I think that the intention behind
>> [[Class]] was to have such a marker. But in a way, it's "just a
>> string" and we could imagine a world where the [[Class]] was settable
>> on some objects. Nevertheless, setting [[Class]] to "Array" on such
>> objects wouldn't make them Arrays just because some string has the
>> right value.
>
> I'm leaning in a similar direction, but I don't think it should be
> tied to a specific constructor.  That isn't extensible enough.
> Actually, I think there is a potential for two orthogonal markers.
>  One corresponds to the internal MOP contract I discussed above.
>  It defines how the object supports various language operations such
> as property access.  The other is the internal representation marker
> used to match built-in methods to internal object representations.  I
> think the MOP contract is probably useful to expose.  I'm less sure
> about the other as presumably only an implementation can define
> built-in methods and alternative representations at that level.  If
> you need to do something comparable at the Proxy level you could
> simply invent your own internal (to the handler) tagging scheme to make
> sure your methods were only applied to the right kind of Proxy object.
If I follow you, the internal [[Class]] was actually an attempt to unify
the two markers you describe, wasn't it?
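
As a small illustration of the second marker (the one matching built-in
methods to internal representations), here is a sketch using Date, where
ES5 specifies the check in terms of [[Class]]. The variable names are
mine.

```javascript
// An object that inherits from Date.prototype but was not created by the
// Date constructor, so it has no internal time value to read.
var impostor = Object.create(Date.prototype);

var rejected = false;
try {
  // The built-in method checks the internal marker, not the prototype.
  Date.prototype.getTime.call(impostor);
} catch (e) {
  rejected = e instanceof TypeError;
}

// A genuine Date instance carries the representation the method expects.
var accepted = typeof Date.prototype.getTime.call(new Date(0)) === "number";
```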


>>
>> Earlier, I said that it was a partial conclusion. I would like to
>> add Array subclasses to the definition. The "Standardizing __proto__"
>> thread led to the idea of standardizing a subclassing mechanism. In
>> my opinion, objects constructed as subclassed arrays should benefit
>> from the Array identity (return true to "Array.isArray") and as a
>> consequence benefit from Array extras natively (in ES5, applying
>> Array extras to non-Array objects is implementation-specific).
>
> You're mistaken about Array extras and non-Array objects.  In ES5 all
> Array.prototype methods are generic across all native objects.  The
> implementation dependency concerns host objects.
You're right. And it's not the first time I've made this mistake :-p
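
For the record, a quick sketch of what "generic" means here: an
Array.prototype method applied to a native object that is not an Array.
The object and variable names are mine.

```javascript
// A plain native object with a "length" and index properties.
var arrayLike = { length: 2, 0: "a", 1: "b" };

// Array.prototype.map only relies on "length" and the indexed properties
// of its this value, so it works on any native object.
var upper = Array.prototype.map.call(arrayLike, function (s) {
  return s.toUpperCase();
});

var sourceIsArray = Array.isArray(arrayLike); // the input is not an Array
var resultIsArray = Array.isArray(upper);     // the result is a real Array
```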

> I'm pretty sure that when people ask for "Array subclass" they are
> really asking for objects that support the array instance internal MOP
> contract.
I fully agree with you on this functional aspect.
Even if it is slightly irrelevant to the specification, but relevant to
its implementors, I'd like to point out that I think that when people ask
for an Array subclass, they also ask for internal array optimizations,
which means the same internal array representation.


> Something you didn't mention is most of the line items listed
> in http://wiki.ecmascript.org/doku.php?id=strawman:es5_internal_nominal_typing .
>  I contend that Array.isArray (or [[Class]]=="Array") is in most cases
> an over-specification where what is really desired is a more limited
> test for some finer grained characteristic.
I actually didn't make the connection :-)


> This over specification limits extensibility.  For example, String
> concat uses an array Class
Do you mean Array.prototype.concat (step 5.b)?

> check to determine whether an argument should be concatenated as a
> single object or whether it should be exploded into constituent
> values. Why is this necessarily tied to array-ness?  Why can't a user
> define a collection abstraction that explodes when concatenated?
(there is a typo in the strawman. You wrote "Function.prototype.concat")
I think you're right. However, I think that at the time the function was
written, there was no way to internally distinguish between a
collection abstraction and an object not used as a collection
abstraction. Maybe solving the subclassing problem in the spec will
be an opportunity to make this distinction?
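
The behavior in question can be sketched as follows. The first two calls
show the array-only "explosion" of Array.prototype.concat; the last part
assumes an engine implementing ES2015 or later, where an opt-in hook
(Symbol.isConcatSpreadable) exists. That hook postdates this discussion and
is shown only as one possible shape of the distinction. The variable names
are mine.

```javascript
var collection = { length: 2, 0: "b", 1: "c" };

// A true array argument is exploded into its elements...
var withArray = ["a"].concat(["b", "c"]);

// ...but an array-like object is appended as a single value.
var withObject = ["a"].concat(collection);

// Assumption: ES2015 or later. The object opts in to being exploded.
collection[Symbol.isConcatSpreadable] = true;
var optedIn = ["a"].concat(collection);
```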


> Thanks for thinking about this stuff.  I think it is useful to get
> feedback in this area.
Always a pleasure :-)

David

> Allen
