What is an array?

Allen Wirfs-Brock allen at wirfs-brock.com
Thu Mar 31 11:02:02 PDT 2011


On Mar 31, 2011, at 9:04 AM, David Bruant wrote:

> On 31/03/2011 01:37, Allen Wirfs-Brock wrote:
>> 
>> 
> Is there already a deadline for when proxies have to/should/may be frozen?

Not clear yet.  We probably have about a year to polish them in TC39. But sooner is better, especially if we expect to see other early browser implementations beyond Mozilla's current experiment.

> 
>> On Mar 30, 2011, at 3:30 AM, David Bruant wrote:
>> 
>> (...)
>> ...
>> Using a single characteristic test (such as [[Class]]=="Array") to also imply other uniquely associated characteristics is fine in a closed, self-contained specification like ES5 pretends to be.  But in reality, the ES5 spec. is just the foundation spec. of an extension system (for example, via host objects) and as soon as you start defining extensions you can no longer assume unique associations of disjoint characteristics.
> Actually, there are some "protective" mechanisms enforced by the spec that allow you to assume such associations. For instance, host objects aren't allowed to set [[Class]] to "Array". Relying on this property, the spec can decide to only rely on [[Class]] to define what an Array instance is or is not and make associations between [[Class]] and other characteristics.
> [[Class]] doesn't seem really right, but this "exclusion" mechanism could be reused to assume unique associations of disjoint characteristics.

The prohibition of host objects reassigning specified [[Class]] values is new in ES5.  Prior to that, host objects were allowed to arbitrarily use class values such as "Array".  That change was made exactly for the reason you note.  However, the uses of [[Class]]=="Array" in the spec. are still too general.  It is a good idea to prevent a host object from lying about representation invariants that the runtime depends upon.  However, I don't see why a host object should be blocked from defining an object that JSON.stringify will output using "[ ]" notation.
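
A minimal sketch of the coupling described above, assuming an ES5-or-later engine where [[Class]] can be observed through Object.prototype.toString. The helper name classOf and the sample objects are hypothetical, chosen only for illustration. It shows that an object with indexed properties and a length is still serialized with "{ }" notation because it fails the array test:

```javascript
// Observe the [[Class]]-style tag through Object.prototype.toString.
// classOf is a hypothetical helper name, not part of any spec.
function classOf(value) {
  return Object.prototype.toString.call(value).slice(8, -1);
}

const realArray = ["a", "b"];
// An array-like object: indexed properties plus length, but not an Array.
const arrayLike = { 0: "a", 1: "b", length: 2 };

console.log(classOf(realArray)); // "Array"
console.log(classOf(arrayLike)); // "Object"

// JSON.stringify picks "[ ]" or "{ }" notation based on the array test,
// not on whether the object has indices and a length.
const arrayJson = JSON.stringify(realArray);
const arrayLikeJson = JSON.stringify(arrayLike);
console.log(arrayJson);     // ["a","b"]
console.log(arrayLikeJson); // {"0":"a","1":"b","length":2}
```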

> 
> For instance, one such issue is "should we enforce proxy property descriptors to look like current property descriptors ('enumerable', 'configurable'...) on the defineProperty and get{Own}PropertyDescriptor traps?". Is this the kind of issue you'd consider in the "principled foundations" you mentioned?
> 

Yes, these are the sort of issues we should be looking at.  Also, this is probably closely related to things that might appear in your hypothetical "What is an object" thread.


> 
> 
>>> But recently, I have created the ProxyArray library (https://github.com/DavidBruant/ProxyArray) where, on a proxy, I have recreated the internal array behavior and special relationship between 'length' and numerical indices. Leaving aside the fact that no ES engine could determine whether a proxy handler has the exact same semantics as actual array internal methods, my library doesn't create Arrays anyway. One of the reasons is that, for instance, even if the engine could tell that the semantics is the same, it would certainly not apply Array-specific optimizations.
>> 
>> I don't think such optimizations should be brought into this discussion.  Not all implementations optimize Arrays.  Some implementations optimize array-index properties on all objects, not just Array objects.  Regardless, such optimizations are supposed to be transparent (other than for performance impacts) and you can't depend upon optimizations being the same when you move between implementations or even versions of the same implementation.
> I didn't mean to talk about optimisations for their own sake, but rather to illustrate the fact that even though some proxies may have semantically equivalent internal methods, an engine may not consider them as such anyway (if it could figure out the equivalence, which it can't). One of the consequences would be the inability to provide array-like optimizations. The point of the illustration was to say that there is something other than just the internal semantics that defines what an Array is. I may be wrong.

I think that if you decide to use a MOP-level extension mechanism (ie, Proxy) then you are also accepting that there probably will be some performance hit relative to an equivalent native implementation.  But presumably the reason you need to define something at that level is because there isn't a native implementation that meets your requirements.  In a real application, nobody should re-implement Array using proxies just because it is possible.
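
For readers who want to see the kind of object being discussed, here is a rough sketch of a proxy that reproduces the relationship between 'length' and array-index properties. It is not the ProxyArray code: it assumes the present-day `new Proxy(target, handler)` API, which postdates the proposal discussed in this thread, and the names createArrayLike and isArrayIndex are hypothetical. It deliberately covers only the set path.

```javascript
// Hypothetical helper: true for canonical array-index strings ("0", "7", ...),
// i.e. integers in the range 0 .. 2^32 - 2 written without leading zeros.
function isArrayIndex(key) {
  return typeof key === "string" &&
    String(key >>> 0) === key &&
    (key >>> 0) !== 0xFFFFFFFF;
}

// Hypothetical factory: an ordinary object wrapped in a proxy whose set trap
// keeps 'length' and the indexed properties consistent, as Arrays do.
function createArrayLike() {
  const target = { length: 0 };
  return new Proxy(target, {
    set(obj, key, value) {
      if (key === "length") {
        const newLength = value >>> 0;
        if (newLength !== Number(value)) {
          throw new RangeError("Invalid array length");
        }
        // Shrinking length deletes every index at or beyond the new length.
        for (const k of Object.keys(obj)) {
          if (isArrayIndex(k) && (k >>> 0) >= newLength) delete obj[k];
        }
        obj.length = newLength;
        return true;
      }
      obj[key] = value;
      // Writing past the end grows length to index + 1.
      if (isArrayIndex(key) && (key >>> 0) >= obj.length) {
        obj.length = (key >>> 0) + 1;
      }
      return true;
    },
  });
}

const list = createArrayLike();
list[2] = "c";
console.log(list.length);         // 3
list.length = 1;
console.log(2 in list);           // false
console.log(Array.isArray(list)); // false
```

The last line is the point of the exchange: the object follows the length/index contract, yet the engine's array test still rejects it.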

Array "subclassing" may be a different matter that probably shouldn't require a MOP-level extension, but here we need to first resolve which Array characteristics are actually desired for such "subclassing".

Finally, I still think you are making incorrect generalizations about how various implementations optimize arrays.

Regardless, we need to be careful about making language design decisions based upon 2011 optimization practices.   There is a virtuous circle between what language features developers use and what features language implementers optimize.  Over time, the possible optimizations that get implemented tend to be those that will have a positive effect on real applications.  If developers never use a feature it will probably never get optimized (unless it is used in a benchmark...).  Occasionally, developers need to be brave and start using new language features in order to show that the features are worth optimizing.

> 
>> Other than that, your ProxyArray looks like a fine implementation of what the spec. expects except for two things 1) you can't set [[Class]]==Array which isn't your fault, and 2) you should be setting the [[Prototype]] of your objects via Object.getPrototypeOf([ ]) rather than Array.prototype because Array may not be its original built-in value (of course you would also need to cache Object.getPrototypeOf so it amounts to about the same thing.)
> You're perfectly right on both points. There are also a couple of other issues like the fact that I have not implemented ToUInt32 and other minor issues like that. The main point of the experiments was to prove that you can build something that is equivalent to Array instances from the semantics point of view but that the engine (or even the spec if it were specified as such) would not consider as an Array instance.
> 
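
A small sketch of point 2 above, under the assumption that some other script may have replaced the Array binding. The local shadowing inside the function only simulates that situation, and all names here are hypothetical:

```javascript
// Cache the built-in early, before other code has a chance to replace it.
const cachedGetPrototypeOf = Object.getPrototypeOf;
const originalArrayPrototype = cachedGetPrototypeOf([]);

function lookupsWithReplacedArray() {
  // Shadowing stands in for a script having reassigned the global Array.
  const Array = function FakeArray() {};
  return {
    viaBinding: Array.prototype,          // follows the replaced binding
    viaLiteral: cachedGetPrototypeOf([]), // follows the built-in array literal
  };
}

const lookups = lookupsWithReplacedArray();
console.log(lookups.viaBinding === originalArrayPrototype); // false
console.log(lookups.viaLiteral === originalArrayPrototype); // true
```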
>>> ...
>>> In my opinion, at creation (direct or indirect "new Array()"), arrays should internally contain a marker saying they've been created with the built-in Array constructor. I think that the intention behind [[Class]] was to have such a marker. But in a way, it's "just a string" and we could imagine a world where the [[Class]] was settable on some objects. Nevertheless, even setting [[Class]] to "Array" on such objects wouldn't make them Arrays just because some string has the right value.
>> 
>> I'm leaning in a similar direction, but I don't think it should be tied to a specific constructor.  That isn't extensible enough. Actually, I think there is a potential for two orthogonal markers.  One corresponds to the internal MOP contract I discussed above.  It defines how the object supports various language operations such as property access.  The other is the internal representation marker used to match built-in methods to internal object representations.  I think the MOP contract is probably useful to expose.  I'm less sure about the other as presumably only an implementation can define built-in methods and alternative representations at that level.  If you need to do something comparable at the Proxy level you could simply invent your own internal (to the handler) tagging scheme to make sure your methods were only applied to the right kind of Proxy object.
> If I follow you, the internal [[Class]] was actually an attempt to unify the two markers you describe, wasn't it?

I think that is largely true, particularly if you go back and look at the ES1 spec.  However, even in it there were some spec. bugs that related to this sort of classification problem.   For example, there are built-in methods that require their this object to have a specific [[Class]] value, presumably to ensure availability of certain internal state.  Yet, ES1 allowed host objects to claim the same [[Class]] value without necessarily providing that state.
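
To make the two ideas concrete, here is a hedged sketch. The first part shows a built-in method rejecting a this object that lacks the required internal state. The second part shows one possible private tagging scheme of the kind suggested earlier in this message. It assumes WeakSet and `new Proxy`, both of which postdate this thread, and the names tagged, createTagged and taggedSize are hypothetical.

```javascript
// Part 1: a built-in that checks its this value for internal state.
let builtinRejected = false;
try {
  Date.prototype.getTime.call({}); // a plain object has no time value
} catch (err) {
  builtinRejected = err instanceof TypeError;
}
console.log(builtinRejected); // true

// Part 2: a private tag, invisible to other code, playing the same role
// for user-defined proxies.
const tagged = new WeakSet();

function createTagged() {
  const proxy = new Proxy({ items: [] }, {});
  tagged.add(proxy); // record the proxy itself, not its target
  return proxy;
}

function taggedSize(obj) {
  if (!tagged.has(obj)) {
    throw new TypeError("taggedSize called on an incompatible object");
  }
  return obj.items.length;
}

console.log(taggedSize(createTagged())); // 0
```

An object that merely has the same shape, such as `{ items: [] }`, is rejected by taggedSize because it was never added to the set.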

Looking at ES1 and then at ES2-5, it appears that most of the higher level uses of [[Class]] (eg, [].concat, JSON.stringify, etc.) were added in later editions and probably would have been better accomplished without reference to [[Class]].

> 
>> This over specification limits extensibility.  For example, String concat uses an array Class
> Do you mean Array.prototype.concat (step 5.b)?
yes, sorry.
> 
>> check to determine whether an argument should be concatenated as a single object or whether it should be exploded into constituent values. Why is this necessarily tied to array-ness?  Why can't a user define a collection abstraction that explodes when concatenated?
> (there is a typo in the strawman. You wrote "Function.prototype.concat")
fixed
> I think you're right. However, I think that at the time the function was written, there was no way to internally distinguish between a collection abstraction and an object not used as a collection abstraction. Maybe solving the subclassing problem in the spec will be an occasion to make this distinction?

Except, I don't think this is a subclassing issue.  Instead, it is a matter of how situationally behaviorally polymorphic objects are identified.  I don't think there is a general solution to this problem, but instead something that needs to be addressed in the context of specific use cases. 
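
The concat behavior in question can be seen in a few lines. The first two calls reflect the ES5 rule under discussion. The third relies on Symbol.isConcatSpreadable, which was added in ES2015 and so did not exist when this message was written; it is shown only as one later, use-case-specific answer. The object names are hypothetical.

```javascript
const arrayLike = { 0: "c", 1: "d", length: 2 };

// A real array argument is exploded into its elements.
const flattened = ["a"].concat(["b"]);
console.log(flattened.length); // 2

// An array-like argument is appended as a single object.
const nested = ["a"].concat(arrayLike);
console.log(nested.length);           // 2
console.log(nested[1] === arrayLike); // true

// ES2015 and later: an object can opt in to being exploded by concat.
const spreadable = {
  0: "c",
  1: "d",
  length: 2,
  [Symbol.isConcatSpreadable]: true,
};
const exploded = ["a"].concat(spreadable);
console.log(exploded.join(",")); // a,c,d
```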

Allen

