possible excessive proxy invariants for Object.keys/etc??

Allen Wirfs-Brock allen at wirfs-brock.com
Tue Nov 20 16:06:27 PST 2012


and my first reply, direct to es-discuss...

On Nov 20, 2012, at 2:32 PM, Allen Wirfs-Brock wrote:

> 
> 
> Begin forwarded message:
> 
>> From: Tom Van Cutsem <tomvc.be at gmail.com>
>> Date: November 20, 2012 11:36:24 AM PST
>> To: Allen Wirfs-Brock <allen at wirfs-brock.com>
>> Cc: "Mark S. Miller" <erights at google.com>, Jason Orendorff <jorendorff at mozilla.com>
>> Subject: Re: possible excessive proxy invariants for Object.keys/etc??
>> 
>> 2012/11/19 Allen Wirfs-Brock <allen at wirfs-brock.com>
>> On Nov 19, 2012, at 10:04 AM, Tom Van Cutsem wrote:
>>> The copying is to ensure:
>>> a) that the result is an Array
>> 
>> Why is Array-ness essential?  It was what ES5 specified, but ES5 had a fixed implementations of Object.getOwnPropertyNames/keys so that requirement was simply a reflection of what the  spec's algorithms actually did.  Most places in the specification that consume an "array" only require array-like-ness, not an actual array.  Now that we're make the implementation of the getOPN/keys extensible, it would be natural to relax the requirement that they produce a  [[Class]]==="Array" object.
>> 
>> It's all a matter of developer expectations and how much leeway we have in changing the "return type" of existing built-ins.
>> In theory, it's a backwards-incompatible change. In practice, it may not matter.

It can't break existing code that doesn't use Proxies. Plus,  the default behavior is still to return a built-in Array instance.  For existing code to break it would have to:
1) be used in an environment where proxies existed and were being used
2) it would have to actually be reflecting upon such proxies
3) The proxy would have to return something other than a built-in array instance from the getOwnPropertyNames/keys trap.
4)  The code would actually have to be dependent upon the built-in array-ness (Array.isArray check, dependency upon the length invariant, etc.)  of such such a returned object.

It all seems quite unlikely to cause any compat issues for existing  and probably worth the risk.  Particularly if the alternative is to add an otherwise unnecessary array copying to every such proxy trap call. 

>>> b) that all the elements of the result are Strings
>> 
>> And presumably Symbols.  We have to also accommodate Symbols, at least for getOwnPropetyNames.  Regardless, why is this important?  More below...
>> 
>> Same argument as above.
>> 
>> I recall there was some concern about symbols showing up in existing reflection methods like Object.getOwnPropertyNames. Not sure how that got resolved. It's basically touching upon the same issue of whether we can change the return type of existing methods.

I'm assuming we are excluding symbols from [[Enumerate]] and [[Keys]] so it is only [[GetOwnPropertyNames]] that have this concern.

One way or another we have to be able to reflectively get a list of symbol keyed properties.

We can either:
1) include them in the Object.getOwnPropertyNames result.
2) Consider getOwnPropertyNames depreciated (it actually lives forever) and replace it with a new Object.getOwnPropertyKeys that includes non-private Symbol keys
3) Keep getOwnPropertyNames as is and add Object.getOwnPropertySymbols that only returns the non-private-Symbol keys.

1) has the greatest backward compat concerns, but is the simplest to provide and for ES programmers to deal with going forward.
2) Eliminates the compat risk but creates perpetual potential for a: used gOPNames when I should have used gOPK  hazard.
3) Eliminates the compat risk but means everybody who wants to deal with all own properties have to deal with two lists of keys.  They also loose relative insertion order info relating key/symbol property keys.

Either 2 or 3 could require widening the MOP which increases the possibility for incomplete, inconsistent, or otherwise buggy proxies.  Even if we expose multiple public functions like either 2 or 3 I would still prefer to only have a single getOwnProperyyKeys internal method/trap that can be filtered to provide just strings and/or symbols.  Overall, I'm more concerned about keeping the internal method/trap MOP width as narrow as possible than I am about the size of the public Object.* API

>>  
>> 
>>> c) to ensure the stability of the result.
>>> 
>>> You can think of a + b as implementing a type coercion of the trap result to "Array of String". This coercion is not too dissimilar from what the getOwnPropertyDescriptor has to do (normalization of the returned property descriptor by creating a fresh copy).
>> 
>> Yes, premature type coercion, in my opinion. Also, a classic performance mistakes made by dynamic language programmers:  unnecessary coercion of values that are never going to be access or redundant coercion checks of values that are already of the property type.  Why is it important to do such checks on values that are are just passing through these traps.  When and if somebody actually gets arounds to using one of the elements that are returned as a property key they will be automatically coerced to a valid value.  Why is it important that it happens any sooner than that.
>> 
>> I don't know if you are familiar with the work of Felleisen et al. on higher-order contracts. In that work, they use the notion of "blame" between different components/modules. Basically: if some module A receives some data from a module B, and B provides "wrong" data, then B should be assigned blame. You don't want to end up in a situation where A receives the blame at the point where it passes that "wrong" data into another module C.

Yes, but ES is rampant with this sort of potentially misplaced blame.  We can debate whether such proper "blame" assignment is important or not, but I do believe this sort of very low level MOP interface is a situation where you want to absolutely minimize none essential work.  I'd sooner have it be a little bit more difficult to track now Proxy based bugs then to impact the performance of every correctly implemented proxy in every correct program. BTW this is a general statement about the entire proxy MOP and not just about these particularly property key access traps.

>> 
>> The "early" type coercions at the exit of the traps can be thought of in these terms: the invariant checks will always blame the proxy handler for producing wrong data, at the point where the wrong data is produced and before it's exposed to client code. Clients are expecting ES5/6 objects to obey an implicit "contract".

We are defining that contract right now. It would be a mistake to over specify it in ways that may unnecessarily impact performance.


>>  
>> getOwnPropertyDescriptor is also on my list to talk to you about for similar reasons.  I don't see why the property descriptor needs to be normalized and regenerated (including converting accessor properties on the descriptor into data properties and requiring the [[Prototype]] to be object.  I have an alternative for [[GetOwnProperty]]/TrapGetOwnPropertyDescriptor that preserves the original object returned from the trap (while normalizing to a descriptor record for any internal calls on ordinary objects).  This preserves any exotic property descriptor object properties as is.  This seems like exactly the right things.  Such exotic property descriptor properties can only be meaningfully used by other proxy traps or application/library code that knows about them.  I don't see any need to normalize them into data properties or otherwise muck with what the actually implementor of the getOwnPropertyDescrptor trap choose to return. 
>> 
>> I this alternative implementation is, again, much simpler (and efficient) with fewer externally observable calls in its algorithm steps.  I sure you will want to review it so I will include it in the spec. draft.
>> 
>> It's all a matter of how much invariants we care to give up.

The only invariants we have general agreement regarding enforce are those with respect to the visible properties of frozen/sealed target objects.

I think enforcement of additional invariant that don't directly related to those specific integrity constrains shouldn't be added.

Even without them, the core of the language is still memory safe and interoperability implementable.  From the spec. I can tell you exactly what is supposed to happen for any of these situations.

>> 
>> I think it's risky to allow descriptors to be returned whose attributes may be accessors, thus potentially fooling much existing ES5 code that often does simple tests like:
>> 
>> if (!desc.writable && !desc.configurable) {
>>   var val = desc.value; // we can assume the property is immutable, so we can cache its value
>>   ...
>> }
>> 
>> You can no longer rely on the correctness of such code if desc can't be guaranteed to be a plain old object containing only data properties.

If you want to enforce this as an invariant (after you've already done the fast check to make sure that the target is an high integrity object) I'm fine with that.  But, please don't but more checks in the fast path where we don't have these integrity requirements.

Also, in any situation like the above if you really care you can always do a Object.getOwnPropertyDescriptor and verify that you really are dealing with a own data property. Make the client that has the requirement pay the most rather than everybody else who doesn't.


>>> 
>>> c) on the other hand is crucial for the non-configurability/non-extensibility checks mentioned below. It's no use checking some invariants on a data structure if that data structure can later still be mutated.
>> 
>> I don't see how an extra copy of the the result array makes any difference in this regard  (also see below).   The array returned from Object.getOwnPropertyNames is still mutable even after you make a copy. It's only as stable as any other non-frozen object you are likely to encounter. 
>> 
>> The returned copy, even if it is mutable, is fresh, meaning that *no one else* yet has a pointer to it. Hence it can't be modified by the proxy handler.
>> 
>> If we don't make a copy, but we do check the result for invariants, here's what may happen:
>> 1) proxy checks whether the returned array contains all non-configurable properties of the target
>> 2) let's assume this is the case, so the proxy returns this array to the client
>> 3) the proxy handler kept a reference to the array, and mutates it, e.g. by removing a non-configurable property
>> 4) client is now wrongly under the assumption that the returned array contains all non-configurable properties

and then...
How is this likely to be exploited?  But if is exploitable, then I'd be content if you only enforce it for the high integrity object situations.


>>  
>>>  
>>> 3) Every name in the list returned by the trap code is looked up on the target to determine whether or not it exists, even if the target is extensible.   Each of those lookup is observable (the target might itself be a proxy) so, according to the algorithm they all must be performed.
>>> 
>>> This is where we get into actual checks required to enforce non-configurability/non-extensibility.
>>> 
>>> Granted, the ES5 spec is not clear about the invariants on getOwnPropertyNames and keys. The currently specified invariants are a common-sense extrapolation of the existing invariants to cover these operations.
>> 
>> Yes, but if they aren't essential then we have to take into account the cost of doing non-essential invariant checks.  These costs are both the actual runtime cost of doing the checks, the constraints upon implementation imposed by the need for complete and properly sequenced occurrence of all potentially externally observable algorithm steps, and finally (and least important) the added complexity of the specification which makes it harder to actually achieve interoperable implementations.
>> 
>> I agree about the cost. The question is whether we regard these invariants as essential or not. To me, they are the common-sense invariants an ES5 programmer would expect.

They may be common sense but not necessarily essential.  There is a cost to all such enforcement.



>>  
>>  
>>> In practice, it determines the degree of confidence that a programmer can have in Object.getOwnPropertyNames and friends when dealing with a frozen object. If we waive these invariant checks, then the result of those operations can never be trusted on to reliably introspect on a frozen object's list of property names:
>>> 
>>> Object.isFrozen(proxy) // true
>>> Object.getOwnPropertyNames(proxy) // ['foo']
>>> Object.getOwnPropertyNames(proxy) // [ ]
>>> 
>>> Here, the 'foo' property apparently disappeared on a frozen object.
>> 
>> Yes, but copying the result an extra time doesn't ensure that two separate calls produce the same result. 
>> 
>> It's not the copying alone that guarantees it, it's the invariant check that 'foo' is in the result + the copying that guarantees it.
>>  
>>> 
>>> If neither the for-in loop nor Object.getOwnPropertyNames nor Object.keys can reliably report an object's own properties, then we've made it impossible to reliably traverse and inspect a presumably deep-frozen object graph.
>> 
>> If that is the goal, that maybe what you need is a new operation that cannot be trapped.  Your algorithm is essentially doing all the work, and more, necessary to generate an accurate report of the actual non-configurable properties of the target.  If that is what is needed for your deep frozen traversals, add that operation.  That seems better than allowing the Proxy trap to do the work to report anything it wants, but then coming in after the fact and checking whether what was report matches what you demand.  In this case, it seems preferable to just provide the results you demand without any handler intervention. 
>> 
>> This seems to harken back to the original proxy design that essentially turned frozen objects into regular objects with no handlers. If that is what you need, you should provide access to the results you need for the situations where you need them.
>> 
>> But you should not be adding this overhead for Proxy uses that don't have those requirements.
>> 
>> The problem with, say, a non-trappable [[GetNonConfigurableOwnProperties]] internal method, is that membranes can't adapt. We were previously in this situation with the [[Extensible]] internal property. Eventually we ended up trapping Object.isExtensible and Object.preventExtensions, even though these traps can't actually return a result that is different from applying the operation directly to the target. They're still useful to trap, as the proxy may then respond to these operations and change the state of its own target before returning a result.
>> 
>> I'm pretty sure introducing a new non-trappable method is going to lead to the same problems for membranes.

Then have the trap but always ignore its results and call a non-trapping operation on  the target object to get the actual list of names.  You integrity check are all about non-configurable properties. Normal, non-high integrity proxies want integrity checking algorithms that are proportional to the number of non-configurable properties (usually 0) rather than the total number of properties.

>>>  
>>> 4) Every own property of the target, is observably looked up (possibly a second time) even if the object is extensible  and has no non-configurable properties.
>>> 
>>> We may be able to remove the redundancy of two lookups by restructuring the algorithm.
>>> There previously was some redundancy in other checks as well.
>>>  
>>> It isn't clear to me if any of this work is really necessary to ensure integrity.  After all, what can you do with any of these names other than use them as the property key argument to some other trap/internal method such as [[SetP]], [[DefineOwnProperty]], etc.  Called on a proxy, those fundamental operations are going to enforce the integrity invariants of the actual properties involved so the get name checks doesn't really seem to be adding anything essential.
>>> 
>>> Perhaps we can just get rid of all the above checking.  It seems like a good idea to me.
>>> 
>>> Alternatively,  it suggests that a [[GetNonConfigurablePropertyNames]] internal method/trap would be a useful call to have as the integrity invariants only care about non-configurable properties. That would significantly limit the work in the case where there are none and limit the observable trap calls to only the non-configurable properties.
>>> 
>>> That would be one way to speed things up.
>>> 
>>> I share your concern that the current spec algorithm may be a bit too restrictive for implementations to allow optimizations. It's very imperative. We should ask the spec implementors to have a look at the draft spec and share their concerns.
>> 
>> Maybe we should move this discussion to es-discuss.  What do you think?
>> 
>> Yes, I would be happy to continue the discussion in public.
>> 
>> Cheers,
>> Tom
> 
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20121120/d80f36f6/attachment-0001.html>


More information about the es-discuss mailing list