possible excessive proxy invariants for Object.keys/etc??

Allen Wirfs-Brock allen at wirfs-brock.com
Wed Nov 21 13:55:57 PST 2012


On Nov 21, 2012, at 12:09 PM, Tom Van Cutsem wrote:

>>> 2012/11/19 Allen Wirfs-Brock <allen at wirfs-brock.com>
>>> On Nov 19, 2012, at 10:04 AM, Tom Van Cutsem wrote:
>>> 
>>> It's all a matter of developer expectations and how much leeway we have in changing the "return type" of existing built-ins.
>>> In theory, it's a backwards-incompatible change. In practice, it may not matter.
> 
> It can't break existing code that doesn't use Proxies.
> 
> It's very likely that existing ES5 code (which doesn't use or know about proxies) will come to interact with ES6 proxies. Consider a sandboxed JS environment that provides a wrapped DOM (implemented using proxies). The ES5 code loaded into the sandbox doesn't actively "use" proxies; it will interact with proxies unknowingly. That's the kind of scenarios for which we should be careful with backwards-incompatible changes.

Presumably in this scenario it is the sandbox that is providing the Proxy so it can make to always return a built-in array if it is concerned about this.

The trade-off is always doing an extra copy of an object, just because of the unlikely possibility that a trap implementor doesn't use an actual built-in array.  

On the other hand, it seems very unlikely that realistic consumers of the existing Object.keys/Object.getOwnPropertyNames are actually doing an Array.isArray checks on the value produced from them. What would be the point?

>  
> 
>>>> b) that all the elements of the result are Strings
>>> 
>>> And presumably Symbols.  We have to also accommodate Symbols, at least for getOwnPropetyNames.  Regardless, why is this important?  More below...
>>> 
>>> Same argument as above.
>>> 
>>> I recall there was some concern about symbols showing up in existing reflection methods like Object.getOwnPropertyNames. Not sure how that got resolved. It's basically touching upon the same issue of whether we can change the return type of existing methods.
> 
> I'm assuming we are excluding symbols from [[Enumerate]] and [[Keys]] so it is only [[GetOwnPropertyNames]] that have this concern.
> 
> One way or another we have to be able to reflectively get a list of symbol keyed properties.
> 
> We can either:
> 1) include them in the Object.getOwnPropertyNames result.
> 2) Consider getOwnPropertyNames depreciated (it actually lives forever) and replace it with a new Object.getOwnPropertyKeys that includes non-private Symbol keys
> 3) Keep getOwnPropertyNames as is and add Object.getOwnPropertySymbols that only returns the non-private-Symbol keys.
> 
> 1) has the greatest backward compat concerns, but is the simplest to provide and for ES programmers to deal with going forward.
> 2) Eliminates the compat risk but creates perpetual potential for a: used gOPNames when I should have used gOPK  hazard.
> 3) Eliminates the compat risk but means everybody who wants to deal with all own properties have to deal with two lists of keys.  They also loose relative insertion order info relating key/symbol property keys.
> 
> Either 2 or 3 could require widening the MOP which increases the possibility for incomplete, inconsistent, or otherwise buggy proxies.  Even if we expose multiple public functions like either 2 or 3 I would still prefer to only have a single getOwnProperyyKeys internal method/trap that can be filtered to provide just strings and/or symbols.  Overall, I'm more concerned about keeping the internal method/trap MOP width as narrow as possible than I am about the size of the public Object.* API
> 
> Let's first discuss whether there is an issue with just letting (unique) symbols show up in Object.{keys,getOwnPropertyNames}.
> It's still a change in return type (from Array[String] to Array[String|Symbol]), but a far less innocuous one than changing Array[String] into "Any".

Wow, I actually think that the Array[String|Symbol] change is far more likely to have actual compatibility impact.  I can imagine various scenarios where is program or library is doing string processing on the elements returned from keys/getOPN.  That seems far more likely, then that they have dependencies upon the returned collection being Array.isArray rather than just Array-like.

> 
>>>  
>>> 
>>>> c) to ensure the stability of the result.
>>>> 
>>>> You can think of a + b as implementing a type coercion of the trap result to "Array of String". This coercion is not too dissimilar from what the getOwnPropertyDescriptor has to do (normalization of the returned property descriptor by creating a fresh copy).
>>> 
>>> Yes, premature type coercion, in my opinion. Also, a classic performance mistakes made by dynamic language programmers:  unnecessary coercion of values that are never going to be access or redundant coercion checks of values that are already of the property type.  Why is it important to do such checks on values that are are just passing through these traps.  When and if somebody actually gets arounds to using one of the elements that are returned as a property key they will be automatically coerced to a valid value.  Why is it important that it happens any sooner than that.
>>> 
>>> I don't know if you are familiar with the work of Felleisen et al. on higher-order contracts. In that work, they use the notion of "blame" between different components/modules. Basically: if some module A receives some data from a module B, and B provides "wrong" data, then B should be assigned blame. You don't want to end up in a situation where A receives the blame at the point where it passes that "wrong" data into another module C.
> 
> Yes, but ES is rampant with this sort of potentially misplaced blame.  We can debate whether such proper "blame" assignment is important or not, but I do believe this sort of very low level MOP interface is a situation where you want to absolutely minimize none essential work.  I'd sooner have it be a little bit more difficult to track now Proxy based bugs then to impact the performance of every correctly implemented proxy in every correct program. BTW this is a general statement about the entire proxy MOP and not just about these particularly property key access traps.
> 
> It's true that most proxies will probably be benign and thus most dynamic invariant checks will succeed. But we should not remove checks just because they succeed 99.9% of the time. By analogy, most arguments passed to the + operator are valid operands, but that doesn't obviate the runtime type-check it must make over and over again (barring optimization). In the case of the + operator a forgotten type-check would lead to a crash or memory-unsafe behavior. In the case of a forgotten invariant check for proxies, the consequences are far less severe, but may lead to inconsistent behavior none the less.
> 
There is a big difference between essential checks that are needed for memory safety and checks that are needed to maintain some of these proxy invariants.  As you say, one will cause system level crashes or unsafe memory accesses.  The other just introduces application level bugs. 

Essential runtime checks are always should be designed to be as inexpensive as possible and for that reason are always very simple.  Complex relationship validation belongs at the application level.  You might do such validation in your sandbox implementation (for example providing your own Object.getOwnPropertyNames that does a copy to a fresh array) but it shouldn't be forced upon applications that don't need it.


>>> 
>>> The "early" type coercions at the exit of the traps can be thought of in these terms: the invariant checks will always blame the proxy handler for producing wrong data, at the point where the wrong data is produced and before it's exposed to client code. Clients are expecting ES5/6 objects to obey an implicit "contract".
> 
> We are defining that contract right now. It would be a mistake to over specify it in ways that may unnecessarily impact performance.
> 
> 
>>>  
>>> getOwnPropertyDescriptor is also on my list to talk to you about for similar reasons.  I don't see why the property descriptor needs to be normalized and regenerated (including converting accessor properties on the descriptor into data properties and requiring the [[Prototype]] to be object.  I have an alternative for [[GetOwnProperty]]/TrapGetOwnPropertyDescriptor that preserves the original object returned from the trap (while normalizing to a descriptor record for any internal calls on ordinary objects).  This preserves any exotic property descriptor object properties as is.  This seems like exactly the right things.  Such exotic property descriptor properties can only be meaningfully used by other proxy traps or application/library code that knows about them.  I don't see any need to normalize them into data properties or otherwise muck with what the actually implementor of the getOwnPropertyDescrptor trap choose to return. 
>>> 
>>> I this alternative implementation is, again, much simpler (and efficient) with fewer externally observable calls in its algorithm steps.  I sure you will want to review it so I will include it in the spec. draft.
>>> 
>>> It's all a matter of how much invariants we care to give up.
> 
> The only invariants we have general agreement regarding enforce are those with respect to the visible properties of frozen/sealed target objects.
> 
> Agreed! Just to be clear, the invariants we were originally discussing were those related to getOwnPropertyNames/keys/enumerate.
> For getOwnPropertyNames, the invariant that I specced was that the trap *must* return all non-configurable properties of the target, and *may not* return any new properties that don't exist on a non-extensible target.
> 
> To me, this invariant is directly related to "the visible properties of frozen/sealed target objects".
>  
> I think enforcement of additional invariant that don't directly related to those specific integrity constrains shouldn't be added.
> 
> Why do you not consider the above invariant related to what it means for a property to be non-configurable or an object to be non-extensible?
> To me, they are directly related.

The issue is that (as currently defined) it imposes what seems like significant algorithmic complexity on applications that don't actually care about that invariant.

The root question that I think I posed is:  Are you sure it actually impacts the integrity of the use cases you have in mind, if this invariant was not checked for keys/getOPN?  After all, you are presumably going to use those names to perform operations at the individual property level on proxies and those individual operations are all doing equivalent integrity checks.

If the real integrity invariants are being maintained at the individual property level, then maybe checking them in keys/getOPN is just an (expensive) security blanket that isn't essential.

>  
> Even without them, the core of the language is still memory safe and interoperability implementable.  From the spec. I can tell you exactly what is supposed to happen for any of these situations.
> 
>>> 
>>> I think it's risky to allow descriptors to be returned whose attributes may be accessors, thus potentially fooling much existing ES5 code that often does simple tests like:
>>> 
>>> if (!desc.writable && !desc.configurable) {
>>>   var val = desc.value; // we can assume the property is immutable, so we can cache its value
>>>   ...
>>> }
>>> 
>>> You can no longer rely on the correctness of such code if desc can't be guaranteed to be a plain old object containing only data properties.
> 
> If you want to enforce this as an invariant (after you've already done the fast check to make sure that the target is an high integrity object) I'm fine with that.  But, please don't but more checks in the fast path where we don't have these integrity requirements.
> 
> Also, in any situation like the above if you really care you can always do a Object.getOwnPropertyDescriptor and verify that you really are dealing with a own data property. Make the client that has the requirement pay the most rather than everybody else who doesn't.
> 
> No, that wouldn't work. If we don't enforce that property descriptors returned from Object.getOwnPropertyDescriptor are normalized, you can imagine that the returned descriptor is *itself* a proxy, which, when asked for *its* own properties, returns yet another proxy that may again lie about the description of its own attributes (we're talking about the attributes of the attributes of a property here).

Yes, I assume that the descriptor object may themselves be proxies.  But aren't the proxies you really care about then ones that you are implementing as part of your sandbox implementation.  Those you have control over.  What does it matter to you if some random application object is implemented by a proxy with n-levels of indirection.  

> 
> This all sounds incredibly theoretical, but the fact of the matter is this: if you don't enforce the language invariants directly on the basic primitives provided by the language (such as Object.getOwnPropertyDescriptor), then the cat is out of the bag and you can no longer reason about integrity. Any further checks you want to make to build up more confidence are themselves made with unreliable primitives, so they provide only a false sense of security.

Hence my above question, is getOPN really one of these basic primitives?

> 
>  
>>>> 
>>>> c) on the other hand is crucial for the non-configurability/non-extensibility checks mentioned below. It's no use checking some invariants on a data structure if that data structure can later still be mutated.
>>> 
>>> I don't see how an extra copy of the the result array makes any difference in this regard  (also see below).   The array returned from Object.getOwnPropertyNames is still mutable even after you make a copy. It's only as stable as any other non-frozen object you are likely to encounter. 
>>> 
>>> The returned copy, even if it is mutable, is fresh, meaning that *no one else* yet has a pointer to it. Hence it can't be modified by the proxy handler.
>>> 
>>> If we don't make a copy, but we do check the result for invariants, here's what may happen:
>>> 1) proxy checks whether the returned array contains all non-configurable properties of the target
>>> 2) let's assume this is the case, so the proxy returns this array to the client
>>> 3) the proxy handler kept a reference to the array, and mutates it, e.g. by removing a non-configurable property
>>> 4) client is now wrongly under the assumption that the returned array contains all non-configurable properties
> 
> and then...
> How is this likely to be exploited?  But if is exploitable, then I'd be content if you only enforce it for the high integrity object situations.
> 
> I can't give you a concrete exploit. I can only make the general observation that sandboxes such as Caja are written entirely in Javascript. I'm not sure this could be done without the language providing foolproof primitives. That's why I think it's important we think carefully about the values returned by traps.
>  
>>>>  
>>>> 3) Every name in the list returned by the trap code is looked up on the target to determine whether or not it exists, even if the target is extensible.   Each of those lookup is observable (the target might itself be a proxy) so, according to the algorithm they all must be performed.
>>>> 
>>>> This is where we get into actual checks required to enforce non-configurability/non-extensibility.
>>>> 
>>>> Granted, the ES5 spec is not clear about the invariants on getOwnPropertyNames and keys. The currently specified invariants are a common-sense extrapolation of the existing invariants to cover these operations.
>>> 
>>> Yes, but if they aren't essential then we have to take into account the cost of doing non-essential invariant checks.  These costs are both the actual runtime cost of doing the checks, the constraints upon implementation imposed by the need for complete and properly sequenced occurrence of all potentially externally observable algorithm steps, and finally (and least important) the added complexity of the specification which makes it harder to actually achieve interoperable implementations.
>>> 
>>> I agree about the cost. The question is whether we regard these invariants as essential or not. To me, they are the common-sense invariants an ES5 programmer would expect.
> 
> They may be common sense but not necessarily essential.  There is a cost to all such enforcement.
> 
> As I argued earlier, you need at least 1 reliable primitive that is guaranteed to give you all the property names of a proxy that pretends to be frozen. Otherwise you can't reliably walk deep-frozen object graphs.
> 
> I'm fine with Object.getOwnPropertyNames being that reliable primitive, and perhaps relaxing the invariant on Object.keys and the for-in loop (the enumerate trap).
>>>  
>>>  
>>>> In practice, it determines the degree of confidence that a programmer can have in Object.getOwnPropertyNames and friends when dealing with a frozen object. If we waive these invariant checks, then the result of those operations can never be trusted on to reliably introspect on a frozen object's list of property names:
>>>> 
>>>> Object.isFrozen(proxy) // true
>>>> Object.getOwnPropertyNames(proxy) // ['foo']
>>>> Object.getOwnPropertyNames(proxy) // [ ]
>>>> 
>>>> Here, the 'foo' property apparently disappeared on a frozen object.
>>> 
>>> Yes, but copying the result an extra time doesn't ensure that two separate calls produce the same result. 
>>> 
>>> It's not the copying alone that guarantees it, it's the invariant check that 'foo' is in the result + the copying that guarantees it.
>>>  
>>>> 
>>>> If neither the for-in loop nor Object.getOwnPropertyNames nor Object.keys can reliably report an object's own properties, then we've made it impossible to reliably traverse and inspect a presumably deep-frozen object graph.
>>> 
>>> If that is the goal, that maybe what you need is a new operation that cannot be trapped.  Your algorithm is essentially doing all the work, and more, necessary to generate an accurate report of the actual non-configurable properties of the target.  If that is what is needed for your deep frozen traversals, add that operation.  That seems better than allowing the Proxy trap to do the work to report anything it wants, but then coming in after the fact and checking whether what was report matches what you demand.  In this case, it seems preferable to just provide the results you demand without any handler intervention. 
>>> 
>>> This seems to harken back to the original proxy design that essentially turned frozen objects into regular objects with no handlers. If that is what you need, you should provide access to the results you need for the situations where you need them.
>>> 
>>> But you should not be adding this overhead for Proxy uses that don't have those requirements.
>>> 
>>> The problem with, say, a non-trappable [[GetNonConfigurableOwnProperties]] internal method, is that membranes can't adapt. We were previously in this situation with the [[Extensible]] internal property. Eventually we ended up trapping Object.isExtensible and Object.preventExtensions, even though these traps can't actually return a result that is different from applying the operation directly to the target. They're still useful to trap, as the proxy may then respond to these operations and change the state of its own target before returning a result.
>>> 
>>> I'm pretty sure introducing a new non-trappable method is going to lead to the same problems for membranes.
> 
> Then have the trap but always ignore its results and call a non-trapping operation on  the target object to get the actual list of names.  You integrity check are all about non-configurable properties. Normal, non-high integrity proxies want integrity checking algorithms that are proportional to the number of non-configurable properties (usually 0) rather than the total number of properties.
> 
> Details matter in this case. Let's go over them. The invariant for getOwnPropertyNames consists of three parts:
> 1) check whether the trap result contains *all* non-configurable target properties
> 2) if the target is non-extensible, check whether the trap result contains *no* new properties beyond those defined on the target
> 3) if the target is non-extensible, check whether *all* existing target properties are in the result (if not, a proxy may make it appear that a property was "deleted" on a non-extensible object, which is otherwhise impossible)
> 
> Part #1 is proportional to the number of non-configurable target properties (call this number N)
not P?  at least in the generic case you have to get the property descriptor of every property in order to determine which are non-configurable. But I guess it doesn't change the answer
> Part #2 is proportional to the number of properties returned in the trap result (call this M)
> Part #3 is proportional to the total number of target properties (whether they be configurable or not) (call this P)
> 
> For a non-extensible target, the total cost is O(N + M + P), and since N <= P, the cost is O(M + P). In other words, the "optimization" of using a [[GetNonConfigurableOwnProperties]] method in Part #1 is not very effective, as Part #3 must visit all own properties anyway.
> 
> For an extensible target, I agree [[GetNonConfigurableOwnProperties]] would introduce a payoff, reducing the cost of the check from O(P) to O(N).

It still appears to me that for Sealed/Frozen objects, which I believe are the ones you actually care about, you are enforcing that the list returned is exactly the own properties of the ultimate ordinary target object (perhaps with multiple levels of proxy indirection).  In otherwords, the result cannot be meaningfully changed.  In that case it doesn't need to be trap.

Allen


> 
> Cheers,
> Tom

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20121121/62635365/attachment-0001.html>


More information about the es-discuss mailing list