possible excessive proxy invariants for Object.keys/etc??

Tom Van Cutsem tomvc.be at gmail.com
Thu Nov 22 05:36:53 PST 2012

2012/11/21 Allen Wirfs-Brock <allen at wirfs-brock.com>

> On Nov 21, 2012, at 12:09 PM, Tom Van Cutsem wrote:
> Let's first discuss whether there is an issue with just letting (unique)
> symbols show up in Object.{keys,getOwnPropertyNames}.
> It's still a change in return type (from Array[String] to
> Array[String|Symbol]), but a far less innocuous one than changing
> Array[String] into "Any".
> Wow, I actually think that the Array[String|Symbol] change is far more
> likely to have actual compatibility impact.  I can imagine various
> scenarios where is program or library is doing string processing on the
> elements returned from keys/getOPN.  That seems far more likely, then that
> they have dependencies upon the returned collection being Array.isArray
> rather than just Array-like.

If we don't do any invariant checks, the trap isn't even obliged to return
an array-like. Code that expects an array-like is likely to immediately
crash if this happens though, so I'm not sure if that's a real issue.

>>> c) to ensure the stability of the result.
>>> You can think of a + b as implementing a type coercion of the trap
>>> result to "Array of String". This coercion is not too dissimilar from what
>>> the getOwnPropertyDescriptor has to do (normalization of the returned
>>> property descriptor by creating a fresh copy).
>>> Yes, premature type coercion, in my opinion. Also, a classic performance
>>> mistakes made by dynamic language programmers:  unnecessary coercion of
>>> values that are never going to be access or redundant coercion checks of
>>> values that are already of the property type.  Why is it important to do
>>> such checks on values that are are just passing through these traps.  When
>>> and if somebody actually gets arounds to using one of the elements that are
>>> returned as a property key they will be automatically coerced to a valid
>>> value.  Why is it important that it happens any sooner than that.
>> I don't know if you are familiar with the work of Felleisen et al. on
>> higher-order contracts. In that work, they use the notion of "blame"
>> between different components/modules. Basically: if some module A receives
>> some data from a module B, and B provides "wrong" data, then B should be
>> assigned blame. You don't want to end up in a situation where A receives
>> the blame at the point where it passes that "wrong" data into another
>> module C.
>> Yes, but ES is rampant with this sort of potentially misplaced blame.  We
>> can debate whether such proper "blame" assignment is important or not, but
>> I do believe this sort of very low level MOP interface is a situation where
>> you want to absolutely minimize none essential work.  I'd sooner have it be
>> a little bit more difficult to track now Proxy based bugs then to impact
>> the performance of every correctly implemented proxy in every correct
>> program. BTW this is a general statement about the entire proxy MOP and not
>> just about these particularly property key access traps.
> It's true that most proxies will probably be benign and thus most dynamic
> invariant checks will succeed. But we should not remove checks just because
> they succeed 99.9% of the time. By analogy, most arguments passed to the +
> operator are valid operands, but that doesn't obviate the runtime
> type-check it must make over and over again (barring optimization). In the
> case of the + operator a forgotten type-check would lead to a crash or
> memory-unsafe behavior. In the case of a forgotten invariant check for
> proxies, the consequences are far less severe, but may lead to inconsistent
> behavior none the less.
> There is a big difference between essential checks that are needed for
> memory safety and checks that are needed to maintain some of these proxy
> invariants.  As you say, one will cause system level crashes or unsafe
> memory accesses.  The other just introduces application level bugs.
> Essential runtime checks are always should be designed to be as
> inexpensive as possible and for that reason are always very simple.
>  Complex relationship validation belongs at the application level.  You
> might do such validation in your sandbox implementation (for example
> providing your own Object.getOwnPropertyNames that does a copy to a fresh
> array) but it shouldn't be forced upon applications that don't need it.
>> The "early" type coercions at the exit of the traps can be thought of in
>> these terms: the invariant checks will always blame the proxy handler for
>> producing wrong data, at the point where the wrong data is produced and
>> before it's exposed to client code. Clients are expecting ES5/6 objects to
>> obey an implicit "contract".
>> We are defining that contract right now. It would be a mistake to over
>> specify it in ways that may unnecessarily impact performance.
>>> getOwnPropertyDescriptor is also on my list to talk to you about for
>>> similar reasons.  I don't see why the property descriptor needs to be
>>> normalized and regenerated (including converting accessor properties on the
>>> descriptor into data properties and requiring the [[Prototype]] to be
>>> object.  I have an alternative for
>>> [[GetOwnProperty]]/TrapGetOwnPropertyDescriptor that preserves the original
>>> object returned from the trap (while normalizing to a descriptor record for
>>> any internal calls on ordinary objects).  This preserves any exotic
>>> property descriptor object properties as is.  This seems like exactly the
>>> right things.  Such exotic property descriptor properties can only be
>>> meaningfully used by other proxy traps or application/library code that
>>> knows about them.  I don't see any need to normalize them into data
>>> properties or otherwise muck with what the actually implementor of the
>>> getOwnPropertyDescrptor trap choose to return.
>>> I this alternative implementation is, again, much simpler (and
>>> efficient) with fewer externally observable calls in its algorithm steps.
>>>  I sure you will want to review it so I will include it in the spec. draft.
>> It's all a matter of how much invariants we care to give up.
>> The only invariants we have general agreement regarding enforce are those
>> with respect to the visible properties of frozen/sealed target objects.
> Agreed! Just to be clear, the invariants we were originally discussing
> were those related to getOwnPropertyNames/keys/enumerate.
> For getOwnPropertyNames, the invariant that I specced was that the trap
> *must* return all non-configurable properties of the target, and *may not*
> return any new properties that don't exist on a non-extensible target.
> To me, this invariant is directly related to "the visible properties of
> frozen/sealed target objects".
>> I think enforcement of additional invariant that don't directly related
>> to those specific integrity constrains shouldn't be added.
> Why do you not consider the above invariant related to what it means for a
> property to be non-configurable or an object to be non-extensible?
> To me, they are directly related.
> The issue is that (as currently defined) it imposes what seems like
> significant algorithmic complexity on applications that don't actually care
> about that invariant.
> The root question that I think I posed is:  Are you sure it actually
> impacts the integrity of the use cases you have in mind, if this invariant
> was not checked for keys/getOPN?  After all, you are presumably going to
> use those names to perform operations at the individual property level on
> proxies and those individual operations are all doing equivalent integrity
> checks.
> If the real integrity invariants are being maintained at the individual
> property level, then maybe checking them in keys/getOPN is just an
> (expensive) security blanket that isn't essential.

A concrete example: say I'm implementing a membrane that wants to allow
frozen "records" to pass through the membrane unwrapped.
By "record", I mean an object with only data properties bound to primitives
(strings, numbers, ...). Since individual strings and numbers can pass
through unwrapped, it's not entirely unreasonable to allow compound values
composed of only primitives to be passed through unwrapped as well.

To pass such frozen records, the membrane needs to perform a check to see
whether the frozen object is indeed a record.
To do so, it must be able to iterate over all the properties of the object
and test whether they are indeed data properties bound to primitives.

If the frozen record is a proxy that can lie about its own properties, it
can not report a particular "foo" property that is bound to a mutable
object, thus opening a communications channel piercing the membrane.

>> Even without them, the core of the language is still memory safe and
>> interoperability implementable.  From the spec. I can tell you exactly what
>> is supposed to happen for any of these situations.
>> I think it's risky to allow descriptors to be returned whose attributes
>> may be accessors, thus potentially fooling much existing ES5 code that
>> often does simple tests like:
>> if (!desc.writable && !desc.configurable) {
>>   var val = desc.value; // we can assume the property is immutable, so we
>> can cache its value
>>   ...
>> }
>> You can no longer rely on the correctness of such code if desc can't be
>> guaranteed to be a plain old object containing only data properties.
>> If you want to enforce this as an invariant (after you've already done
>> the fast check to make sure that the target is an high integrity object)
>> I'm fine with that.  But, please don't but more checks in the fast path
>> where we don't have these integrity requirements.
>> Also, in any situation like the above if you really care you can always
>> do a Object.getOwnPropertyDescriptor and verify that you really are dealing
>> with a own data property. Make the client that has the requirement pay the
>> most rather than everybody else who doesn't.
> No, that wouldn't work. If we don't enforce that property descriptors
> returned from Object.getOwnPropertyDescriptor are normalized, you can
> imagine that the returned descriptor is *itself* a proxy, which, when asked
> for *its* own properties, returns yet another proxy that may again lie
> about the description of its own attributes (we're talking about the
> attributes of the attributes of a property here).
> Yes, I assume that the descriptor object may themselves be proxies.  But
> aren't the proxies you really care about then ones that you are
> implementing as part of your sandbox implementation.  Those you have
> control over.  What does it matter to you if some random application object
> is implemented by a proxy with n-levels of indirection.

The proxies I'm worried about are precisely those that an application
doesn't have control over. Those are the ones that may confuse ES5 code
relying on ES5 invariants.

> This all sounds incredibly theoretical, but the fact of the matter is
> this: if you don't enforce the language invariants directly on the basic
> primitives provided by the language (such as
> Object.getOwnPropertyDescriptor), then the cat is out of the bag and you
> can no longer reason about integrity. Any further checks you want to make
> to build up more confidence are themselves made with unreliable primitives,
> so they provide only a false sense of security.
> Hence my above question, is getOPN really one of these basic primitives?

Without at least 1 operation that reliably lists an object's properties,
you can't reliably introspect a frozen object.

>>> c) on the other hand is crucial for the
>>> non-configurability/non-extensibility checks mentioned below. It's no use
>>> checking some invariants on a data structure if that data structure can
>>> later still be mutated.
>>> I don't see how an extra copy of the the result array makes any
>>> difference in this regard  (also see below).   The array returned from
>>> Object.getOwnPropertyNames is still mutable even after you make a copy.
>>> It's only as stable as any other non-frozen object you are likely to
>>> encounter.
>> The returned copy, even if it is mutable, is fresh, meaning that *no one
>> else* yet has a pointer to it. Hence it can't be modified by the proxy
>> handler.
>> If we don't make a copy, but we do check the result for invariants,
>> here's what may happen:
>> 1) proxy checks whether the returned array contains all non-configurable
>> properties of the target
>> 2) let's assume this is the case, so the proxy returns this array to the
>> client
>> 3) the proxy handler kept a reference to the array, and mutates it, e.g.
>> by removing a non-configurable property
>> 4) client is now wrongly under the assumption that the returned array
>> contains all non-configurable properties
>> and then...
>> How is this likely to be exploited?  But if is exploitable, then I'd be
>> content if you only enforce it for the high integrity object situations.
> I can't give you a concrete exploit. I can only make the general
> observation that sandboxes such as Caja are written entirely in Javascript.
> I'm not sure this could be done without the language providing foolproof
> primitives. That's why I think it's important we think carefully about the
> values returned by traps.
>>>> 3) Every name in the list returned by the trap code is looked up on the
>>>> target to determine whether or not it exists, even if the target is
>>>> extensible.   Each of those lookup is observable (the target might itself
>>>> be a proxy) so, according to the algorithm they all must be performed.
>>> This is where we get into actual checks required to enforce
>>> non-configurability/non-extensibility.
>>> Granted, the ES5 spec is not clear about the invariants on
>>> getOwnPropertyNames and keys. The currently specified invariants are a
>>> common-sense extrapolation of the existing invariants to cover these
>>> operations.
>>> Yes, but if they aren't essential then we have to take into account the
>>> cost of doing non-essential invariant checks.  These costs are both the
>>> actual runtime cost of doing the checks, the constraints upon
>>> implementation imposed by the need for complete and properly sequenced
>>> occurrence of all potentially externally observable algorithm steps, and
>>> finally (and least important) the added complexity of the specification
>>> which makes it harder to actually achieve interoperable implementations.
>> I agree about the cost. The question is whether we regard these
>> invariants as essential or not. To me, they are the common-sense invariants
>> an ES5 programmer would expect.
>> They may be common sense but not necessarily essential.  There is a cost
>> to all such enforcement.
> As I argued earlier, you need at least 1 reliable primitive that is
> guaranteed to give you all the property names of a proxy that pretends to
> be frozen. Otherwise you can't reliably walk deep-frozen object graphs.
> I'm fine with Object.getOwnPropertyNames being that reliable primitive,
> and perhaps relaxing the invariant on Object.keys and the for-in loop (the
> enumerate trap).
>>> In practice, it determines the degree of confidence that a programmer
>>> can have in Object.getOwnPropertyNames and friends when dealing with a
>>> frozen object. If we waive these invariant checks, then the result of those
>>> operations can never be trusted on to reliably introspect on a frozen
>>> object's list of property names:
>>> Object.isFrozen(proxy) // true
>>> Object.getOwnPropertyNames(proxy) // ['foo']
>>> Object.getOwnPropertyNames(proxy) // [ ]
>>> Here, the 'foo' property apparently disappeared on a frozen object.
>>> Yes, but copying the result an extra time doesn't ensure that two
>>> separate calls produce the same result.
>> It's not the copying alone that guarantees it, it's the invariant check
>> that 'foo' is in the result + the copying that guarantees it.
>>> If neither the for-in loop nor Object.getOwnPropertyNames nor
>>> Object.keys can reliably report an object's own properties, then we've made
>>> it impossible to reliably traverse and inspect a presumably deep-frozen
>>> object graph.
>>> If that is the goal, that maybe what you need is a new operation that
>>> cannot be trapped.  Your algorithm is essentially doing all the work, and
>>> more, necessary to generate an accurate report of the actual
>>> non-configurable properties of the target.  If that is what is needed for
>>> your deep frozen traversals, add that operation.  That seems better than
>>> allowing the Proxy trap to do the work to report anything it wants, but
>>> then coming in after the fact and checking whether what was report matches
>>> what you demand.  In this case, it seems preferable to just provide the
>>> results you demand without any handler intervention.
>>> This seems to harken back to the original proxy design that essentially
>>> turned frozen objects into regular objects with no handlers. If that is
>>> what you need, you should provide access to the results you need for the
>>> situations where you need them.
>>> But you should not be adding this overhead for Proxy uses that don't
>>> have those requirements.
>> The problem with, say, a non-trappable
>> [[GetNonConfigurableOwnProperties]] internal method, is that membranes
>> can't adapt. We were previously in this situation with the [[Extensible]]
>> internal property. Eventually we ended up trapping Object.isExtensible and
>> Object.preventExtensions, even though these traps can't actually return a
>> result that is different from applying the operation directly to the
>> target. They're still useful to trap, as the proxy may then respond to
>> these operations and change the state of its own target before returning a
>> result.
>> I'm pretty sure introducing a new non-trappable method is going to lead
>> to the same problems for membranes.
>> Then have the trap but always ignore its results and call a non-trapping
>> operation on  the target object to get the actual list of names.  You
>> integrity check are all about non-configurable properties. Normal, non-high
>> integrity proxies want integrity checking algorithms that are proportional
>> to the number of non-configurable properties (usually 0) rather than the
>> total number of properties.
> Details matter in this case. Let's go over them. The invariant for
> getOwnPropertyNames consists of three parts:
> 1) check whether the trap result contains *all* non-configurable target
> properties
> 2) if the target is non-extensible, check whether the trap result contains
> *no* new properties beyond those defined on the target
> 3) if the target is non-extensible, check whether *all* existing target
> properties are in the result (if not, a proxy may make it appear that a
> property was "deleted" on a non-extensible object, which is otherwhise
> impossible)
> Part #1 is proportional to the number of non-configurable target
> properties (call this number N)
> not P?  at least in the generic case you have to get the property
> descriptor of every property in order to determine which are
> non-configurable. But I guess it doesn't change the answer

Yes, sorry, should have been more explicit: with only
Object.getOwnPropertyNames, this is proportional to P. I was operating
under the assumption of having a [[GetOwnNonConfigurableProperties]] method
as you suggested earlier.

> Part #2 is proportional to the number of properties returned in the trap
> result (call this M)
> Part #3 is proportional to the total number of target properties (whether
> they be configurable or not) (call this P)
> For a non-extensible target, the total cost is O(N + M + P), and since N
> <= P, the cost is O(M + P). In other words, the "optimization" of using a
> [[GetNonConfigurableOwnProperties]] method in Part #1 is not very
> effective, as Part #3 must visit all own properties anyway.
> For an extensible target, I agree [[GetNonConfigurableOwnProperties]]
> would introduce a payoff, reducing the cost of the check from O(P) to O(N).
> It still appears to me that for Sealed/Frozen objects, which I believe are
> the ones you actually care about, you are enforcing that the list returned
> is exactly the own properties of the ultimate ordinary target object
> (perhaps with multiple levels of proxy indirection).  In otherwords, the
> result cannot be meaningfully changed.  In that case it doesn't need to be
> trap.

You're right that the result cannot be meaningfully changed for frozen
objects. But it still needs to be a trap for membranes. The issue is again
that the membrane proxy needs to be notified of the operation occurring, so
that it can update its target object before the operation proceeds. Like
Brandon mentioned earlier in this thread, the trap is effectively just a
notification callback at that point.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20121122/78aa123b/attachment-0001.html>

More information about the es-discuss mailing list