Direct proxies update

Allen Wirfs-Brock allen at wirfs-brock.com
Tue Nov 29 21:56:53 PST 2011


On Nov 30, 2011, at 10:24 AM, David Bruant wrote:

> On 29/11/2011 23:07, Allen Wirfs-Brock wrote:
>> 
>> ...
>> Objects serve as one of our primary abstraction mechanisms (the other is functions, and function closures have similar allocation issues). Anytime you tell programmers not to allocate, you take away their ability to use abstraction to deal with complexity.
> I agree with you with some restrictions.
> - For a native API, the cost of function closure is null (since the function does not need a scope to capture variables)
> - Objects are an interesting abstraction as long as they have a state.
> For the specific example of the Reflection API, the stateless API that Tom started seems to prove that a reflection API does not need a state. In that case, why bother allocating objects?

The state is explicitly passed as arguments.  Most important is the first argument, which identifies the object.  The client must keep track of this state and explicitly associate it with each call.  Clients have been known to make mistakes and pass the wrong object to such methods.  One of the things that an object-based API does is make the association between that state and the functions implicit, by encapsulating the state and the functions together as an object and automatically associating them during method calls.  This makes it easy for clients to do things that are hard given the other approach.  For example, it allows a client to be written that is capable of transparently dealing with different implementations of a common API.  In an earlier message I described the example of an "inspector" client that is able to display information about objects without knowing where or how the object is implemented.  A different reason for using objects in a reflection API is so you can easily attenuate authority.  For example, for many clients it may be sufficient to provide them with non-mutating mirrors that only allow inspection.  You do this by excluding all mutation methods from the mirror objects.
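
As a rough illustration of the contrast described above, here is a minimal sketch. The names `reflectGet`, `reflectSet`, `makeMirror`, `makeReadOnlyMirror`, and `inspect` are hypothetical and are not part of any proposed API; they only show the shape of the two styles and of an attenuated mirror.

```javascript
// Stateless style: the caller must pass the right target on every call.
function reflectGet(target, name) { return target[name]; }
function reflectSet(target, name, value) { target[name] = value; }

// Object-based style: the mirror captures the target once.
function makeMirror(target) {
  return {
    get(name) { return target[name]; },
    set(name, value) { target[name] = value; },
  };
}

// Attenuated authority: inspection only, the mutation method is simply absent.
function makeReadOnlyMirror(target) {
  return {
    get(name) { return target[name]; },
  };
}

// A client written against the mirror interface; it does not know
// (or care) how the mirror it receives is implemented.
function inspect(mirror, names) {
  return names.map((name) => name + "=" + mirror.get(name));
}

const point = { x: 1, y: 2 };
const full = makeMirror(point);
const readOnly = makeReadOnlyMirror(point);

full.set("x", 10);                             // allowed through the full mirror
const report = inspect(readOnly, ["x", "y"]);  // works with either mirror
```

The `inspect` client would work unchanged with any other object that offers a `get` method, which is the kind of transparency the stateless functions do not give you.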

> That's the same reason why math functions are properties of the Math object and not "math objects".

Which works fine as long as you only have one kind of number.  But if you add multiple numeric data types then you are either going to have to have additional Math objects (ArbitraryPrecisionMath, DecimalFloatMath, etc), have generic functions (a dual of objects), or turn them into methods.
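
A small sketch of the "turn them into methods" option, under the assumption of two made-up numeric kinds. `Float` and `Fraction` are hypothetical classes invented for this example, not existing or proposed built-ins.

```javascript
// Two hypothetical numeric kinds, each carrying its own abs() method.
class Float {
  constructor(value) { this.value = value; }
  abs() { return new Float(Math.abs(this.value)); }
  toString() { return String(this.value); }
}

class Fraction {
  constructor(num, den) { this.num = num; this.den = den; }
  abs() { return new Fraction(Math.abs(this.num), Math.abs(this.den)); }
  toString() { return this.num + "/" + this.den; }
}

// The client never has to pick between a FloatMath.abs and a
// FractionMath.abs; the method call dispatches on the receiver.
function magnitudes(numbers) {
  return numbers.map((n) => n.abs().toString());
}

const result = magnitudes([new Float(-1.5), new Fraction(-1, 3)]);
```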

> However, having an object-oriented DOM makes a lot of sense to me since objects have a state (children, node type, etc.). I'm not sure we could easily and conveniently turn the DOM into a set of stateless functions.

The same way you do it in C or Pascal or assembly languages.  You have state (often structs) and functions and try to make sure you always call the appropriate functions with the right kind of state.  That's what objects do for you.  They automate the necessary housekeeping.
> 
>> A good GC should (and can) make allocation and reclamation of highly ephemeral objects so cheap that developers simply shouldn't worry about it.
> I agree on the reclamation part, but I don't understand what a GC can do about allocation of ephemeral (or not) objects.

A good bump allocator simply has a linear memory area where objects are allocated by "bumping" a pointer to the next available slot.  If you need to allocate a three-slot object you just increment the allocation pointer by (3+h)*slotSize, fill in the object slots, and finally compare against an upper bound.  This is actually quite similar to how local variables are allocated on the stack.  h is the number of overhead slots needed to form an "object header" so the slots can be processed as an object.  Header size is dependent upon trade-offs in the overall design.  2 is a pretty good value, 1 is possible, and 3 or more suggests that there may be room to tighten up the design.  For JS, you have to assume that you are on a code path that is hot enough that the implementation has actually been able to assign a "shape" to the object that is being allocated (in this case, it knows that it has 3 slots, etc.).  (If you aren't on such a hot path, why do you care?)
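
The bump allocation described above can be sketched as a toy simulation. This is not how any particular engine does it: the zone is a plain array, sizes are counted in slots rather than bytes (slotSize is 1), the bound check happens before the slots are written, and the constants `HEADER_SLOTS` and `ZONE_SLOTS` and the header layout are assumptions chosen for illustration.

```javascript
// Hypothetical parameters, chosen only for illustration.
const HEADER_SLOTS = 2;   // "h" in the text
const ZONE_SLOTS = 16;    // size of the allocation zone, in slots

const zone = new Array(ZONE_SLOTS).fill(0);
let allocPointer = 0;     // index of the next free slot

// Allocate an object with `slotCount` slots. Returns its start index,
// or -1 when the zone is full (where a real system would trigger a GC).
function allocate(slotCount, shapeId) {
  const start = allocPointer;
  const next = start + HEADER_SLOTS + slotCount;  // bump by (slots + h)
  if (next > ZONE_SLOTS) return -1;               // compare against the upper bound
  allocPointer = next;
  zone[start] = shapeId;                          // header slot 1: shape
  zone[start + 1] = slotCount;                    // header slot 2: size
  return start;
}

const a = allocate(3, 7);  // occupies slots 0..4
const b = allocate(3, 7);  // occupies slots 5..9
const c = allocate(3, 7);  // occupies slots 10..14
const d = allocate(3, 7);  // would need slots 15..19: does not fit
```

Each successful allocation is one addition, one comparison, and a couple of stores, which is why it is comparable in cost to setting up local variables.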

> 
>> This is not to say that there are no situations where excessive allocations may cause performance issues, but such situations should be outliers that only need to be dealt with when they are actually identified as being a bottleneck.  To oversimplify: a good bump allocator makes object creation nearly as efficient as assigning to local variables, and a good multi-generation ephemeral collector has a GC cost that is proportional to the number of retained objects, not the number of allocated objects. Objects that are created and discarded within the span of a single ephemeral collection cycle should have a very low cost.  This has all been demonstrated in high-performance memory managers for Smalltalk and Lisp.
> If a garbage collection is triggered when a generation is full, then your GC cost remains proportional to your number of allocations.

Typically, an ephemeral GC would be triggered when the bump pointer exceeds the limit (perhaps after doing so, and switching to a new allocation zone, several times).

However, GC cost isn't usually proportional to the number of allocations.  Programs typically reach a steady state where the number of ephemeral objects that survive stabilizes at some level (actually, most programs shift over time between several steady-state phases).  When a program is in such a steady state, once you exceed a base threshold, changing the frequency of GC doesn't really change how many ephemeral objects will survive a collection.  The execution time of a copying collector is proportional to the number of surviving objects (garbage objects are just left behind, untouched).  So the size of the allocation zone determines how frequently a GC is done, but the actual cost of a GC is some fixed overhead to enter/leave the GC plus the cost of scavenging the surviving objects.  Bigger allocation zones mean less total GC overhead, but individual GCs cost about the same, no matter how frequently they are performed or how many objects are allocated between them.
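
The steady-state argument above can be illustrated with a toy model. It only counts copies; it is not a real collector. The function `simulate` and its parameters are hypothetical, and the model assumes that exactly the `liveCount` most recently allocated objects are reachable at each collection.

```javascript
// Count how many objects each scavenge copies when a fixed number of
// objects stay reachable (the "steady state"). Assumes liveCount >= 1.
function simulate(zoneSize, totalAllocations, liveCount) {
  let zone = [];
  let collections = 0;
  let copied = 0;
  for (let i = 0; i < totalAllocations; i++) {
    if (zone.length === zoneSize) {
      // Scavenge: copy only the survivors into a fresh zone;
      // garbage is left behind untouched.
      zone = zone.slice(-liveCount);
      copied += zone.length;
      collections++;
    }
    zone.push({ id: i });
  }
  return { collections, copied, perCollection: copied / collections };
}

const small = simulate(100, 10000, 5);   // small zone: many collections
const large = simulate(1000, 10000, 5);  // large zone: few collections
```

In this model both runs copy the same number of objects per collection; the larger zone simply collects less often, so its total copying work is lower.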

> 
> If a garbage collection is triggered at constant intervals, then it probably runs for nothing (or too few) too often.
> 
>>> I don't know what the exact status of implementations is, but what happens in current JS engines when the expression '[].forEach.call' is met? Is the allocation of an array actually performed? Hopefully not, I would not be surprised if it was.
>> 
>> I suspect they don't optimize this although arguably they should. However, if you buy my argument then it really doesn't make much difference.  Implementations should put the effort into building better GCs.
> For this particular case where the object is not ephemeral, but completely useless, a GC will still cost you something (even if very small), while static analysis can tell you to not allocate at all. I'm not talking about a smaller cost of allocation+discard, but nullifying it with a constant (and small) time of static analysis.
> -----
> var a = [1];
> function f(e, i){a[i] = Math.random();}
> 
> while(true){
>     [].forEach.call(a, f);
> }
> -----
> Without static analysis, the first array is allocated and this will run the GC. With static analysis, the GC has no reason to run: the first array does not need to be allocated since its reference is never used anywhere after the retrieval of forEach (which is looked up directly on Array.prototype if the implementation is conformant to ES5.1).

So, lift the [].forEach out of the loop.  Ideally, implementations will do this for you.  But I don't see how this advances any useful discussion about the utility of objects.  In fact, this loop, with a good GC, should have very fast GCs when they are triggered.  This is because it isn't allocating anything that remains alive beyond a single iteration of the loop.  When the allocation zone fills up, the GC starts up, traces roots, finds only a single object that needs to survive that cycle, copies it, and resets.
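
A sketch of the lifted version of the loop quoted above. The loop is bounded here (the count of 1000 is arbitrary) so that the example terminates; the original runs forever.

```javascript
const a = [1];
function f(e, i) { a[i] = Math.random(); }

// Look the method up once, outside the loop, so that no temporary
// array is created on each iteration.
const forEach = Array.prototype.forEach;

for (let n = 0; n < 1000; n++) {
  forEach.call(a, f);
}
```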
> 
> 
> I'll take actual garbage as a metaphor: I am pro recycling (garbage collection), but to recycling I prefer avoiding buying things with excessive packaging. This way I produce less garbage (less allocation). Maybe we should apply the basics of ecology to memory management? ;-)

You also have to trade off the runtime cost of doing the data collection and analysis needed to enable you to eliminate the allocation.  It isn't clear that it will always be cheaper than just letting a good GC do its job.
> 
> I agree with you that abstractions are a good thing and I won't compromise them if they are necessary. But it should not be an excuse to allocate for no reason, even if it's cheap. And while garbage collection should be improved, if we can find cheap ways to allocate less (at the engine or programmer level), we should apply them.

The starting point of this discussion is that I contend that there are good reasons to want to abstract over reflection functions using object-based mirrors.  The objects serve a useful purpose.

Allen


