Immediate closing of iterators

Brendan Eich brendan at mozilla.org
Thu Dec 21 16:00:37 PST 2006


On Dec 21, 2006, at 2:43 PM, Chris Hansen wrote:

>> >   Also, a generator doesn't have an iterator::get
>> > method since that would complicate the question of who "owns" it.
>>
>> This is not a problem in the ES4 proposal, or in Python.  Ownership
>> of storage and close are coupled only to guarantee that close happens
>> eventually, even if the client code fails to call gen.close()
>> explicitly.  More below.
>
> By the "owner" of a generator I meant the loop responsible for closing
> it, not the storage owner.

I see.  That makes sense in the context of a loop or comprehension,  
but generators are used otherwise. The "Motivation" lead-in from PEP  
342 (http://www.python.org/dev/peps/pep-0342/) says:

[PEP 255 generators, which lacked close] do not allow execution to be  
paused within the "try" portion of try/finally blocks, and therefore  
make it difficult for an aborted coroutine to clean up after itself.

Automating close only for generators iterated by for-in loops leaves  
non-loop use cases that nevertheless need to clean up after  
themselves out in the cold.
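
To make the loop versus non-loop distinction concrete, here is a minimal sketch. It assumes ES2015 generator syntax (function*, and return() standing in for the close() method discussed in this thread), which is not the ES4 proposal or the Firefox 2 implementation; the names lines and log are hypothetical.

```javascript
// Hypothetical stand-in for a generator that holds a resource.
const log = [];

function* lines() {
  try {
    yield "first";
    yield "second";
  } finally {
    // Cleanup that should run however the consumer stops.
    log.push("cleanup");
  }
}

// Case 1: a loop "owns" the generator and closes it on early exit.
for (const line of lines()) {
  break; // for-of calls the generator's return(), so finally runs
}
const afterLoop = log.length;

// Case 2: manual iteration; no loop is around to close the generator.
const it = lines();
it.next();                      // now paused inside the try block
const beforeClose = log.length; // unchanged: finally has not run yet

it.return();                    // explicit close, analogous to gen.close()
const afterClose = log.length;  // finally has now run a second time
```

In the second case, if the caller never closes the generator, only a GC-driven close (or nothing at all) is left to run the finally clause, which is the gap under discussion.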

>   But maybe there should be an iterator::get
> method on generators

I think we should stick to the Pythonic rule that iterators return  
themselves from their iterator::get method (Python calls it __iter__  
but we can avoid __ugly__ names with less::ugly namespacing).
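
A small sketch of that rule, assuming ES2015 syntax, where Symbol.iterator plays the role of iterator::get (or Python's __iter__). The generator count is hypothetical.

```javascript
function* count() {
  yield 1;
  yield 2;
}

const gen = count();

// Asking a generator for its iterator hands back the generator itself.
const sameObject = gen[Symbol.iterator]() === gen;

// Because of that, a later consumer resumes where the last one stopped.
const first = gen.next().value; // consumes 1
const rest = [...gen];          // spread gets gen back and sees what is left
```

Since every consumer shares the one underlying object, whichever consumer closes it closes it for all of them, which is why the question of who is responsible for closing comes up at all.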

> -- people just have to be aware that loops close
> generators, which might be confusing.

Yes.  A Python hacker moving to ES4 might be outraged by any  
deviation, but we've already done away with GeneratorExit in favor of  
a forced return (which PEP 325 suggested as one of two approaches,  
not the one PEP 342 chose; python-dev interactions have led to both  
PEP 342 authors favoring the forced-return approach in a future  
version of Python).  I've argued that we should avoid gratuitous  
differences with Python, given that we are specifying Python-like  
generators and not threads, call/cc, or general coroutines for ES4.  
But clearly we can afford to diverge, or even try to anticipate, at  
the boundary cases.

OTOH, it sounds like C# (and perhaps IronPython?  How would it do  
otherwise given the single GC ruling the CLR?) may set the "close on  
loop exit" expectations for another segment of hackers.

We can document this and stick to it, my gut says.

>> > That only "solves" Jeff's problem by disallowing it (ta-daa! ;-) but
>> > it does away with the need for any kind of finalization, prompt or
>> > not.  In my experience (from java) GC finalization is something you
>> > want to steer well clear of.
>>
>> Finalization is definitely two-phase in systems that have to support
>> close (which might resurrect the generator) and then release its
>> storage.  Those of us burdened with GC-based memory management for
>> ES/JS/AS implementations have to dance with the GC here.
>
> I see, you would still need finalization to guarantee that generators
> not created by loops are eventually closed.  But is that a guarantee
> you actually need to make -- especially if it complicates the
> implementation and potentially opens the browser up for a new type of
> DOS attacks?

As Gosling remarked over a decade ago, "DOS attacks are a dime a  
dozen" (my paraphrase ;-).

The DOS hazard is like setTimeout, and we've dealt with it for  
Firefox 2.  We don't bother closing any generator-iterator whose  
static scope parent (the window in whose scope the generator was  
constructed) is unmarked after the GC's mark phase.  In this case,  
both the generator and its window object are gone (just not yet  
finalized).  If you reparent a generator from window A to window B,  
by creating it in a script loaded in A, storing it in B, unloading or  
closing window A, and then unloading or closing B without manually  
calling the generator's close, again the system won't call close for  
you.

Any long-lived generator whose window affinity changes will have to  
be manually closed.

But consider the general case, outside of the cramped world of web  
page scripts: XUL apps and extensions, or other SpiderMonkey  
embeddings, will use generators in various ways, and not be subject  
to DOS attacks.  These embeddings expose stateful APIs to trusted  
code, and failing to call close may leak an OS resource or fail to  
synchronize important state.

>   C# doesn't guarantee this.  In java, even though you're
> guaranteed that finalizers will be run, they advise people not to rely
> on them.  Instead, it could just be part of the contract on Generator:
> if you create it, you have to close it (unless you know that close is
> a no-op).  It's not something that people are likely to do often and I
> think they will close explicitly anyway rather than rely on
> finalization, which adds a source of nondeterminism to a program.
> Especially if the browser might actually cancel close ops.

Agreed that counting on finalization for timely cleanup is unwise.

However, the try/finally issue still makes us want to follow Python,  
for general purposes (not just browser purposes).  Lacking a DOS  
threat, and assuming the generator is written correctly, we convinced  
ourselves this summer that finally clauses should run from close  
after the last yield in a try has returned a value, and the caller or  
the GC is done with the generator.  If the caller forgets to close,  
finally still should run (says the ES3 spec in all of the cases it  
defines).

You could argue that breaking finally to avoid DOS attacks means no  
one can count on it in a generator that yields from its try block,  
but I think that overstates the case.

We tried that out on some Pythonistas with an earlier variation:  
outer finally should run even if the generator yields when called  
from close.  The response was "the generator's broken, it should get  
an exception immediately rather than a chance to run outer finally  
blocks."  Python is not totally consistent here, since an explicit  
close that yields will fail, aborting the close, and the GC will  
retry, so the misbehaving generator will get another chance to run.   
In Firefox 2, a bad generator of the form

function badsanta() {
   try {
     yield "rock";
   } finally {
     try {
       yield "barf";
     } finally {
       print("closing");
     }
   }
}
var it = badsanta();
it.next();
it.close();

will throw from the 'yield "barf"', and print "closing" from the  
inner finally.  We say if you're bad, you still get finally  
guarantees. :-)
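
For comparison only, here is the same generator in ES2015 syntax, which postdates this message and is an assumption of this sketch rather than the ES4 proposal. Under ES2015 semantics a yield reached while closing does not throw: return() hands the yielded value back to the caller, and a second return() is needed before the inner finally runs. The log array replaces print and is hypothetical.

```javascript
const log = [];

function* badsanta() {
  try {
    yield "rock";
  } finally {
    try {
      yield "barf"; // yields while the generator is being closed
    } finally {
      log.push("closing");
    }
  }
}

const it = badsanta();
const first = it.next();      // { value: "rock", done: false }

// First close attempt: the outer finally runs and is suspended at "barf".
const second = it.return();   // { value: "barf", done: false }
const logAfterFirstClose = log.slice(); // inner finally has not run yet

// Second close attempt: the inner finally runs and the generator finishes.
const third = it.return();    // { value: undefined, done: true }
```

Either way the inner finally does eventually run for a caller that keeps closing; the designs differ in whether the misbehaving yield is reported as an error or passed through.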

You're right that "timely release" means "don't count on the GC", and  
we've seen bad embeddings of SpiderMonkey go down the path that Java  
warns against, time after time.  We're not trying to facilitate such  
bad programming.  But we do think finally should work in generators  
even when the user forgets to close, in the absence of DOS threats.

> Also, having finalization will mandate a non-conservative GC.

Why?

/be



