Questions/issues regarding generators

Andreas Rossberg rossberg at google.com
Thu Mar 7 07:37:16 PST 2013


We have started investigating the implementation of generators in V8,
and a couple of questions popped up that are not quite clear from the
proposal (and not yet in the draft spec, AFAICS):

1) Are the methods of a generator object installed as frozen
properties? (I hope so, otherwise it would be more difficult to
aggressively optimise generators.)

2) Is yield* supposed to allow arguments that are not native generator objects?

3) What happens if a generator function terminates with an exception?
According to the proposal, nothing special. That implies that the
generator is not closed. What happens when it is resumed afterwards?
Moreover, is a StopIteration exception handled specially in this
context?

4) Nit: can we perhaps rename the generator "send" method to "resume"?
That is so much more intuitive and suggestive, Python precedent
notwithstanding. :)


Apart from these questions, we also see a couple of issues with some
aspects of the proposal. My apologies if the specific points below
have already been made in earlier discussions (I could not find any
mention).


- The generator/iterable/iterator separation is somewhat incoherent.
In particular, it makes no sense for an .iterator method to simply
return 'this', as it does for generators. The implicit contract of the
.iterator method should be that it returns a _fresh_ iterator;
otherwise many abstractions over iterables can't reliably work. As a
simple example, consider:

  // zip : (iterable, iterable) -> iterable
  function zip(iterable1, iterable2) {
    let it1 = iterable1.iterator()
    let it2 = iterable2.iterator()
    let result = []
    try {
      while (true) result.push([it1.next(), it2.next()])
    } catch(e) {
      if (isStopIteration(e)) return result
      throw e
    }
  }

You would expect that for any pair of iterables, zip creates an array
that pairs the values of both. But is a generator object a proper
iterable? No. It has an .iterator method alright, but it does not meet
the aforementioned contract! Consider:

  let rangeAsArray = [1, 2, 3, 4]
  let dup = zip(rangeAsArray, rangeAsArray)  // [[1,1], [2,2], [3,3], [4,4]]

and contrast with:

  function* range(from, to) { for (let i = from; i <= to; ++i) yield i }

  let rangeAsGenerator = range(1, 4)
  let dup = zip(rangeAsGenerator, rangeAsGenerator)  // Oops!

Although a generator supposedly is an iterable, the second zip fails
to produce the desired result: both .iterator() calls return the same
object, so zip pairs up consecutive values and returns garbage like
[[1,2], [3,4]] instead.
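
To make the failure concrete, here is a minimal sketch that models the
behaviour with plain objects rather than real generators. The names
StopIteration (here just a sentinel value), isStopIteration, and
makeRangeObject are stand-ins assumed for illustration, not the
proposal's actual definitions. The only property that matters is that
iterator() returns 'this'.

```javascript
// Hypothetical stand-ins for the proposal's StopIteration machinery.
const StopIteration = {}
function isStopIteration(e) { return e === StopIteration }

// Assumed model of a generator object counting from 'from' to 'to':
// next() throws StopIteration at the end, iterator() returns 'this'.
function makeRangeObject(from, to) {
  let i = from
  return {
    next() {
      if (i > to) throw StopIteration
      return i++
    },
    iterator() { return this }  // same object every time, never a fresh one
  }
}

// zip as written in the post.
function zip(iterable1, iterable2) {
  let it1 = iterable1.iterator()
  let it2 = iterable2.iterator()
  let result = []
  try {
    while (true) result.push([it1.next(), it2.next()])
  } catch (e) {
    if (isStopIteration(e)) return result
    throw e
  }
}

let rangeAsGenerator = makeRangeObject(1, 4)
let dup = zip(rangeAsGenerator, rangeAsGenerator)
console.log(JSON.stringify(dup))  // [[1,2],[3,4]]
```

Because it1 and it2 are the same object, every next() call advances
the one shared counter, which is why the pairs come out as consecutive
values.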

The problem boils down to the question of whether a generator function
should return an iterable or an iterator. The current semantics
(inherited from Python) tries to side-step the question by answering:
"um, both". But as the example demonstrates, that is not a coherent
answer.

The only way to fix this seems to be the following: a call to a
generator function should NOT return a generator object directly.
Rather, it returns a simple iterable, whose iterator method then
constructs an actual generator object -- and multiple calls construct
multiple objects. In the common case of the for-of loop, VMs should
have no problem optimising away the intermediate object. In the
remaining cases, where the result of a generator function is used in a
first-class manner, the object actually ensures the right semantics.
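
The suggested fix can be sketched in the same plain-object style. This
is only a model under assumptions: StopIteration, isStopIteration,
makeRangeObject, and rangeIterable are hypothetical names, and a real
implementation would build the generator object from a suspended
function activation rather than a closure over a counter.

```javascript
// Hypothetical stand-ins for the proposal's StopIteration machinery.
const StopIteration = {}
function isStopIteration(e) { return e === StopIteration }

// Assumed model of an actual generator object: stateful, single-use.
function makeRangeObject(from, to) {
  let i = from
  return {
    next() {
      if (i > to) throw StopIteration
      return i++
    },
    iterator() { return this }
  }
}

// Model of the fix: the "generator function" returns only a simple
// iterable, and each iterator() call constructs a new generator object.
function rangeIterable(from, to) {
  return {
    iterator() { return makeRangeObject(from, to) }
  }
}

// zip as written in the post.
function zip(iterable1, iterable2) {
  let it1 = iterable1.iterator()
  let it2 = iterable2.iterator()
  let result = []
  try {
    while (true) result.push([it1.next(), it2.next()])
  } catch (e) {
    if (isStopIteration(e)) return result
    throw e
  }
}

let rangeAsIterable = rangeIterable(1, 4)
let dup = zip(rangeAsIterable, rangeAsIterable)
console.log(JSON.stringify(dup))  // [[1,1],[2,2],[3,3],[4,4]]
```

With a fresh generator object per iterator() call, zip behaves for
this iterable exactly as it does for the array.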


- Finally, at the risk of annoying Brendan ;), I think we should
(again) revisit the decision to use an exception to mark
end-of-iteration. Besides the usual reservations and the problems
already discussed in earlier threads, it has some rather ugly
implications that I cannot remember being mentioned before:

  * It allows a function _called from_ a generator to fake a regular
"return" _from its caller_ (i.e. the generator):

    function f() { throw StopIteration }

    function* g() { ... f(); ... }

    That's a bug, not a feature. Also, the proposal does not say what
this does to the generator state (see Q3 above).

  * Worse, the semantics as given in the proposal allows _aborting_ a
generator's own return. Not only that, doing this can actually
_revive_ a generator that just got closed:

    function*() {
      ...
      try {
        return;   // closes the generator
      } catch(e) {
        yield 5;  // succeeds!
      }
      ...  // generation can continue regularly after this point
    }

    There can hardly be a question that such a state transition from
'closed' back to 'suspended' should not be possible.

  * Old news: exceptions make it harder to optimise generators,
especially because the compiler cannot generally know all
quasi-regular return points (see above).
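
The first anomaly in this list can be modelled without real
generators. In the sketch below, StopIteration is a hypothetical
sentinel value, and makeIterator and drain are illustrative helpers
(not part of the proposal) that hand-code what a generator and its
consumer would do. The iterator is meant to produce "a", "b", "c", but
a callee throwing StopIteration ends the iteration early, and the
consumer cannot tell the difference.

```javascript
// Hypothetical sentinel standing in for the proposal's StopIteration.
const StopIteration = {}

// A callee that throws the end-of-iteration marker, by bug or by design.
function f() { throw StopIteration }

// Hand-written model of a generator intended to produce "a", "b", "c".
function makeIterator() {
  let step = 0
  return {
    next() {
      step++
      if (step === 1) return "a"
      if (step === 2) { f(); return "b" }  // f() escapes with StopIteration
      if (step === 3) return "c"
      throw StopIteration                  // the intended end of iteration
    }
  }
}

// A consumer in the style of the zip function from the post.
function drain(it) {
  let result = []
  try {
    while (true) result.push(it.next())
  } catch (e) {
    if (e === StopIteration) return result
    throw e
  }
}

let values = drain(makeIterator())
console.log(JSON.stringify(values))  // ["a"]
```

The consumer sees an apparently normal end of iteration after one
value; nothing signals that "b" and "c" were lost.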

In summary, a return statement does not necessarily cause returning,
and returning is not necessarily caused by a return statement. That
drives the whole notion of the return statement ad absurdum, I think
(besides being a pain to implement). The specific points above can
probably be fixed by throwing extra language into the spec, but I think
they should rather be taken as proof that using exceptions is a
questionable path (with potentially more anomalies down the road).

But, in order to (hopefully) let Brendan calm down a bit, I am NOT making
yet another proposal for a two-method protocol. Instead I propose
simply _delivering_ a sentinel object as end-of-iteration marker
instead of _throwing_ one. The zip function above would then be written as:

  function zip(iterable1, iterable2) {
    let it1 = iterable1.iterator()
    let it2 = iterable2.iterator()
    let result = []
    while (true) {
      let x1 = it1.next(), x2 = it2.next()
      if (isStopIteration(x1) || isStopIteration(x2)) return result
      result.push([x1, x2])
    }
  }

AFAICS, this option maintains the advantages of the current approach
while being much better behaved, and we can perfectly well keep
using a StopIteration constructor as in the current proposal. (I fully
expect that this option has been discussed before, but I couldn't find
any related discussion.)
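
For completeness, here is a runnable sketch of the sentinel-delivering
variant, again with plain objects. StopIteration as a frozen sentinel,
isStopIteration, and rangeIterable are assumptions made for
illustration; how the sentinel would actually be represented is left
open above.

```javascript
// Hypothetical sentinel: next() returns it instead of throwing it.
const StopIteration = Object.freeze({})
function isStopIteration(x) { return x === StopIteration }

// Assumed iterable whose iterators deliver the sentinel at the end.
function rangeIterable(from, to) {
  return {
    iterator() {
      let i = from
      return {
        next() { return i <= to ? i++ : StopIteration }
      }
    }
  }
}

// zip as rewritten in the post: no try/catch is needed.
function zip(iterable1, iterable2) {
  let it1 = iterable1.iterator()
  let it2 = iterable2.iterator()
  let result = []
  while (true) {
    let x1 = it1.next(), x2 = it2.next()
    if (isStopIteration(x1) || isStopIteration(x2)) return result
    result.push([x1, x2])
  }
}

let pairs = zip(rangeIterable(1, 3), rangeIterable(10, 20))
console.log(JSON.stringify(pairs))  // [[1,10],[2,11],[3,12]]
```

Since the marker is an ordinary return value, an exception thrown by
some callee can no longer be mistaken for the end of iteration.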

/Andreas


More information about the es-discuss mailing list