Questions/issues regarding generators

Dmitry Lomov dslomov at google.com
Tue Mar 12 12:16:27 PDT 2013


(I'll address comments from both your e-mails here)

On Tue, Mar 12, 2013 at 7:56 AM, Jason Orendorff
<jason.orendorff at gmail.com> wrote:

> On Tue, Mar 12, 2013 at 1:06 AM, Dmitry Lomov <dslomov at google.com> wrote:
>
>> At a risk of repeating myself, 'open()' scenario is handled perfectly
>> well with the iterable (see my example). Example where things truly cannot
>> be reiterated (I am not sure why network stream is such an example - the
>> network connection can always be opened twice) are rare. One possibility
>> will be to throw on the second call to iterator().
>
>
> Gosh, maybe we are irreconcilable then. Implicitly opening network
> connections many times seems bad to me. Same for reopening files, actually.
>

I do not think we are irreconcilable. Clearly there is a library design
choice here. A designer of a particular library for file/network IO may or
may not consider opening a file on 'iterator()' call too implicit. I think
it is not too implicit, while you appear to think otherwise.

In a world with Iterables, the library designer can easily disallow
iterating the result of open() a second time. As I suggested above, if for
whatever reason the sequence cannot be re-iterated, the iterator() method can
throw on the second call. In that case, an attempt to zip a file with itself
will throw when zip calls the iterator() method a second time, and that will
be an early failure with a clear cause.
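
A minimal sketch of that "throw on the second call" behaviour. Everything
named here is hypothetical: openOnce stands in for a library's open(), zip
is a toy implementation, and the { done, value } result shape of next() is
an assumption about the protocol rather than something settled in this
thread.

```javascript
// Hypothetical names throughout: openOnce, zip, and the iterator() protocol
// method as discussed in this thread (not a shipped API).
function openOnce(lines) {
  var opened = false;
  return {
    iterator: function () {
      if (opened) {
        throw new Error("this sequence cannot be iterated a second time");
      }
      opened = true;
      var i = 0;
      return {
        next: function () {
          return i < lines.length
            ? { done: false, value: lines[i++] }
            : { done: true, value: undefined };
        }
      };
    }
  };
}

// zip asks each argument for its own iterator before pulling any values.
function zip(a, b) {
  var ia = a.iterator();
  var ib = b.iterator();
  var pairs = [];
  for (;;) {
    var x = ia.next();
    var y = ib.next();
    if (x.done || y.done) return pairs;
    pairs.push([x.value, y.value]);
  }
}

var file = openOnce(["first", "second"]);
var failure = null;
try {
  zip(file, file); // the second iterator() call throws before any value is read
} catch (e) {
  failure = e.message;
}
```

Zipping two separately opened sources still works; only reuse of the same
non-reiterable object fails, and it fails at the point of reuse.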

However, non-reiterable iterables are a fringe case: maybe 10% of
iterators are non-reiterable even by the standards you suggest (expensive
operations on iteration start). [I am being generous here; it seems that all
allegedly non-reiterable examples suggested so far have been related to I/O.
Given that I/O libraries are generally asynchronous in ES, I/O is generally
not very amenable to being implemented as iterators, since in general the
results of I/O operations are only available in a callback, and not
on demand, as the next() method would require.] My educated guess would be
that 90% of iterators/iterables in the wild would be easily re-iterable, as
they would be results of operations over collections (such as filter, map,
zip and similar). This is the baby that gets thrown out with the bathwater,
not the "non-restartable" iterators.
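
To illustrate why results of collection operations are naturally
re-iterable, here is a rough sketch of a lazy filter. The helpers
fromArray, filter and toArray are hypothetical, and the { done, value }
result shape of next() is again an assumption.

```javascript
// Hypothetical helpers; iterator() is the protocol method discussed in this thread.
function fromArray(arr) {
  return {
    iterator: function () {
      var i = 0;
      return {
        next: function () {
          return i < arr.length
            ? { done: false, value: arr[i++] }
            : { done: true, value: undefined };
        }
      };
    }
  };
}

// filter returns an iterable: nothing runs until iterator() is called,
// and every call starts a fresh pass over the source.
function filter(source, predicate) {
  return {
    iterator: function () {
      var inner = source.iterator();
      return {
        next: function () {
          for (;;) {
            var step = inner.next();
            if (step.done || predicate(step.value)) return step;
          }
        }
      };
    }
  };
}

// Drains one fresh iterator of the given iterable into an array.
function toArray(iterable) {
  var it = iterable.iterator();
  var out = [];
  for (var step = it.next(); !step.done; step = it.next()) out.push(step.value);
  return out;
}

var evens = filter(fromArray([1, 2, 3, 4]), function (n) { return n % 2 === 0; });
var firstPass = toArray(evens);
var secondPass = toArray(evens); // same values again: evens is re-iterable
```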


> This semantics is sound and consistent, but there is a problem: by that
> logic, the first call 'zip(rangeAsArray, rangeAsArray)' also has all
> the appearances of a user error! It requires careful analysis and thinking
> to convince oneself that it is indeed correct. Well, maybe not in a simple
> case when the initializer of rangeAsArray is an array literal, but as soon
> as the initializer is more complicated - say an external function, you can
> never be sure.
>

> But you could argue the same way for literally any other operation on an
> object. 'rangeAsArray.length', for example, would also be nonsensical if
> rangeAsArray turns out to be some other sort of object and not an array
> after all.

We are not talking here about arbitrary operations on a random object; we
are talking about operations mandated by the language and their semantics.
In fact, length is not a bad example of a precedent in this space: after ES5
   for (var i = 0; i < obj.length; i++) console.log(obj[i]);
works great for all "array-like" data structures, including arrays, strings
and typed arrays. It would be nice to achieve the same for iterator(),
for..of and generators.
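
A small sketch of that precedent: one loop written against length and
indexed access, applied unchanged to three different array-like values.
The helper name elements is hypothetical.

```javascript
// Collects the indexed elements of any array-like value.
function elements(obj) {
  var out = [];
  for (var i = 0; i < obj.length; i++) out.push(obj[i]);
  return out;
}

var fromArrayLiteral = elements([1, 2, 3]);
var fromString = elements("abc");
var fromTypedArray = elements(new Uint8Array([7, 8, 9]));
```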

> Note that generators return coroutine objects with methods other than
> just the iteration-related methods. The coroutine state, to my mind, is
> inherent to the returned object.

In the Iterable design, coroutine state would be inherent to the result of
iterator(), i.e. coroutine execution begins once iterator() is called.
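
One way to picture this, as a sketch only: wrap a generator function in an
object whose iterator() method creates a new coroutine per call. The helper
iterableFrom is hypothetical. Note that in this sketch the generator body
starts running on the first next() of each coroutine, not inside
iterator() itself.

```javascript
// Hypothetical helper: turns a generator function into an iterable whose
// iterator() method creates a new coroutine on every call.
function iterableFrom(generatorFunction) {
  return {
    iterator: function () {
      return generatorFunction();
    }
  };
}

var started = 0;
var countdown = iterableFrom(function* () {
  started++; // runs on the first next() of each new coroutine
  yield 2;
  yield 1;
});

var a = countdown.iterator();
var b = countdown.iterator();
var firstOfA = a.next().value;  // 2
var secondOfA = a.next().value; // 1
var firstOfB = b.next().value;  // 2: b has its own state, unaffected by a
```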


> If we are to presume that this particular kind of bug will be common in
> JS, why isn't it common in Python?
> If I'm mistaken about Python and this is actually a common problem there,
> then I'd reconsider.
>

I am not a deep specialist in Python, but my understanding is that the
problem there is mitigated by the common practice of writing iterators.
Python is class-based, so typically one iterates over the class instance,
and implementation of __iter__ looks like:

class MyToDoList:
   ...
   def __iter__(self):
      for task in self.tasks:
         if not task.done:
            yield task
   ...

What happens here is that MyToDoList is actually an Iterable in the sense I
advocate: every call to MyToDoList.__iter__ returns a fresh iterator. Since
Python developers typically wrap their iterators in a class, the
iterable/iterator dichotomy is not acute (but search for "python iterators
vs iterables" and even "python iterators considered harmful" to see some
examples of confused users).

I think that in ES, which is heavy on functions, people will tend to just
use "function*() { ... }" way more often than in Python.
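
A sketch of the kind of surprise I have in mind when the result of a bare
generator function is passed around directly. The names range and
zipIterators are hypothetical, and zipIterators here takes iterators
rather than iterables.

```javascript
function* range(n) {
  for (var i = 0; i < n; i++) yield i;
}

// Pairs up values by pulling from two iterators in lockstep.
function zipIterators(ia, ib) {
  var pairs = [];
  for (;;) {
    var x = ia.next();
    var y = ib.next();
    if (x.done || y.done) return pairs;
    pairs.push([x.value, y.value]);
  }
}

var shared = range(4);
// Both arguments are the same coroutine, so values are split between them.
var surprising = zipIterators(shared, shared);
// Two separate calls give two independent coroutines.
var expected = zipIterators(range(4), range(4));
```

Here surprising is [[0, 1], [2, 3]] while expected is
[[0, 0], [1, 1], [2, 2], [3, 3]], and nothing at the call site signals the
difference.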


Dmitry


>
> -j
>
>