[rust-dev] Language support for external iterators (new for loop)

Daniel Micay danielmicay at gmail.com
Tue Jun 25 20:44:10 PDT 2013


This is a followup from the previous discussion on internal vs. external
iterators.[1]

Since then, most iterators in the standard library have been converted to
external ones. Almost all uses of the `for` loop are now using the `.advance`
wrapper, and I don't think there will be any use cases left for the old
internal iteration protocol.

# External iterators

To reiterate the benefits of the external iteration protocol:

* It's generic and works well with traits, so functions can be written to work
  on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`,
  whether it's a value, reference or mutable reference.

* Iteration state is an object, so iterators can be interleaved. This is
  required for a generic zip, merge, union, intersect, etc. and is often useful
  in an ad-hoc fashion to consume only some of an iterator without
  losing it.

* In the future, Rust can have generators using a `yield` statement like C#,
  compiling down to a fast state machine without requiring context switches,
  virtual functions or even closures. This would eliminate the difficulty of
  coding recursive traversals by-hand with external iterators.

# Alternatives

The iteration protocol can't be based on anything requiring virtual method
calls, heap allocations or context switches without the performance becoming
significantly worse.

There have been some suggestions about following the lead of Clojure and using
reducers[2], but the implementation suffers from the same limitations of not
having an external state.

Rust doesn't yet have a way to write data-parallel code, but when it does gain
that, containers can just support partitioning themselves into ranges via
`Iterator`. It will work for in-place mutation in parallel too.

# A new loop

I think it's a foregone conclusion that we'll be replacing `for`, so I suggest
that we just reuse the current syntax and change the semantics:

    for iterator |pattern| { body }

This can just be compiled as the following would be:

    let mut it = iterator;
    loop {
        match it.next() {
            Some(pattern) => { body }
            None => break
        }
    }

A lang item can be added for the Iterator trait requirement.

This would avoid the `.advance` tacked onto almost every `for` loop at the
moment, and rid us of the problems associated with the current `for`:

* The `break` and `return` statements can fail to work, so borrow/liveness
  checking can't trust them. A loop body can't take an `&mut` reference and
  return it because it could result in grabbing the reference again. This also
  seems to be why we forbid `return` inside closures and do statements, since it
  would be confusing to have to act differently than `for`.

* The function's local variables are upvars in the closure, so using them is
  very restricted. It's very obvious that it's not just another block because
  of this.

* Compiling closures is slow, as they have to broken down by SROA and involve
  inlining a function after proving the function pointer is constant. If we
  were marking the function pointers as `llvm.invariant` it might stop being a
  performance hit, but it would remain a compile time issue.

# Iterables

The `for` loop above can also be extended to work for *any* `Iterable` in the
future, not just iterators themselves.

    for iterable |x| { ... } // use the default iterator
    for iterable.rev_iter |x| { ... } // use another iterator

At the moment, the `Iterable` trait cannot be defined/used because the compiler
ignores bounds on trait type parameters, but it would be something like the
following:

    #[lang = "iterable"]
    trait Iterable<A, T: Iterator<A>> {
        fn iter(&self) -> T;
    }

    trait ReverseIterable<A, T: Iterator<A>> {
        fn rev_iter(&self) -> T;
    }

    trait MutableIterable<A, T: Iterator<A>> {
        fn mut_iter(&mut self) -> T;
    }

    trait MutableReverseIterable<A, T: Iterator<A>> {
        fn mut_rev_iter(&mut self) -> T;
    }

# Links

[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html
[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://mail.mozilla.org/pipermail/rust-dev/attachments/20130625/ff30d31f/attachment.sig>


More information about the Rust-dev mailing list