[rust-dev] Language support for external iterators (new for loop)

Timothy Kuehn tkuehn at mozilla.com
Wed Jun 26 12:44:06 PDT 2013


What motivates your suggestion to keep the syntax the same? I think it's confusing that it looks like a closure but is functionally different from one. I remember seeing suggestions to change it to " for pattern in iterator. " Besides being less confusing, I think it simply looks nicer. 


----- Original Message -----

From: "Daniel Micay" <danielmicay at gmail.com> 
To: rust-dev at mozilla.org 
Sent: Tuesday, June 25, 2013 8:44:10 PM 
Subject: [rust-dev] Language support for external iterators (new for loop) 

This is a followup from the previous discussion on internal vs. external 
iterators.[1] 

Since then, most iterators in the standard library have been converted to 
external ones. Almost all uses of the `for` loop are now using the `.advance` 
wrapper, and I don't think there will be any use cases left for the old 
internal iteration protocol. 

# External iterators 

To reiterate the benefits of the external iteration protocol: 

* It's generic and works well with traits, so functions can be written to work 
on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`, 
whether it's a value, reference or mutable reference. 

* Iteration state is an object, so iterators can be interleaved. This is 
required for a generic zip, merge, union, intersect, etc. and is often useful 
in an ad-hoc fashion to consume only some of an iterator without 
losing it. 

* In the future, Rust can have generators using a `yield` statement like C#, 
compiling down to a fast state machine without requiring context switches, 
virtual functions or even closures. This would eliminate the difficulty of 
coding recursive traversals by-hand with external iterators. 

# Alternatives 

The iteration protocol can't be based on anything requiring virtual method 
calls, heap allocations or context switches without the performance becoming 
significantly worse. 

There have been some suggestions about following the lead of Clojure and using 
reducers[2], but the implementation suffers from the same limitations of not 
having an external state. 

Rust doesn't yet have a way to write data-parallel code, but when it does gain 
that, containers can just support partitioning themselves into ranges via 
`Iterator`. It will work for in-place mutation in parallel too. 

# A new loop 

I think it's a foregone conclusion that we'll be replacing `for`, so I suggest 
that we just reuse the current syntax and change the semantics: 

for iterator |pattern| { body } 

This can just be compiled as the following would be: 

let mut it = iterator; 
loop { 
match it.next() { 
Some(pattern) => { body } 
None => break 
} 
} 

A lang item can be added for the Iterator trait requirement. 

This would avoid the `.advance` tacked onto almost every `for` loop at the 
moment, and rid us of the problems associated with the current `for`: 

* The `break` and `return` statements can fail to work, so borrow/liveness 
checking can't trust them. A loop body can't take an `&mut` reference and 
return it because it could result in grabbing the reference again. This also 
seems to be why we forbid `return` inside closures and do statements, since it 
would be confusing to have to act differently than `for`. 

* The function's local variables are upvars in the closure, so using them is 
very restricted. It's very obvious that it's not just another block because 
of this. 

* Compiling closures is slow, as they have to broken down by SROA and involve 
inlining a function after proving the function pointer is constant. If we 
were marking the function pointers as `llvm.invariant` it might stop being a 
performance hit, but it would remain a compile time issue. 

# Iterables 

The `for` loop above can also be extended to work for *any* `Iterable` in the 
future, not just iterators themselves. 

for iterable |x| { ... } // use the default iterator 
for iterable.rev_iter |x| { ... } // use another iterator 

At the moment, the `Iterable` trait cannot be defined/used because the compiler 
ignores bounds on trait type parameters, but it would be something like the 
following: 

#[lang = "iterable"] 
trait Iterable<A, T: Iterator<A>> { 
fn iter(&self) -> T; 
} 

trait ReverseIterable<A, T: Iterator<A>> { 
fn rev_iter(&self) -> T; 
} 

trait MutableIterable<A, T: Iterator<A>> { 
fn mut_iter(&mut self) -> T; 
} 

trait MutableReverseIterable<A, T: Iterator<A>> { 
fn mut_rev_iter(&mut self) -> T; 
} 

# Links 

[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html 
[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators 

_______________________________________________ 
Rust-dev mailing list 
Rust-dev at mozilla.org 
https://mail.mozilla.org/listinfo/rust-dev 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/rust-dev/attachments/20130626/ffb932d5/attachment.html>


More information about the Rust-dev mailing list