[rust-dev] &self/&mut self in traits considered harmful(?)

SiegeLord slabode at aim.com
Wed Jun 11 06:27:21 PDT 2014


First, let me begin with a small discussion about C++ rvalue references. 
As some of you know, they were introduced to C++ in part to solve 
problems like this:

Matrix m;
m.data = {1.0, 2.0, 3.0};
Matrix m2 = m * 2.0 * 5.0 * 10.0;

Before C++11, most implementations the multiplications on the third line 
would create two (unnecessary) temporary copies of the Matrix, causing 
widespread inefficiency if Matrix was large. By using rvalue references 
(see the implementation in this gist: 
https://gist.github.com/SiegeLord/85ced65ab220a3fdc1fc we can reduce the 
number of copies to one. What the C++ does is that the first 
multiplication (* 2.0) creates a copy of the matrix, and the remaining 
multiplications move that copy around.

If you look at the implementation, you'll note how complicated the C++ 
move semantics are compared to Rust's (you have to use std::move 
everywhere, define move-constructors and move-assignment with 
easy-to-get-wrong implementations etc.). Since Rust has simpler move 
semantics, can we do the same thing in Rust?

It turns out we cannot, because the operator overloading in Rust is done 
by overloading a trait with a method that takes self by reference:

pub trait Mul<RHS, Result>
{
     fn mul(&self, rhs: &RHS) -> Result;
}

This means that the crucial step of moving out from the temporary cannot 
be done without complicated alternatives (explained at the end of this 
email). If we define an a multiplication trait that takes self by value, 
however then this is possible and indeed relatively trivial (see 
implementation here: 
https://gist.github.com/SiegeLord/11456760237781442cfe ). This code will 
act just like the C++ did: it will copy during the first move_mul call, 
and then move the temporary around:

let m = Matrix{ data: vec![1.0f32, 2.0, 3.0] };
let m2 = (&m).move_mul(2.0).move_mul(5.0).move_mul(10.0);

So there's nothing in Rust move semantics which prevents this useful 
pattern, and it'd be possible to do that with syntax sugar if the 
operator overload traits did not sabotage it. Pretty much all the 
existing users (e.g. num::BigInt and sebcrozet's nalgebra) of operator 
overloading traits take the inefficient route of creating a temporary 
copy for each operation (see 
https://github.com/mozilla/rust/blob/master/src/libnum/bigint.rs#L283 
and 
https://github.com/sebcrozet/nalgebra/blob/master/src/structs/dmat.rs#L593 
). If the operator overloading traits do not allow you to create 
efficient implementations of BigNums and linear algebra operations, the 
two use cases why you'd even *have* operator overloading as a language 
feature, why even have that feature?

I think this goes beyond just operator overloading, however, as these 
kinds of situations may arise in many other traits. By defining trait 
methods as taking &self and &mut self, we are preventing these useful 
optimizations.

Aside from somewhat more complicated impl's, are there any downsides to 
never using anything but by value 'self' in traits? If not, then I think 
that's what they should be using to allow people to create efficient 
APIs. In fact, this probably should extend to every member generic 
function argument: you should never force the user to tie their hands by 
using a reference. Rust has amazing move semantics, I just don't see 
what is gained by abandoning them whenever you use most traits.

Now, I did say there are complicated alternatives to this. First, you 
actually *can* move out through a borrowed pointer using 
RefCell<Option<T>>. You can see what this looks like here: 
https://gist.github.com/SiegeLord/e09c32b8cf2df72b2422 . I don't know 
how efficient that is, but it is certainly more fragile. With my 
by-value MoveMul implementation, the moves are checked by the 
compiler... in this case, they are not. It's easy to end up with a 
moved-out, dangling Matrix. This is what essentially has to be done, 
however, if you want to preserve the general semantic of the code.

Alternatively, you can use lazy evaluation/expression templates. This is 
the route I take in my linear algebra library. Essentially, each 
operation returns a struct (akin to what happens with many Iterator 
methods) that stores the arguments by reference. When it comes time to 
perform assignment, the chained operations are performed element-wise. 
There are no unnecessary copies and it optimizes well. The problem is 
that its a lot more complicated to implement and it pretty much forces 
you to use interior mutability (just Cell this time) if you don't want a 
crippled API. The latter bit introduces a whole slew of subtle bugs (in 
my opinion they are less common than the ones introduced by RefCell). 
Also, I don't think expression templates are the correct way to wrap, 
e.g., a LAPACK library. I.e. they only work well when you're 
implementing the math yourself which is not ideal for the more 
complicated algorithms. Along the same lines, it is not immediately 
obvious to me how to extend this lazy evaluation idea to something like 
num::BigInt. So far, it seems like lazy evaluation will force dynamic 
dispatch in that case which is a big shame (i.e. you'd store the 
operations in one array, arguments in another and then play them back at 
the assignment time).

So, I think the situation is pretty bad. What can be done to fix it?

-SL


More information about the Rust-dev mailing list