[rust-dev] RFC: Overloadable dereference operator

Patrick Walton pwalton at mozilla.com
Mon Jul 29 22:28:48 PDT 2013


Hi everyone,

I've recently started thinking that a number of use cases that we've 
wanted to solve at some point could be solved if the dereference 
operator could be overloaded much like the other operators. Most 
importantly, this addresses a missing part of the puzzle for custom 
smart pointers, but it also fixes issues relating to autoderef on 
newtypes and "common fields".

# Mechanics

We introduce a new lang item trait:

     #[lang="deref"]
     pub trait Deref<Result> {
         fn deref(&'self self) -> &'self Result;
     }

This `deref` method is invoked by the compiler in two cases:

1. When the unary `*` operator is used on a value. In this case, the 
result pointer type is automatically dereferenced and becomes an lvalue 
(albeit an immutable one).

2. When method lookup or field projection fails. In this case, the 
method lookup or field projection is tried again with the `Result` type.
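
For anyone who wants to try the two cases on a present-day toolchain, here is a minimal sketch using the standard library's `std::ops::Deref`. Note that it names the target with an associated type `Target` rather than the `Result` type parameter proposed above, and the syntax is modern Rust rather than the syntax used in this message. `Wrapper`, `Point`, and the helper functions are hypothetical names chosen only for illustration.

```rust
use std::ops::Deref;

// Hypothetical payload type with two fields and one method.
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    fn manhattan(&self) -> i32 {
        self.x.abs() + self.y.abs()
    }
}

// Hypothetical wrapper that opts in to dereferencing.
struct Wrapper<T> {
    inner: T,
}

impl<T> Deref for Wrapper<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner
    }
}

// Case 1: unary `*` on the wrapper calls `deref` and dereferences the result.
// The first `*` strips the `&`, the second goes through `Deref`.
fn explicit(w: &Wrapper<Point>) -> i32 {
    (**w).x
}

// Case 2: `Wrapper` has no field `y` and no method `manhattan`, so field
// projection and method lookup are retried on the target type `Point`.
fn implicit(w: &Wrapper<Point>) -> i32 {
    w.y + w.manhattan()
}
```

Calling `explicit` and `implicit` on `Wrapper { inner: Point { x: 3, y: -4 } }` exercises both paths without the caller ever naming `inner`.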

It would be nice if `Result` were a functional dependency of `Self` 
above (e.g. simultaneous `impl Deref<int> for Foo` and `impl 
Deref<float> for Foo` would be forbidden). Unfortunately we don't have 
the trait machinery to enforce this yet, since this amounts to 
associated types. We could just enforce this in an ad hoc way, or we 
could not enforce it. I don't care too much either way.
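
To make the functional-dependency point concrete, here is a rough sketch in present-day Rust syntax that contrasts the type-parameter form with an associated-type form. The traits `DerefParam` and `DerefAssoc` are hypothetical and exist only for this comparison, and `i64`/`f64` stand in for `int`/`float`.

```rust
// Type-parameter form, as in the proposal: nothing stops two impls
// for the same `Self` type.
trait DerefParam<R> {
    fn deref_param(&self) -> &R;
}

// Associated-type form: each implementing type names exactly one target,
// so the target is a function of `Self`.
trait DerefAssoc {
    type Target;
    fn deref_assoc(&self) -> &Self::Target;
}

struct Foo {
    i: i64,
    f: f64,
}

impl DerefParam<i64> for Foo {
    fn deref_param(&self) -> &i64 {
        &self.i
    }
}

// Accepted by the compiler: a second target for the same type.
impl DerefParam<f64> for Foo {
    fn deref_param(&self) -> &f64 {
        &self.f
    }
}

impl DerefAssoc for Foo {
    type Target = i64;

    fn deref_assoc(&self) -> &i64 {
        &self.i
    }
}
// A second `impl DerefAssoc for Foo` would be rejected as a conflicting impl.

// With two impls in scope, the caller has to say which target it wants.
fn as_int(foo: &Foo) -> i64 {
    let r: &i64 = foo.deref_param();
    *r
}

fn as_float(foo: &Foo) -> f64 {
    *DerefParam::<f64>::deref_param(foo)
}

// With the associated type there is nothing to disambiguate.
fn unambiguous(foo: &Foo) -> i64 {
    *foo.deref_assoc()
}
```

The ambiguity in `as_int` and `as_float` is exactly what an implicit `*` or autoderef could not resolve on its own, which is why some form of enforcement seems desirable.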

# Use cases

There are several use cases that this enables:

## Custom smart pointers

For custom smart pointers it is highly desirable to support autoderef 
and to have the `*` operator enable access to fields. For example, 
suppose `@T` becomes `Gc<T>`. We would like to avoid something like:

     let p: Gc<int> = ...;
     do p.read |x| {
         printfln!("Your lucky number is %d", *x)
     }

With overloadable deref it would look like:

     let p: Gc<int> = ...;
     printfln!("Your lucky number is %d", *p)

I *believe* that this does not cause liveness issues for GC and RC 
because the lifetime of the resulting reference is tied to the lifetime 
of the GC/RC box itself, so the reference piggybacks off the pointer's 
reference count and everything is OK. However, I could be mistaken, so 
I'd like others to check my reasoning. In particular I'm also 
interested in legitimate use cases that this might forbid.
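
As a rough illustration of the lifetime argument (not a proof of it), here is a sketch in present-day Rust syntax. The `Gc<T>` below is a hypothetical stand-in backed by `std::rc::Rc`, not a real garbage-collected pointer, and `lucky` and `peek` are made-up helpers.

```rust
use std::ops::Deref;
use std::rc::Rc;

// Hypothetical stand-in for the `Gc<T>` in the text, backed by `Rc<T>`.
struct Gc<T> {
    ptr: Rc<T>,
}

impl<T> Gc<T> {
    fn new(value: T) -> Gc<T> {
        Gc { ptr: Rc::new(value) }
    }

    fn strong_count(&self) -> usize {
        Rc::strong_count(&self.ptr)
    }
}

impl<T> Deref for Gc<T> {
    type Target = T;

    // The returned borrow cannot outlive `&self`, so the pointer (and the
    // reference count it holds) must stay alive while the borrow is in use.
    fn deref(&self) -> &T {
        &*self.ptr
    }
}

// No closure is needed to read the value.
fn lucky(p: &Gc<i32>) -> String {
    format!("Your lucky number is {}", **p)
}

// The borrow's lifetime is tied to the pointer's lifetime `'a`.
fn peek<'a>(p: &'a Gc<i32>) -> &'a i32 {
    &**p
}
```

In this sketch, holding the result of `peek` does not bump the count; the borrow checker instead refuses to let the `Gc` be dropped or moved while the borrow is live.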

## Controllable newtype autoderef

Currently, newtype structs automatically dereference to the value they 
contain; for example:

     struct MyInt(int);
     fn main() {
         let x = MyInt(3);
         printfln!("1 + 2 = %s", x.to_str()); // prints "1 + 2 = 3"
     }

This behavior is sometimes undesirable, as Brian often points out. 
Haskell allows behavior similar to this to be controlled on an opt-in 
basis with `GeneralizedNewtypeDeriving`. We could support something 
similar by turning off autoderef for newtype structs and leaning on 
overloadable dereferencing when it is desirable. In this new world, to 
get the behavior above one would write:

     struct MyInt(int);
     impl Deref<int> for MyInt {
         fn deref(&'self self) -> &'self int {
             let MyInt(ref inner) = *self;
             inner
         }
     }

We could imagine something like this to make it simpler:

     #[deriving(Deref)]
     struct MyInt(int);
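
The deriving form does not exist yet, but a declarative macro can approximate the idea. The following is only a sketch in present-day Rust syntax; `newtype_deref!` is a hypothetical macro, and `UserId` is a made-up example of a newtype that deliberately does not opt in.

```rust
// Hypothetical stand-in for the proposed `#[deriving(Deref)]`.
macro_rules! newtype_deref {
    ($name:ident, $target:ty) => {
        impl ::std::ops::Deref for $name {
            type Target = $target;

            fn deref(&self) -> &$target {
                &self.0
            }
        }
    };
}

// Opts in: methods of `i32` are reachable through autoderef.
struct MyInt(i32);
newtype_deref!(MyInt, i32);

// Does not opt in: the wrapped value is only reachable via `.0`.
struct UserId(i32);

fn show(x: &MyInt) -> String {
    format!("1 + 2 = {}", **x)
}

fn squared(x: &MyInt) -> i32 {
    // `pow` is a method of `i32`, found after dereferencing `MyInt`.
    x.pow(2)
}

fn raw(id: &UserId) -> i32 {
    // `id.pow(2)` would not compile here, since `UserId` has no `Deref` impl.
    id.0
}
```

The point of the opt-in is visible in `UserId`: integer arithmetic on an identifier is probably a bug, and without the impl the compiler rejects it.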

## Anonymous fields

In Go (and in C with Plan 9 extensions) it is possible to place one 
struct inside another struct and inherit its fields:

     type Foo struct {
         X int
         Y int
     }

     type Bar struct {
         Foo
         Z int
     }

     x := Bar{
         Foo: Foo{
             X: 1,
             Y: 2,
         },
         Z: 3,
     }
     fmt.Printf("%d\n", x.Y)  // prints 2

This is almost multiple inheritance, except that the receiver (the 
`this` pointer) has type `Foo`, not `Bar`, when invoking `Foo` methods 
on a `Bar` instance.

With overloadable deref this would be possible in Rust as well:

     struct Bar {
         base: Foo,
         z: int,
     }

     impl Deref<Foo> for Bar {
         fn deref(&'self self) -> &'self Foo {
             &self.base
         }
     }

One could imagine macro sugar for this use case, for example:

     #[deriving(Deref(base))]
     struct Bar {
         base: Foo,
         z: int,
     }
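
For comparison with the Go snippet, here is how the same shape might look in present-day Rust syntax with the standard library's `std::ops::Deref`. This is a sketch only; the `sum` method and the helper functions are hypothetical additions for illustration.

```rust
use std::ops::Deref;

struct Foo {
    x: i32,
    y: i32,
}

impl Foo {
    // `self` is a `&Foo` here even when called on a `Bar`,
    // so this method cannot see `z`.
    fn sum(&self) -> i32 {
        self.x + self.y
    }
}

struct Bar {
    base: Foo,
    z: i32,
}

impl Deref for Bar {
    type Target = Foo;

    fn deref(&self) -> &Foo {
        &self.base
    }
}

// `Bar` has no field `y`, so projection falls back to `Foo`.
fn inherited_y(b: &Bar) -> i32 {
    b.y
}

// "Inherited" method and own field side by side.
fn total(b: &Bar) -> i32 {
    b.sum() + b.z
}
```

As in Go, this is delegation rather than inheritance: `Foo::sum` always operates on the embedded `Foo`.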

## Common fields

It is a common pattern, for example in Servo, to simulate inheritance in 
Rust with something like:

     struct Bar {
         base: FooCommon,
         ...
     }

     struct Baz {
         base: FooCommon,
         ...
     }

     struct Boo {
         base: FooCommon,
         ...
     }

     enum Foo {
         BarClass(~Bar),
         BazClass(~Baz),
         BooClass(~Boo),
     }

The problem here is that if you have a `Foo` instance there is no 
convenient way to access the common fields short of a `match`. Again, 
overloadable deref comes to the rescue here. We could imagine an 
overloaded `Deref` as follows:

     impl Deref<FooCommon> for Foo {
         fn deref(&'self self) -> &'self FooCommon {
             match *self {
                 BarClass(ref bar) => &bar.base,
                 BazClass(ref baz) => &baz.base,
                 BooClass(ref boo) => &boo.base,
             }
         }
     }

And once again we could come up with some sort of syntactic sugar for this.
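
A self-contained version of the common-fields pattern, written as a sketch in present-day Rust syntax, might look like the following. The `id` field and the `id_of` helper are hypothetical; the real structs would carry whatever fields were elided above.

```rust
use std::ops::Deref;

// Hypothetical common data; `id` is made up for the example.
struct FooCommon {
    id: u32,
}

struct Bar {
    base: FooCommon,
}

struct Baz {
    base: FooCommon,
}

struct Boo {
    base: FooCommon,
}

enum Foo {
    BarClass(Box<Bar>),
    BazClass(Box<Baz>),
    BooClass(Box<Boo>),
}

impl Deref for Foo {
    type Target = FooCommon;

    // The `match` is written once here instead of at every use site.
    fn deref(&self) -> &FooCommon {
        match self {
            Foo::BarClass(bar) => &bar.base,
            Foo::BazClass(baz) => &baz.base,
            Foo::BooClass(boo) => &boo.base,
        }
    }
}

// Field projection on the enum falls through to `FooCommon`.
fn id_of(node: &Foo) -> u32 {
    node.id
}
```

With this in place, code holding a `Foo` can read `node.id` directly, whichever variant it happens to be.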

# Conclusion

This one small feature seems to encompass a lot of use cases which we 
had previously thought we might have to solve using multiple disparate 
features. That cleanliness is attractive to me, assuming that this 
scheme works. I'd be interested to hear everyone's thoughts.

