[rust-dev] RFC: Removing *T

Gábor Lehel illissius at gmail.com
Sat Jul 27 07:51:38 PDT 2013


Spurred by https://github.com/mozilla/rust/issues/7694 I was thinking about
the differences between `*` and `&` and the reason both need to exist.

As far as I can tell the differences boil down to:

  - The compiler makes no attempt to enforce any invariants for `*T`

Meaning that, in an interesting sort of duality:

 - Creating `*T` out of `&T` (or anything else) and manipulating it in
any which way is always safe
 - Dereferencing `*T` is unsafe
 - Creating `&T` out of `*T` is unsafe
 - Dereferencing `&T` (and whatever else the language lets you do with it)
is safe
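
A minimal sketch of the four rules above, written in present-day Rust
syntax, where the `*T` of this post is spelled `*const T`. The function
names `read_via_raw` and `raw_to_ref` are made up for illustration.

```rust
// Present-day Rust spells the `*T` of this post as `*const T`.

/// Safe to call: building and manipulating a raw pointer proves nothing,
/// only the final dereference needs an `unsafe` block.
fn read_via_raw(r: &i32) -> i32 {
    let p: *const i32 = r; // `&T` -> `*T`: a safe coercion
    let q = p.wrapping_add(1).wrapping_sub(1); // arithmetic, no dereference: safe
    unsafe { *q } // dereferencing `*T`: unsafe
}

/// `*T` -> `&T`: the caller asserts that `p` is non-null, points to a live,
/// initialized `i32`, and that nothing mutates it during `'a`.
unsafe fn raw_to_ref<'a>(p: *const i32) -> &'a i32 {
    unsafe { &*p }
}

fn main() {
    let x = 42;
    println!("{}", read_via_raw(&x));

    let p: *const i32 = &x;
    let r: &i32 = unsafe { raw_to_ref(p) };
    println!("{}", *r); // dereferencing `&T`: safe
}
```

The two `unsafe` blocks mark exactly the two boundary crossings listed
above; everything else type-checks as safe code.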

Behind this are proof obligations. `*T` has no implied invariants and
therefore doesn't require proof of anything, while as long as you stick to
`&T`, safety is proved by the compiler. It's at the boundary where the
burden is on the programmer: to assert (with an `unsafe` block) that the
invariants required for dereferencing `*T` and/or converting it to `&T`
really do hold.

The use case for `*T` is operations which are not known to respect
invariants: notably foreign functions, also e.g. pointer arithmetic.

The invariants required of `&T` but not `*T` are:

  1. The pointer is not null
  2. The pointed-to object is of type `T`
  3. The pointed-to object is alive and initialized
  4. (Im)mutability and aliasing related invariants

The latter three are guaranteed for the duration of the lifetime associated
with the pointer.

Now crazy ideas:

We can waive the first invariant by using `Option`. If we could guarantee
in the language that `None : Option<&T>` is represented as a null pointer,
then I see no reason whatsoever to keep allowing implicit nullability. It's
binary compatible with C, so (except where can't-be-null is known) C
interfaces would simply use Option. It forces proper attention to nulls,
and I don't actually see any drawback.
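
As a rough illustration of the `Option` idea, here is a sketch in
present-day Rust, which does guarantee that `Option<&T>` has the same size
as a pointer, with `None` as the null value. The names `read_or_default`
and `from_nullable` are hypothetical, and the `extern "C"` function only
stands in for a C interface taking a nullable `const int32_t *`.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical C-facing function: a nullable `const int32_t *` argument
/// is typed as `Option<&i32>`, so the null case has to be handled.
extern "C" fn read_or_default(p: Option<&i32>) -> i32 {
    match p {
        Some(v) => *v,
        None => -1,
    }
}

/// Turn a possibly-null raw pointer into `Option<&i32>`.
/// The caller asserts that a non-null `p` points to a live, initialized `i32`.
unsafe fn from_nullable<'a>(p: *const i32) -> Option<&'a i32> {
    unsafe { p.as_ref() }
}

fn main() {
    // No space overhead: `None` occupies the null bit pattern.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<*const i32>());

    let x = 5;
    println!("{}", read_or_default(Some(&x)));
    println!("{}", read_or_default(None));

    // A null raw pointer becomes `None` instead of an invalid reference.
    let n = unsafe { from_nullable(ptr::null()) };
    println!("{}", read_or_default(n));
}
```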

We can waive the other invariants by taking advantage of the fact that
they're predicated on a lifetime, to introduce a new special lifetime
`'unsafe`, which is the inverse of `'static`. Where `'static` is always
known to be alive, `'unsafe` never is. (So `'static` is top and `'unsafe`
is bottom.) Therefore converting `&'a T` to `&'unsafe T` is always allowed,
while if you have an `&'unsafe T` and want to convert it to a pointer with
a longer lifetime and dereference it, you have to use `unsafe { }`, just
as with `*T`. Functions parameterized over lifetime variables would
continue to require that the lifetime encompass the duration of the call,
so if you want to allow `&'unsafe T` as an argument, you have to write that
explicitly (again as with `*T`).
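
Since `'unsafe` does not exist, the conversion rules can only be emulated.
The sketch below models `&'unsafe T` with a hypothetical wrapper type
`UnsafeRef<T>` built on `std::ptr::NonNull` from present-day Rust; the
method names are invented, and the wrapper is an approximation of the
proposal, not an implementation of it.

```rust
use std::ptr::NonNull;

/// Hypothetical stand-in for `&'unsafe T`: known to be non-null, but making
/// no claim that the pointee is still alive.
struct UnsafeRef<T> {
    ptr: NonNull<T>,
}

impl<T> UnsafeRef<T> {
    /// `&'a T` -> `&'unsafe T`: always allowed, no `unsafe` needed.
    fn forget_lifetime(r: &T) -> Self {
        UnsafeRef { ptr: NonNull::from(r) }
    }

    /// `&'unsafe T` -> `&'a T`: the caller asserts that the pointee is
    /// alive, initialized, and not mutated for all of `'a`.
    unsafe fn assume_alive<'a>(self) -> &'a T {
        unsafe { self.ptr.as_ref() }
    }
}

fn main() {
    let value = 10;
    let weak = UnsafeRef::forget_lifetime(&value); // safe direction
    let back: &i32 = unsafe { weak.assume_alive() }; // proof obligation here
    println!("{}", back);
}
```

Dropping the lifetime is a plain function call, while getting one back is
the only step that needs `unsafe`, which mirrors the asymmetry between
`&'a T` and `&'unsafe T` described above.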

One question is whether you might want to waive 2-4. with finer
granularity:

- You could waive only 2. by using `&()`. It's not clear to me if it makes
sense to talk about a definitely-live-and-initialized(-and-immutable) value
of unknown type, but if there's a use case for it, it can be satisfied.

- I don't think it makes sense to waive only 3.: you can't say a dead or
uninitialized value is of type T, because it could be *anything* (which is
why `'unsafe` above, conceptually most closely tied to 3., also waives 2.
and 4.).

- It might make sense to waive only 4.: you might care only that a value is
alive and of type T, not whether anyone else is mutating it: this is
`&const T` (which is hoped to be removed). You might also want to mutate it
while not caring whether anyone else is also reading or writing: there's no
separate mutability qualifier for this, a use case might be e.g. atomic
operations. But even in the absence of any special features, these can
still be satisfied by using `&'unsafe [mut] T`, which is an
overapproximation, just as with `*[mut] T` today.

(Corresponding to the separate invariants, it might make sense to have
separate functions for unsafely manipulating the pointed-to type, lifetime,
and mutability of a borrowed pointer, instead of just the catch-all
`transmute()`, similarly to C++.)
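
A sketch of what two such narrower functions might look like in
present-day Rust. The names `cast_type` and `cast_lifetime` are
hypothetical and do not come from any existing library; each changes
exactly one property of the borrowed pointer and leaves the others alone.

```rust
/// Change only the pointed-to type. The caller asserts that the bytes
/// behind `r` form a valid, suitably aligned `U`.
unsafe fn cast_type<T, U>(r: &T) -> &U {
    unsafe { &*(r as *const T as *const U) }
}

/// Change only the lifetime. The caller asserts that the pointee really
/// does stay alive and unmodified for all of `'b`.
unsafe fn cast_lifetime<'a, 'b, T>(r: &'a T) -> &'b T {
    unsafe { &*(r as *const T) }
}

fn main() {
    // Same size, and `[u8; 4]` has alignment 1, so the type cast is valid.
    let n: u32 = 0x0102_0304;
    let bytes: &[u8; 4] = unsafe { cast_type(&n) };
    println!("{:?}", bytes);

    // The pointee is a static, so claiming `'static` is truthful.
    static LIVE: i32 = 9;
    let short: &i32 = &LIVE;
    let long: &'static i32 = unsafe { cast_lifetime(short) };
    println!("{}", long);
}
```

Unlike a bare `transmute()`, the signature of each function states which
invariant is being waived, so a reviewer knows which proof to look for.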

Summary:

I think we could replace `*` pointers with a combination of guaranteeing
the null pointer optimization for `Option` (kinda like TCO) and introducing
a new special lifetime `'unsafe`. This would retain the desirable
properties of the current system, while being superior in other ways, such
as orthogonality. Whether and how to waive aliasing and mutability related
invariants separately from the others might need more thinking, but is
independent.

Thoughts?

-- 
Your ship was destroyed in a monadic eruption.