[rust-dev] RFC: Removing *T

Gábor Lehel illissius at gmail.com
Sun Jul 28 14:06:48 PDT 2013


On Sun, Jul 28, 2013 at 3:34 AM, Erick Tryzelaar
<erick.tryzelaar at gmail.com> wrote:
> Hey Gábor!
>
> This is a neat observation!

Thanks.

> I especially like how Option<&T> represents
> nullable pointers. However, I am not sure if a bottom region as you
> described quite captures the right semantics. While it's neat how it falls
> out naturally that you need to use an unsafe block to dereference a
> `&'unsafe T`, this bottom region also has a shorter lifetime than every
> other region. So this would mean we couldn't safely return these pointers
> without a cast.
>
> Because of this, I think pointers have a couple orthogonal semantics:
>
> 1. Mutability
> 2. Safety
> 3. Lifetime
>
> If my unsafe region pointers patch lands, we support all these cases.

I think I still don't grok the meaning of lifetimes on *T. Is the
lifetime a necessary or sufficient condition for anything, or is it
just a helpful guardrail?

>
> That said, I still think there is a good argument about whether or not we
> need unsafe pointers in the library or in the language. bstrie brought up
> this idea of moving *T out of the language and into a library, as in an
> Unsafe<T> or RawPtr<T> type. If we implement RawPtr<T> as:
>
> struct RawPtr<T>;
>
> impl<T> RawPtr<T> {
>    unsafe fn get(&self) -> T { intrinsics::get(self) }
>    unsafe fn set(&self, x: T) -> T { intrinsics::set(self, x) }
>     ...
> }
>
> Then I *think* it supports all the semantics we need. If I'm not missing
> anything, I'd rather have this, reject my patch, and remove unsafe pointers
> altogether from the compiler.

I think there are two independent questions.

One is syntax. *T or RawPtr<T>? (@T or Gc<T>?) Syntax sugar isn't
inherently good or bad: it's sweet, but too much can make you feel
bad. We have Lisp at one extreme and Perl at the other. We should
strive to find a pleasant balance. It might be worthwhile to have it
really be "just sugar" by making @T, ~T, &T, and *T be aliases for
distinguished types Gc<T>, Owned<T>, Ref<'a, T>, and Ptr<T> (for
example), regardless of which syntax candies we ultimately keep (I
think I saw something like this in the bug tracker?). (Personally I
lean towards keeping the sugar, in part because nested This<That<T>>
is unpleasant, and it's nice to avoid it.)

The other is implementation. In the compiler or as a library? This one
seems less ambiguous: anything that can be done in a library probably
should be. If we can put Gc<T>, Ptr<T> etc. in libraries and have the
compiler just do desugaring (if anything), why not? (I think this
wouldn't be possible for Owned<T>, because it's not parametric in T
(the representation depends on whether T is managed)? Rust doesn't
have anything like C++'s template specialization. Won't all of them
have this issue with DST, with representations depending on whether T
is Sized?)
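
To make the library option concrete, here is a minimal sketch, assuming present-day stable Rust syntax rather than the 2013 compiler. The built-in `*mut T` only stands in for whatever intrinsic the compiler would expose; the name `from_mut`, the `Copy` bound on `get`, and the reading of `set` as "store and return the previous value" are my assumptions, not part of Erick's proposal. It covers only sized `T`, so it does not answer the DST question.

```rust
use std::ptr;

/// Hypothetical library-defined pointer type. The field is a stand-in for
/// compiler intrinsics; nothing about this layout is settled.
pub struct RawPtr<T> {
    raw: *mut T,
}

impl<T> RawPtr<T> {
    /// Creating the pointer asserts no invariants, so it is a safe operation.
    pub fn from_mut(r: &mut T) -> Self {
        RawPtr { raw: r as *mut T }
    }

    /// Safety: the pointee must be alive, initialized, and of type `T`.
    pub unsafe fn get(&self) -> T
    where
        T: Copy,
    {
        unsafe { ptr::read(self.raw) }
    }

    /// Stores `x` and returns the previous value (an assumed reading of the
    /// quoted `set` signature).
    /// Safety: same obligations as `get`, plus no conflicting access.
    pub unsafe fn set(&self, x: T) -> T {
        unsafe { ptr::replace(self.raw, x) }
    }
}
```

The unsafety lives entirely on the methods, so the compiler would need no special knowledge of the type beyond the intrinsics themselves.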


>
> -Erick
>
>
> On Saturday, July 27, 2013, Gábor Lehel wrote:
>>
>> Some discussion has sprouted on reddit:
>> http://www.reddit.com/r/rust/comments/1j5vbn/rustdev_rfc_removing_t/
>>
>> On Sat, Jul 27, 2013 at 4:51 PM, Gábor Lehel <illissius at gmail.com> wrote:
>>
>> Spurred by https://github.com/mozilla/rust/issues/7694 I was thinking
>> about the differences between `*` and `&` and the reason both need to exist.
>>
>> As far as I can tell the differences boil down to:
>>
>>   - The compiler makes no attempt to enforce any invariants for `*T`
>>
>> Meaning that, in an interesting sort of duality:
>>
>>  - Creating `*T` out of `&T` (or anything else) and manipulating it in
>> any which way is always safe
>>  - Dereferencing `*T` is unsafe
>>  - Creating `&T` out of `*T` is unsafe
>>  - Dereferencing `&T` (and whatever else the language lets you do with it)
>> is safe
>>
>> Behind this are proof obligations. `*T` has no implied invariants and
>> therefore doesn't require proof of anything, while as long as you stick to
>> `&T`, safety is proved by the compiler. It's at the boundary where the
>> burden is on the programmer: to assert (with an `unsafe` block) that the
>> invariants required for dereferencing `*T` and/or converting it to `&T`
>> really do hold.
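
The boundary described above can be sketched as follows, assuming present-day stable Rust syntax (where `*T` is spelled `*const T`). The function names are hypothetical and only illustrate which side of each conversion carries the proof obligation.

```rust
/// Safe: going from `&T` to a raw pointer discards guarantees, so there is
/// nothing to prove.
fn to_raw(r: &i32) -> *const i32 {
    r as *const i32
}

/// Safe: arithmetic on a raw pointer asserts nothing about the result.
fn next(p: *const i32) -> *const i32 {
    p.wrapping_add(1)
}

/// Unsafe: the caller asserts that `p` is non-null and points to a live,
/// initialized `i32`.
unsafe fn read_raw(p: *const i32) -> i32 {
    unsafe { *p }
}

/// Unsafe: the caller additionally asserts that the pointee stays valid and
/// unmutated for the whole of `'a`.
unsafe fn to_ref<'a>(p: *const i32) -> &'a i32 {
    unsafe { &*p }
}
```

Only the last two functions need an `unsafe` block at the call site; `to_raw` and `next` can be called from ordinary code.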
>>
>> The use case for `*T` is operations which are not known to respect
>> invariants: notably foreign functions, also e.g. pointer arithmetic.
>>
>> The invariants required of `&T` but not `*T` are:
>>
>>   1. The pointer is not null
>>   2. The pointed-to object is of type `T`
>>   3. The pointed-to object is alive and initialized
>>   4. (Im)mutability and aliasing related invariants
>>
>> The latter three are guaranteed for the lifetime associated with the
>> pointer.
>>
>> Now crazy ideas:
>>
>> We can waive the first invariant by using `Option`. If we could guarantee
>> in the language that `None : Option<&T>` is represented as a null pointer,
>> then I see no reason whatsoever to keep allowing implicit nullability. It's
>> binary compatible with C, so (except where can't-be-null is known) C
>> interfaces would simply use Option. It forces proper attention to nulls, and
>> I don't actually see any drawback.
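
A small sketch of the `Option` idea, assuming a compiler that does guarantee the null representation for `None : Option<&T>` (present-day stable Rust documents this guarantee). The function `value_or_default` is a hypothetical stand-in for an entry point whose C signature would be `int value_or_default(const int *p)`.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// Hypothetical C-callable function: a null argument arrives as `None`,
/// so the null case has to be handled before the value can be touched.
extern "C" fn value_or_default(p: Option<&c_int>) -> c_int {
    match p {
        Some(v) => *v,
        None => -1,
    }
}

/// True when the optional reference costs nothing beyond a plain pointer.
fn same_size_as_pointer() -> bool {
    size_of::<Option<&c_int>>() == size_of::<*const c_int>()
}
```

No `unsafe` block appears on the Rust side; the nullability is carried entirely by the type.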
>>
>> We can waive the other invariants by taking advantage of the fact that
>> they're predicated on a lifetime, to introduce a new special lifetime
>> `'unsafe`, which is the inverse of `'static`. Where `'static` is always
>> known to be alive, `'unsafe` never is. (So `'static` is top and `'unsafe` is
>> bottom.) Therefore converting `&'a T` to `&'unsafe T` is always allowed,
>> while if you have an `&'unsafe T` and want to convert it to a pointer with a
>> longer lifetime and dereference it, you have to use `unsafe { }`, just as
>> with `*T`. Functions parameterized over lifetime variables would continue to
>> require that the lifetime encompass the duration of the call, so if you want
>> to allow `&'unsafe T` as an argument, you have to write that explicitly
>> (again as with `*T`).
>>
>> One question is whether you might want to waive 2-4. with finer
>> granularity:
>>
>> - You could waive only 2. by using `&()`. It's not clear to me if it makes
>> sense to talk about a definitely-live-and-initialized(-and-immutable) value
>> of unknown type, but if there's a use case for it, it can be satisfied.
>>
>> - I don't think it makes sense to waive only 3.: you can't say a dead or
>> uninitialized value is of type T, because it could be *anything* (which is
>> why `'unsafe` above, conceptually most closely tied to 3., also waives 2.
>> and 4.).
>>
>> - It might make sense to waive only 4.: you might care only that a value
>> is alive and of type T, not whether anyone else is mutating it: this is
>> `&const T` (which is hoped to be removed). You might also want to mutate it
>> while not caring whether anyone else is also reading or writing: there's no
>> separate mutability qualifier for this, a use case might be e.g. atomic
>> operations. But even in the absence of any special features, these can still
>> be satisfied by using `&'unsafe [mut] T`, which is an overapproximation,
>> just as with `*[mut] T` today.
>>
>> (Corresponding to the separate invariants, it might make sense to have
>> separate functions for unsafely manipulating the pointed-to type, lifetime,
>> and mutability of a borrowed pointer, instead of just the catch-all
>> `transmute()`, similarly to C++.)
>>
>> Summary:
>>
>> I think we could replace `*` pointers with a combination of guarantee



-- 
Your ship was destroyed in a monadic eruption.

