[rust-dev] Requesting information (about globs, macros, subtyping)

Cameron Zwarich zwarich at mozilla.com
Fri Jul 18 01:16:00 PDT 2014


On Jul 16, 2014, at 10:54 AM, Gábor Lehel <glaebhoerl at gmail.com> wrote:

> 3. As far as I'm aware, subtyping in the current language arises only from subtyping of lifetimes. Where is this important? One example was mentioned in [Niko's recent blog post](http://smallcultfollowing.com/babysteps/blog/2014/07/06/implied-bounds/). Where else? Without this subtyping of lifetimes, what would break? How burdensome would it be if one had to use explicit casts in those circumstances?

I’ve been thinking about this a bit recently, so I might as well post my thoughts in this thread rather than following up privately with people.

All subtyping in Rust is ultimately derived from the inclusion of lifetimes and the contravariance of the &/&mut type constructors in their lifetime parameter: a pointer to something that lives longer than ‘a is usable in any place where we are using a pointer to something with lifetime ‘a. As far as I know, there are only 3 original sources of bounds ‘a <= ‘b for lifetimes ‘a and ‘b:

1) Inclusion of concrete lifetimes, i.e. control-flow regions (currently lexical scopes, but soon to be extended to arbitrary single-entry / multiple exit regions), in the same function.

2) That if a function is parameterized in lifetime ‘b and lifetime ‘a is a concrete lifetime in that function, then ‘a <= ‘b.

3) That the scope of a borrow is always included in the lifetime of its referent, e.g. that ‘a <= ‘b in &’a &’b T. This is described by Niko in his blog post http://smallcultfollowing.com/babysteps/blog/2013/04/04/nested-lifetimes/. This rule is different than the first two because it is the only way that a bound can be propagated on two lifetime *parameters*, whereas the first two involve a concrete lifetime in one of the positions.

This is handwaving a bit, since the specifics of 1) and how these all interact in the current lifetime inference (and Niko’s proposed simplification) are all relevant in practice, but I hope this is a correct abstraction of the situation.

The simplest question to ask is whether it is possible to remove lifetime subtyping entirely from Rust. Unfortunately, this is not possible, because if you try to do this (see the Tofte-Talpin region system, which is probably the closest thing to pure Hindley-Milner type inference for a region system) then you have lifetime variables for local variables in a function caller and callee unified, so that a called function is allocating local variables in the caller’s lifetime. This means that lifetimes no longer correspond to nested stack frames, and you also can’t implement destructors properly (at least in any trivial way that I can think of). This is not suitable for a systems language.

The next logical question to ask is whether you can eliminate the non-invariance of type constructors, since that is how subtyping infects the rest of the language.

Since &/&mut are contravariant in their lifetime parameter, the vast majority of type constructors get their variance inferred as contravariant in lifetime parameters. Most examples of contravariance come from something like this:

struct A<‘a> {
    x: &’a int,
    y: &’a int,
}

If I have two distinct &int borrows with concrete lifetimes (meaning an actual control-flow region in the calling function, rather than a lifetime parameter) being used at a construction of A<‘a>, then one of the lifetimes is nested in the other. Hence I if ‘a is the inner lifetime and ‘b is the outer lifetime, I can coerce the &’b to an &’a and construct an A<‘a>.

What do I lose by this? Well, it only works at the first level, so that something like this will fail to type-check:

struct A<'a> { x: &'a int }

fn foo(i: int) {
    let a = A { x: &i };
    if i > 1 {
        let j  = 2;
        let b = if i > 10 {
            a
        } else {
            A { x: &j }
        };
    }
}

There are obvious workarounds here, but in more complex examples they could possibly hurt the readability / performance of the program. However, the fallout is internal to a single function, since there is no way to directly propagate the knowledge that one concrete lifetime is included in another to another function (beside the 3rd source of bounds mentioned above. which I will talk about below).

You can do similar coercions with the sources of bounds 2) and 3). In 2) one of the lifetimes is a concrete lifetime again, so again all of the fallout is internal to a single function. However, in 3) there is a new complication. since knowledge of the relationship between two lifetimes can actually leak outside of the functions where they originate. Here is the example in Niko’s blog post on 3):

struct BorrowedCursor<'b, T> {
    buffer: &'b [T],
    position: uint
}

impl<'b, T> Cursor<T> for BorrowedCursor<'b, T> {
    fn get<'c>(&'c self) -> &'c T {
        &self.buffer[self.position]
    }

    ...
}

This would still work, because we’re dealing directly with the & type constructor, and we could still coerce an &’b [] to an &’c []. However, if you were to add one level of abstraction over the borrowing, you would fail to type-check:

struct A<'a> { x: &'a int }

fn f<'short, 'long>(p: &'short &'long int, a: A<'short>, b: A<'long>) -> A<'short> {
    b
}

These examples indicate that if you get rid of contravariance of type constructors, you are demoting user-defined types to being second-class citizens relative to builtin borrowed pointers. This is always possible to work around by just inlining the definitions of types, but it limits the potential for type abstraction. I am still curious how bad the fallout would be, but there is no easy way to tell besides implementing the change.

The above reasoning only applies if you think that contravariance is the only useful form of variance for lifetime parameters. Is covariance actually useful? The simplest way it can arise is from a borrowed pointer type with a lifetime parameter in a function parameter, e.g.

struct A<'a> { x: |&'a int|:'static -> int }

Functions aren’t really stored into data structures that much currently in Rust (maybe that will change), so there aren’t many good examples here. In theory, nested contravariance can also produce covariance, but there are no type constructors that produce a lifetime and non-lifetime contravariance would only arise via a borrowed pointer as a function parameter, so I think this is subsumed by the previous case.

I wrote a simple patch that eliminates covariance from type checking by forcing all covariant type constructors to be invariant:

https://gist.github.com/zwarich/8d67756efd985fa9b216

This patch actually makes it through all of stage1, only failing on tests that seem to exist only to test covariance (I could be wrong here, I didn’t take a careful look at everything). However, when it hits libsyntax for building stage2 it gets an error:

https://gist.github.com/zwarich/afda62376ccab80dcf19

This seems like it is just a bug, since LinkedPathNode<‘a> should definitely be contravariant (both from the inferred bounds and the fact that a node with a shorter lifetime can point to a node with a longer lifetime, not the other way around), but I don’t have time to see what’s going on. Obviously something is getting inferred as covariant and it’s causing an issue.

Anyways, I’ve rambled on about this for long enough. I probably got something wrong here, but I’d be interested in hearing the thoughts of others.

Cameron


More information about the Rust-dev mailing list