[rust-dev] Presentation of our vector/function/trait types

Patrick Walton pwalton at mozilla.com
Sun Jul 15 17:15:18 PDT 2012


Currently, we've gone back on the concept of dynamically-sized types. I 
think that, while there were very good reasons for going back on them in 
the implementation (namely, that we want a pointer to a vector to be a 
fat pointer--one with start and length--internally, among other obscure 
reasons relating to region bounds for functions), they still possess an 
appealing amount of simplicity when describing the language to newcomers.

One of the things that immediately strikes most as complex about Rust is 
the large number of pointer types, vector types, and function types. 
This has not really changed with the current proposals for vector reform 
and function reform. However, the way we present it and think about it 
*can* change, in order to conceptually simplify the language.

Here is what I propose:

1. We have three (or four, if you count unsafe) types of pointers. @T is 
a garbage-collected, task-local pointer, like shared_ptr; it can be used 
anywhere within a single task. ~T is a unique pointer to the exchange 
heap; it can be sent between tasks. &T is a safe reference (a borrowed 
pointer) to any allocation.

2. Function types, trait types, and vector types without a fixed size 
are *second-class* types. Because their sizes aren't known to the 
compiler, they can't be referred to in isolation; they can only be 
referred to with a pointer. Any of the three pointer types will do.

This rule is quite similar to the one in C++; consider the error that 
occurs when you try to compile this program:

     int foo(int x[]) {
         int y[];
         y = x;
         return y[0];
     }

$ g++ -o test test.cpp
test.cpp: In function ‘int foo(int*)’:
test.cpp:2:8: error: storage size of ‘y’ isn’t known

The only difference is that we prevent the programmer from referring to 
the dynamically-sized type *at all* except through a pointer. In C++ it 
is allowed in some situations, but automatic conversions happen (from 
T[] to T *) to give it meaning. The Rust rule is simpler.
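
To make the rule in point 2 concrete, here is a rough sketch in 
present-day Rust syntax, which postdates this message. It assumes that 
Box<T> and Rc<T> are the closest modern analogues of ~T and @T; the 
Shape trait, Square, and all function names are hypothetical and exist 
only for illustration.

```rust
use std::rc::Rc;

// Hypothetical trait, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// One dynamically sized vector type, reached through each kind of pointer.
fn sum_borrowed(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn sum_owned(v: Box<[i32]>) -> i32 {
    v.iter().sum()
}

fn sum_shared(v: Rc<[i32]>) -> i32 {
    v.iter().sum()
}

// A function type and a trait type are likewise only usable behind a pointer.
fn apply(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn boxed_area(s: Box<dyn Shape>) -> f64 {
    s.area()
}

fn main() {
    let data: [i32; 3] = [1, 2, 3];
    // let bare: [i32] = data[..]; // rejected: the size of `[i32]` is not known
    println!("{}", sum_borrowed(&data));
    println!("{}", sum_owned(Box::new(data))); // Box<[i32; 3]> coerces to Box<[i32]>
    println!("{}", sum_shared(Rc::new(data))); // Rc<[i32; 3]> coerces to Rc<[i32]>
    println!("{}", apply(&|n: i32| n + 1, 41));
    println!("{}", boxed_area(Box::new(Square(2.0))));
}
```

The commented-out line is the analogue of the rejected `int y[];` 
declaration above: the unsized type cannot be named as a local on its 
own, only behind one of the pointers.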

3. A pointer to a vector can be indexed, just like in C++. Unlike C++, 
bounds checks are performed and the task fails if an out-of-bounds 
access is attempted.
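
A minimal sketch of the bounds check in point 3, again in present-day 
Rust syntax rather than the syntax of this message. It assumes that 
"the task fails" corresponds to what is now called a panic; the function 
names are hypothetical, and catch_unwind is used only so the demo can 
observe the failure and keep running.

```rust
use std::panic;

// Indexing through a pointer to a vector; the index is checked at run time.
fn element_at(v: &[i32], i: usize) -> i32 {
    v[i]
}

// Non-panicking alternative from the standard library.
fn checked_element_at(v: &[i32], i: usize) -> Option<i32> {
    v.get(i).copied()
}

fn main() {
    let v = vec![10, 20, 30];
    println!("{}", element_at(&v, 1));
    println!("{:?}", checked_element_at(&v, 7));

    // An out-of-bounds index does not read stray memory; it panics instead.
    let result = panic::catch_unwind(|| element_at(&[10, 20, 30], 7));
    println!("out-of-bounds access failed: {}", result.is_err());
}
```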

4. Again like C++, a pointer can be taken to any element of a vector, 
and this pointer can be indexed just like a pointer to the head of the 
vector. Because a pointer to just one element (which can't be indexed) 
and a pointer to a subrange of the vector (which can be) have different 
types, we need a special method for creating a pointer to a subrange of 
a vector. This method is called "slice":

     let x: &[int] = [ 1, 2, 3, 4, 5 ].slice(3, 5);
     printf!("%d %d", x[0], x[1]);  // prints 4 5
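
For comparison, the same idea can be sketched in present-day Rust, where 
range indexing plays the role of the "slice" method proposed here. The 
subrange helper is a hypothetical name chosen for this example.

```rust
// Borrow the subrange [start, end) of a vector as a pointer to a vector.
fn subrange(v: &[i32], start: usize, end: usize) -> &[i32] {
    &v[start..end]
}

fn main() {
    let v = [1, 2, 3, 4, 5];

    // A pointer to a subrange has the same type as a pointer to the whole
    // vector, so it can be indexed (with bounds checks) in the same way.
    let x: &[i32] = subrange(&v, 3, 5);
    println!("{} {}", x[0], x[1]);

    // A pointer to a single element has a different type and cannot be indexed.
    let one: &i32 = &v[3];
    println!("{}", one);
}
```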

In this way, we can present Rust as a language with three basic pointer 
types, one kind of vector type, one kind of function type (modulo 
unsafe, pure, one-shot if we want to add it, etc.), and one kind of trait 
type. I think this could drastically decrease the perceived complexity 
of the language for newcomers without having to change anything in the 
language itself.

Thoughts?

Patrick