[rust-dev] Syntax of vectors, slices, etc

Rick Richardson rick.richardson at gmail.com
Mon Apr 23 12:58:33 PDT 2012

I like your suggestion about having / always specify a bound.  However,
syntactically decoupling the size of the array from the rest of its type
seems odd.  Also, I'm not sure how one would express multidimensional arrays.

My line of thinking is this:
If you have an array of four ints, the type of the array is "4 ints".
It's incompatible with an array of 5 ints without an adaptor.
Since [] indicates a vector, and a vector in Rust has both a length
and an element type, why not make the length of the vector the first
of two parameters in the []?

e.g.     [N,T]

or, for unspecified length:  [_,T]  (possibly sugared to [T])

For multidimensional arrays:

let matrix : [4, [4, Int]]
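
To make the idea concrete, here is a minimal sketch in present-day Rust,
which spells the type [T; N] rather than the [N,T] suggested above.  The
function names sum4 and sum_any are hypothetical, invented only for this
illustration.  It shows the length being part of the array type, a slice
playing the role of the unspecified-length form, and nesting for a matrix.

```rust
// Fixed-length array: the length 4 is part of the type.
fn sum4(xs: [i32; 4]) -> i32 {
    xs.iter().sum()
}

// Slice: the length is not part of the type and is only known at run time.
fn sum_any(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let four: [i32; 4] = [1, 2, 3, 4];
    let five: [i32; 5] = [1, 2, 3, 4, 5];

    println!("{}", sum4(four));
    // sum4(five) would not compile: [i32; 5] is a different type from [i32; 4].

    // Either array can be borrowed as a slice, which acts as the adaptor.
    println!("{} {}", sum_any(&four), sum_any(&five));

    // Multidimensional: an array of 4 arrays of 4 i32s, stored inline.
    let matrix: [[i32; 4]; 4] = [[0; 4]; 4];
    println!("{}", std::mem::size_of_val(&matrix));
}
```

The nested form reads inside-out in the same way [4, [4, Int]] does.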

This is not a big deal, but it requires no backtracking in the parser,
and it's a bit more in line with existing paradigms.

If I had to order my preferences, I would still prefer :N [T]
over [T]/N, because I think [T]/N could be rather misleading to the
uninitiated.  Although I think I would end up writing it as:
 let foo : N[T]
But I also write my pointers in C as
foo* varname instead of foo *varname, because I think the latter is

On Thu, Apr 19, 2012 at 10:22 AM, Niko Matsakis <niko at alum.mit.edu> wrote:
> In general I love Graydon's proposal for strings and arrays, but I am not
> crazy about the notation.  In particular I think []/@ and []/~ is not a good
> syntax for shared/unique vectors.  It's not the slash, it's that I find it
> inconsistent.  Generally speaking, a @ or ~ after the main type is a bound,
> and before it indicates the kind of the pointer.  But here, a trailing @ or ~
> indicates the kind of pointer.  And []/3 is not a pointer at all.
>
> In Graydon's proposal, there are three kinds of vector-like things:
> - Fixed-length arrays ([T]/3, T[3] at runtime)
> - Vectors ([T]/@, [T]/~, boxed<rust_vec<T>>* or rust_vec<T>* at runtime)
> - Slices ([T] or [T]/&, pair of T* and length)
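
For readers who want the three representations side by side, here is a
rough sketch using present-day Rust types as stand-ins.  The mapping is an
assumption made for illustration, not the runtime layout described above:
[u32; 3] for the fixed-length array, Box<[u32]> and Rc<[u32]> for the
unique and shared vectors, and &[u32] for the slice.  The helper names
fixed_size and slice_size are hypothetical.

```rust
use std::mem::size_of;
use std::rc::Rc;

// Fixed-length array: stored by value, so it occupies 3 * size_of::<u32>() bytes.
fn fixed_size() -> usize {
    size_of::<[u32; 3]>()
}

// Slice: a borrowed view made of a pointer and a length.
fn slice_size() -> usize {
    size_of::<&[u32]>()
}

fn main() {
    let fixed: [u32; 3] = [1, 2, 3];
    // Uniquely owned heap storage (stand-in for the ~ vector).
    let unique: Box<[u32]> = vec![1, 2, 3].into_boxed_slice();
    // Reference-counted heap storage (stand-in for the @ vector).
    let shared: Rc<[u32]> = Rc::from(vec![1, 2, 3]);
    // A slice can borrow from any of the three.
    let slice: &[u32] = &fixed[1..];

    println!("{} {} {}", unique.len(), shared.len(), slice.len());
    println!("{} {}", fixed_size(), slice_size());
}
```
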
> Of these, the notation for slices seems exactly right: it is short and the
> "/" suffix indicates a bound.  In fact, I think maybe we should change fn@()
> to fn/@() and so forth, and just have "/" be a trailing bound indicator.
>  That leaves fixed-length arrays and vectors to represent somehow.  And
> let's not forget strings, which just complicate everything.
>
> So here is my overall proposal (best viewed in fixed width).  The comparison
> is between my proposal, Graydon's proposal, and an English-language
> description.  In some cases (such as ifaces), I have also integrated work on
> the type system I would like to do in the future.
>
>    New type      Old type     Descr.
>    --------      --------     ------
>    fn(S) -> T    fn(S)
>    fn/@(S) -> T  fn@(S) -> T
>    fn/~(S) -> T  fn~(S) -> T
>    :N [T]        [T]/N        fixed-length array
>    [N]T          [T]/N        fixed-length array
>    :[T]          N/A          (see below)
>    @:[T]         [T]/@        boxed vec
>    ~:[T]         [T]/~        unique vec
>    [T]           [T]          slice
>    [T]/&r        [T]/&r       slice with expl. region
>    Id            Id           enum/class/resource/iface
>    Id/&r         Id&r         ...with expl. region bound
>    Id/@          Id@          iface with @ bound
>    Id/~          Id~          iface with ~ bound
>    str           str          slice
>    str/&r        str/&r       slice with expl. region
>    :N str        str/N        fixed-length str
>    :str          N/A          (see below)
>    @:str         str/@        boxed str
>    ~:str         str/~        unique str
>
> Explanation and rationale:
> - A trailing slash always indicates a bound, meaning that it limits the
> types contained "within" the affected type.  Normally, the bound is a
> region.  In the case of opaque types (like fn and ifaces), this bound can
> also be @ or ~.
> - The types `:N [T]` and `:N str` correspond to `T[N]` and `u8[N+1]`
> respectively.  That is, each is a "by-value" array.  If we want to allow N to
> be an arbitrary (const) expression, we may need to write `:(expr) [T]`,
> since `str` is no longer a keyword.
> - Now everything which is in fact a pointer into the task/exchange heaps is
> prefixed with a @/~.
> - The pseudo-type `:[T]` is supposed to look like "an array with an
> unspecified length".  It refers to a rust_vec<T> (by-value).   I say that it
> is a pseudo-type because you cannot write `:[T]` on its own.  In fact, it is
> not even a type.  You can only write `@:[T]` or `~:[T]`---we just use a bit
> of look-ahead.
> The reason to keep `:[T]` from being a type is that it has unknown size.  To
> support this safely with generic types, we'd need to add kinds. I would like
> to do this eventually so that we can declare records with an inline vector
> at the end, but it's not necessary now.
>
> I am not at all crazy about `:` prefix, I just couldn't come up with a
> better character.  I wanted `#` for number, but (a) it's in use by macros
> and (b) it's kind of heavy.  `*` (think: repeat) is used for unsafe ptrs.
>  `^` is random. `+` (again, repeat) looks like an infix operator, not a
> prefix operator.
>
> Rejected ideas:
> My original plan was "N:[T]" which I think looks way better than ":N [T]",
> but I scrapped it because `N` might eventually be a const expression and we
> need some clue that it's coming in the parser.
> Another plan which I liked a lot was to have []T be slice, [N]T be constant
> length array, and [:]T or [.]T be  unknown length array.  I think this looks
> *great*, but there are two problems: First, I don't know how it extends to
> `str`.  Second, the region bound, if any, is ambiguous, so you'd need
> parentheses to clear it up: []T/&r could be [](T/&r) or ([]T)/&r.  But maybe
> that's ok as I don't expect explicit region bounds to appear very often at
> all.
>
> Thoughts?
> Niko