[rust-dev] 2 possible simplifications: reverse application, records as arguments

gasche gasche.dylc at gmail.com
Sat Apr 21 10:28:52 PDT 2012


I've been wondering about a problem tightly related to named
parameters: named enum constructor arguments. SML has had the ability
to define algebraic datatypes as sums of (named) records, and I think
that is something that is missing in current Rust. The code
antipattern that cries out for it is pattern matching on an enum
constructor name with a tedious number of "_" following it:

  rust/src % grep "_, _" -R . | wc -l
  547

See also the following code example:

  alt it.node {
    ast::item_impl(tps, _, _, _) {
      if ns == ns_type { ret lookup_in_ty_params(e, name, tps); }
    }
    ast::item_enum(_, tps, _) | ast::item_ty(_, tps, _) {
      if ns == ns_type { ret lookup_in_ty_params(e, name, tps); }
    }
  ... }

The position of the 'tps' variable among a variable number of ignored
parameters is fragile, increases maintenance costs (if you add
a parameter to some constructor in an enumeration, a lot of code
changes are required just to say that you ignore it most of the time),
and creates redundancy.

This could be solved by having enum constructors with named arguments,
and conversely a syntax for enum patterns that match named
arguments. Being able to write something like:

  alt it.node {
    | ast::item_impl(tps:tps)
    | ast::item_enum(tps:tps)
    | ast::item_ty(tps:tps) {
      if ns == ns_type { ret lookup_in_ty_params(e, name, tps); }
    }
  }

would be a win (then you can say that "tps:" is a shorthand for
"tps:tps" or what not).
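
To make the idea concrete, here is a minimal sketch in later Rust
syntax, where enum variants can carry named fields and `..` in a
pattern ignores the fields that are not mentioned. The `Item` and
`TyParam` types and the `ty_params` function are hypothetical,
trimmed-down stand-ins for the ast::item_* constructors above, not the
compiler's actual definitions:

```rust
// Hypothetical stand-ins for the ast::item_* constructors (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct TyParam(pub String);

pub enum Item {
    Impl { tps: Vec<TyParam>, iface: Option<String>, methods: Vec<String> },
    Enum { variants: Vec<String>, tps: Vec<TyParam> },
    Ty { target: String, tps: Vec<TyParam> },
    Mod { items: Vec<String> },
}

/// Returns the type parameters of an item, if it has any.
pub fn ty_params(item: &Item) -> Option<&[TyParam]> {
    match item {
        // `tps` alone is shorthand for `tps: tps`; `..` ignores the other
        // fields, so adding a field to a variant leaves this arm unchanged.
        Item::Impl { tps, .. }
        | Item::Enum { tps, .. }
        | Item::Ty { tps, .. } => Some(tps.as_slice()),
        Item::Mod { .. } => None,
    }
}
```

The position of `tps` inside each variant no longer matters to the
pattern, which is the property asked for above.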

Note: OCaml has made the choice that (K _) is a pattern that matches
the constructor K no matter what its arity is (including none or
several parameters). While that doesn't help with the 'tps' example
above, it would still simplify a lot of places and does not require
named constructor arguments. But I think that's an inferior solution.
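
For comparison, a rough analogue of that arity-agnostic wildcard in
later Rust syntax is the `(..)` pattern on positional constructors.
The `Node` enum and the `kind` function below are hypothetical names
made up for this sketch:

```rust
// Hypothetical constructors with different arities (assumption).
pub enum Node {
    Leaf,
    Unary(i32),
    Ternary(i32, i32, i32),
}

/// Classifies a node by constructor only, never by its arguments.
pub fn kind(n: &Node) -> &'static str {
    match n {
        Node::Leaf => "leaf",
        // `(..)` ignores every positional field, whatever the arity, so
        // these arms survive a change in the number of arguments.
        Node::Unary(..) => "unary",
        Node::Ternary(..) => "ternary",
    }
}
```

As with (K _), this only helps when no argument is needed at all; it
does not name the one argument you do want.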

>  - Our records are order-sensitive, to be C-structure compatible.
>    Keyword arguments are usually argued-for (as you are doing here) as
>    a way to make function arguments order-insensitive. We'd need to
>    decide whether we wanted order to matter or not.
>
>  - Argument-passing tends to work best when you can pass stuff in
>    registers, not require the arguments to be spilled to memory and
>    addressable as a contiguous slab. So we'd want to be careful not to
>    require the "arguments structure" to be _actually_ addressable as a
>    structure at any point. Rather, calling f(x) would be some kind of
>    semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for
>    the sake of passing.

The clean solution is to make a distinction between the data structure
you call "records" here and the structure that is denoted by the
parameter-building syntax. Haskell has a semantic distinction between
"boxed tuples" (the usual thing) and "unboxed tuples" (primitive, less
exposed, can't be used to instantiate polymorphic types). Similarly
you would have "data records" (contiguous, C-compatible, etc.) and,
say, "native records", that would be a different type with less
flexibility for the user and more flexibility for the implementer: not
addressable, possibly non-contiguous memory layout, a field order
decided by the compiler, etc.
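
A minimal sketch of that split, in later Rust syntax and under my own
naming assumptions (`Span`, `len` and `span_len` are hypothetical):
the struct plays the role of the "data record" with a C-compatible
layout, while the call goes through plain by-value parameters, so the
callee never needs the arguments as one addressable slab:

```rust
/// A "data record": `repr(C)` fixes field order and layout as in C.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: i32,
    pub end: i32,
}

/// Takes its arguments as two separate by-value parameters.
pub fn len(start: i32, end: i32) -> i32 {
    end - start
}

/// Hypothetical "record as arguments" wrapper: the record is taken
/// apart in the parameter pattern and each field is passed on its own,
/// much like f(x) standing for f(x.a, x.b).
pub fn span_len(Span { start, end }: Span) -> i32 {
    len(start, end)
}
```

At the call site the fields are matched by name, so
`span_len(Span { end: 7, start: 2 })` and
`span_len(Span { start: 2, end: 7 })` denote the same call even though
the in-memory layout of `Span` is order-sensitive.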

You could then explain {x:1, y:2} as syntactic sugar for, say,
record(x:1, y:2), where `record` is a polytypic primitive that builds
a "data record" from some "native record".

(You could do the same for tuples and handle mixed named/unnamed
parameters by adopting the convention existing in some languages, for
example Oz, that a tuple (x,y,z) is just a record of numeric fields
(0:x, 1:y, 2:z))
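
As an illustration of tuples read as records with numeric fields,
later Rust syntax accepts record-style construction and patterns on
positional structs, using the field index as the label. `Triple`,
`build` and `middle` are hypothetical names for this sketch:

```rust
/// A positional ("tuple") struct; its fields are named 0, 1 and 2.
#[derive(Debug, PartialEq)]
pub struct Triple(pub i32, pub i32, pub i32);

pub fn build() -> Triple {
    // Record-style construction with numeric labels, given in any order.
    Triple { 2: 30, 0: 10, 1: 20 }
}

pub fn middle(t: &Triple) -> i32 {
    // Record-style pattern: pick field 1 by label, ignore the others.
    let Triple { 1: y, .. } = t;
    *y
}
```

Here `build()` yields the same value as `Triple(10, 20, 30)`, so the
positional and the labelled views describe one and the same structure.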

>  - We'd have to decide the rules surrounding keyword mismatches, partial
>    provision of keywords, argument permutation, and function types.

Re. function types: if you consider those parameter-passing structures
as "first class" (which does not necessarily mean that they are
convenient to use; for example, if they're not addressable they will be
less flexible), the natural choice is to have a family of types for
them. Those types could come with restrictions and an unspoken kinding
discipline, so that for example they cannot be used to instantiate
type variables, maybe cannot be nested, etc.

That's the main reason why I think one should think of such structures
as real structures rather than syntactic sugar; it forces you to have
a proper design for types and other aspects.


> > another thing is that instead of passing arguments, you pass just one
> > (anonymous) record. the record is the arguments.
>
> We actually had quite an argument with one of the Felix authors about
> this. This was not, back then, a terribly realistic option during that
> conversation (argument modes were still the primary way we were doing
> safe references, which are not first class types). But it's conceivably
> something we could look into if we get the argument-passing logic down
> to "always by-value and use region pointers for safe references" (which
> is where we're going). There remain some hitches:
>
>  - Our syntax isn't quite compatible with the idea; records have to be
>    brace-parenthesized and tuples given round parentheses. They'd need
>    reform, and the syntax is already pretty crowded.
>
>  - We'd have to decide the rules surrounding keyword mismatches, partial
>    provision of keywords, argument permutation, and function types.
>
>  - Our records are order-sensitive, to be C-structure compatible.
>    Keyword arguments are usually argued-for (as you are doing here) as
>    a way to make function arguments order-insensitive. We'd need to
>    decide whether we wanted order to matter or not.
>
>  - Argument-passing tends to work best when you can pass stuff in
>    registers, not require the arguments to be spilled to memory and
>    addressable as a contiguous slab. So we'd want to be careful not to
>    require the "arguments structure" to be _actually_ addressable as a
>    structure at any point. Rather, calling f(x) would be some kind of
>    semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for
>    the sake of passing.
>
> So .. I can see a possibility here, but it'd be a complicated set of
> issues to work through. Would need some serious design work. I've never
> been intrinsically opposed to it, just felt that we were constrained by
> other choices in the language. At the time, argument modes were
> completely prohibitive; now it might be possible, but is still not
> entirely straightforward.
>
> -Graydon

