[rust-dev] sub-grammar for range pattern constants?
graydon at mozilla.com
Tue Apr 30 11:38:49 PDT 2013
On 30/04/2013 10:08 AM, Max Cantor wrote:
> I know this will be an unpopular opinion, but pure functions would be a
> massive win for Rust, especially if the eventual goal is high
> performance, highly parallelizable (browser rendering engines..)
Careful. It's important to understand that "purity" seems like it has a
simple definition but in languages with mutable memory, state, time and
IO, it gets hard to be exact.
Things you can throw at a SIMD unit or GPU and get parallel kernels out
of them will almost certainly be a different version of "pure" than
things you can evaluate at compile time. Or, as in our previous attempts
at defining it, things that won't break existing borrows or existing
typestates. Each of these is a static approximation of the
set-of-all-things-a-function-might-do. Since our functions can generally
do quite a lot, the set of possible subsets you might mean by "pure" is
correspondingly much larger.
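A minimal sketch of two of those notions pulling apart, written in present-day Rust rather than the 2013 language under discussion (`const fn` postdates this thread, and the function names are made up for illustration). Both functions return the same value for the same arguments, but only one of them is "pure" in the compile-time-evaluable sense:

```rust
// Hypothetical examples, assuming a current stable Rust compiler.

// Pure in the strict sense: the compiler can evaluate this itself.
const fn add(a: u32, b: u32) -> u32 {
    a + b
}

// Same inputs always give the same result, but it performs IO on the way,
// so it is "pure" only under a looser, result-only definition.
fn add_logged(a: u32, b: u32) -> u32 {
    println!("adding {} and {}", a, b);
    a + b
}

// Computed at compile time. Using `add_logged` here would be rejected.
const SUM: u32 = add(2, 3);

fn main() {
    assert_eq!(SUM, 5);
    assert_eq!(add_logged(2, 3), SUM);
}
```

Which of the two counts as pure depends entirely on what you wanted purity for.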
> The typestate system did seem very complex but isn't there a middle
> ground via annotations perhaps? A subset of primitives and core
> functions can be annotated as pure and then any function calling only
> pure functions can itself be annotated as pure.
This gets difficult fast. You wind up dividing your functions up into
groups and then getting annoyed that one that's "mostly almost pure" or
"essentially pure, for my purposes" that you wanted to call actually
isn't (someone forgot to mark it as such, or some sub-function, or some
trait-implied function) and then you can't. Or is pure using one way of
thinking about purity, but not another. Or is pure except for the bit
where it calls unsafe but promises it's going to maintain purity, just
knows better than you (oops, that can't be done at compile time, nor on
a GPU, etc.)
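The "someone forgot to mark it" problem can be sketched with present-day Rust's `const fn`, used here only as a stand-in for a purity annotation (it is not the annotation scheme proposed above, and the helper names are hypothetical). A marked function may only call other marked functions, so one unmarked helper blocks everything above it:

```rust
// Hypothetical helpers, assuming a current stable Rust compiler.

const fn square(x: u64) -> u64 {
    x * x
}

// Does nothing impure, but its author never marked it `const`.
fn cube(x: u64) -> u64 {
    x * x * x
}

// A `const fn` may only call other `const fn`s.
const fn sum_of_powers(x: u64) -> u64 {
    // Writing `square(x) + cube(x)` here is rejected by the compiler,
    // even though `cube` is just as side-effect free as `square`.
    square(x) + x
}

const AT_COMPILE_TIME: u64 = sum_of_powers(4);

fn main() {
    assert_eq!(AT_COMPILE_TIME, 20);
    // Outside the annotated world, the unmarked function is callable as usual.
    assert_eq!(sum_of_powers(4) + cube(4), 84);
}
```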
C++ has multiple concepts for this, each a not-entirely-obvious subset
of the others, each affecting the others, and causing quite a lot of
work just to get things to compile, much less reuse code.
They have const methods (don't mutate the object, unless you lie and
override it) and constexpr (can be evaluated at compile time), and
macros (can only operate on tokens), and template evaluation (can only
operate on certain non-type args), and the OpenCL __kernel extension for
"Which purity do you mean" is a very real question, not one you can just
brush aside. The combinations are worse, in that they tend to cause (or
require) subtyping relationships, that touch _everything_ in the
language, from inference and mandatory syntax (which types get inferred
when you just write a lambda?) to type checking (which calls are legal,
which impls and delegations are legal) to codegen (which LLVM attributes
are legal? which things can we inline and how?)
> pure functions that cannot be expressed this way, but using annotations
> and percolating effect tracking (similar to mutable fields in structs)
> seems like it shouldn't be too complex.
"It shouldn't be too complex" is sadly optimistic, and your analogy is
telling: we removed mutable fields in structs as well. They added too
much cognitive complexity when combined with inherited mutability via
mutable owners (which people expect to "override" field mutability --
and rightly so).
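A small sketch of inherited mutability, in present-day Rust syntax (the `Point` type and `nudge` function are invented for illustration). The fields carry no mutability annotation of their own; whether they can be written is decided entirely by whoever owns the value:

```rust
// Hypothetical type, assuming a current stable Rust compiler.
struct Point {
    x: i32,
    y: i32,
}

// Taking ownership as `mut` makes every field writable inside the function.
fn nudge(mut p: Point) -> Point {
    p.x += 1;
    p
}

fn main() {
    let fixed = Point { x: 1, y: 2 };
    // fixed.x = 10; // rejected: the owner `fixed` is not declared `mut`

    let mut movable = Point { x: 1, y: 2 };
    movable.x = 10; // accepted: mutability is inherited from the owner
    movable.y += 1;
    assert_eq!((movable.x, movable.y), (10, 3));

    // Moving the immutable value into a `mut` owner makes it writable.
    let moved = nudge(fixed);
    assert_eq!((moved.x, moved.y), (2, 2));
}
```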
A long time ago we had an effect system and we made pure the default
(since we didn't want people accidentally leaving it out due to sloth)
and we made the impure specifier a very small and reasonable keyword:
"io". It was still a heavy complexity bill (required a full extra
dimension of subtyping, parametricity, etc.) and _still_ had people
breaking the rule with `unsafe`, which meant that the putative benefits
like "can do compile time evaluation" or "can spread on a GPU" weren't
there anyways. And people couldn't do simple things like "put a printf
in here for logging" (much like in Haskell).
Eventually people just took to making everything io, at which point it
was a noise word and we decided to remove it (along with 'pred', which
just meant pure, bool, and tracked by the typestate layer).
> Coming from the Haskell world, not having pure functions would be a
> considerable deficit in the language that is so close to being a best of
> both worlds descendant of C and Haskell.
The "direct descendant" aspect here is probably inaccurate. We're more
of a descendant of ML with its mutable ref cells, eager evaluation and
such. Haskell was willing to force its users to segregate all variants
of impurity outside the most conservative intersection into monad
representation. But this costs heavily in all code that doesn't fit
that intersection, which is a lot of systems code; so we've decided
against it. The split is too much for C programmers to accept; anywhere
we try to draw the line appears to cause a lot of anger and resentment
over mandatory type-system fighting.
This is why they could "implement STM in a weekend" -- by excluding
almost all functions -- but I think this characterization is really
unfair anyway. What they really have is just "do notation", which means
constructing a suspended execution-tree is _slightly_ less miserable
than a deeply nested tree of constructor calls and lambdas. But not
really a lot. The interface-points with the rest of the language involve
pervasive lifting, lowering, wrapping and unwrapping. See the "simple
STM example" on their website: