[rust-dev] sub-grammar for range pattern constants?

Graydon Hoare graydon at mozilla.com
Tue Apr 30 11:38:49 PDT 2013


On 30/04/2013 10:08 AM, Max Cantor wrote:

> I know this will be an unpopular opinion, but pure functions would be a
> massive win for Rust, especially if the eventual goal is high
> performance, highly parallelizable (browser rendering engines..)
> development.

Careful. "Purity" seems like it has a simple definition, but in
languages with mutable memory, state, time and IO, it gets hard to be
exact.

Things you can throw at a SIMD unit or GPU and get parallel kernels out 
of them will almost certainly be a different version of "pure" than 
things you can evaluate at compile time. Or, as in our previous attempts 
at defining it, things that won't break existing borrows or existing 
typestates. Each of these is a static approximation of the 
set-of-all-things-a-function-might-do. Since our functions can generally
do quite a lot, the set of possible subsets you might mean by "pure" is
correspondingly large.
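
As a rough illustration of two of these notions pulling apart, here is a
minimal sketch in present-day Rust. Everything in it is an assumption
made for illustration: `const fn` and `Cell` belong to later Rust, not
to the language as it stood when this message was written, and the names
`square`, `NINE` and `Counter` are hypothetical.

```rust
use std::cell::Cell;

// "Pure" in the compile-time sense: the compiler can evaluate this
// while building the program.
const fn square(x: u32) -> u32 {
    x * x
}

// Computed during compilation, not at run time.
const NINE: u32 = square(3);

// Hypothetical type, for illustration only.
struct Counter {
    reads: Cell<u32>,
    value: u32,
}

impl Counter {
    // Takes `&self`, so calling it is compatible with other shared
    // borrows of the counter. It still changes hidden state on every
    // call, so it is not "pure" in the compile-time sense above.
    fn get(&self) -> u32 {
        self.reads.set(self.reads.get() + 1);
        self.value
    }
}
```

Both `square` and `Counter::get` could reasonably be called "pure" by
someone, but only under different definitions, and neither definition
implies the other.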

> The typestate system did seem very complex but isn't there a middle
> ground via annotations perhaps?  A subset of primitives and core
> functions can be annotated as pure and then any function calling only
> pure functions can itself be annotated as pure.

This gets difficult fast. You wind up dividing your functions up into
groups and then getting annoyed that one you wanted to call, which is
"mostly almost pure" or "essentially pure, for my purposes", actually
isn't marked pure (someone forgot to mark it as such, or some
sub-function, or some trait-implied function), and then you can't call
it. Or it is pure under one way of thinking about purity, but not
another. Or it is pure except for the bit where it calls unsafe but
promises to maintain purity because it knows better than you (oops,
that can't be done at compile time, nor on a GPU, etc.).
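
The "someone forgot to mark it" problem can be sketched with the `const`
annotation of present-day Rust, used here only as a stand-in for a
purity annotation. This is an assumption for illustration, not the
annotation scheme discussed in this thread, and all function names are
hypothetical.

```rust
// Marked as compile-time evaluable.
const fn double(x: u32) -> u32 {
    x * 2
}

// Just as side-effect-free, but nobody marked it `const`.
fn triple(x: u32) -> u32 {
    x * 3
}

// Fine: an annotated function calling only annotated functions.
const fn double_twice(x: u32) -> u32 {
    double(double(x))
}

// Rejected by the compiler if uncommented, because `triple` lacks
// the annotation even though it does nothing impure:
//
// const fn double_then_triple(x: u32) -> u32 {
//     triple(double(x))
// }

// The caller has to give up its own annotation instead, and so does
// everything that calls it.
fn double_then_triple(x: u32) -> u32 {
    triple(double(x))
}

const EIGHT: u32 = double_twice(2);
```

The annotation has to be present on every function in the call chain,
so one missing marker on a leaf function propagates upward to all of
its callers.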

C++ has multiple concepts for this, each a not-entirely-obvious subset 
of the others, each affecting the others, and causing quite a lot of 
work just to get things to compile, much less reuse code.

They have const methods (don't mutate the object, unless you lie and 
override it) and constexpr (can be evaluated at compile time), and 
macros (can only operate on tokens), and template evaluation (can only 
operate on certain non-type args), and the OpenCL __kernel extension for
GPU-applicable functions:

http://en.cppreference.com/w/cpp/language/constexpr
http://www.cplusplus.com/doc/tutorial/templates/
http://en.wikipedia.org/wiki/Const-correctness
http://en.wikipedia.org/wiki/C_preprocessor
http://en.wikipedia.org/wiki/OpenCL

"Which purity do you mean" is a very real question, not one you can just 
brush aside. The combinations are worse, in that they tend to cause (or
require) subtyping relationships that touch _everything_ in the
language, from inference and mandatory syntax (which types get inferred 
when you just write a lambda?) to type checking (which calls are legal, 
which impls and delegations are legal) to codegen (which LLVM attributes 
are legal? which things can we inline and how?)

> pure functions that cannot be expressed this way, but using annotations
> and percolating effect tracking (similar to mutable fields in structs)
> seems like it shouldn't be too complex.

"It shouldn't be too complex" is sadly optimistic, and your analogy is 
telling: we removed mutable fields in structs as well. They added too 
much cognitive complexity when combined with inherited mutability via 
mutable owners (which people expect to "override" field mutability -- 
and rightly so).
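
For readers unfamiliar with inherited mutability, a minimal sketch in
present-day Rust syntax follows. The `Point` type and the `shifted`
function are hypothetical, and the syntax is assumed to differ in detail
from the Rust of this message's date.

```rust
// Hypothetical struct for illustration. Fields carry no mutability
// marker of their own.
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn shifted(p: Point) -> Point {
    // Mutability is inherited from the owner: because the binding `q`
    // is declared `mut`, all of its fields become writable.
    let mut q = p;
    q.x += 1;
    q
}

// With an immutable owner the same assignment is rejected:
//
// fn broken() {
//     let p = Point { x: 0, y: 0 };
//     p.x = 1; // compile error: `p` is not declared `mut`
// }
```

Whether a field can be written is decided entirely by how the owning
value is bound, which is the expectation that per-field mutability
markers conflicted with.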

A long time ago we had an effect system and we made pure the default 
(since we didn't want people accidentally leaving it out due to sloth) 
and we made the impure specifier a very small and reasonable keyword: 
"io". It was still a heavy complexity bill (required a full extra 
dimension of subtyping, parametricity, etc.) and _still_ had people 
breaking the rule with `unsafe`, which meant that the putative benefits 
like "can do compile time evaluation" or "can spread on a GPU" weren't 
there anyways. And people couldn't do simple things like "put a printf 
in here for logging" (much like in Haskell).

Eventually people just took to making everything io, at which point it 
was a noise word and we decided to remove it (along with 'pred', which 
just meant pure, bool, and tracked by the typestate layer).

> Coming from the Haskell world, not having pure functions would be a
> considerable deficit in the language that is so close to being a best of
> both worlds descendant of C and Haskell.

The "direct descendant" aspect here is probably inaccurate. We're more 
of a descendant of ML with its mutable ref cells, eager evaluation and 
such. Haskell was willing to force its users to segregate all variants 
of impurity outside the most conservative intersection into monad 
representation[1]. But this costs heavily in all code that doesn't fit 
that intersection, which is a lot of systems code; so we've decided 
against it. The split is too much for C programmers to accept; anywhere 
we try to draw the line appears to cause a lot of anger and resentment 
over mandatory type-system fighting.

-Graydon


[1] This is why they could "implement STM in a weekend" -- by excluding 
almost all functions -- but I think this characterization is really 
unfair anyway. What they really have is just "do notation", which means 
constructing a suspended execution-tree is _slightly_ less miserable 
than a deeply nested tree of constructor calls and lambdas. But not 
really a lot. The interface-points with the rest of the language involve 
pervasive lifting, lowering, wrapping and unwrapping. See the "simple 
STM example" on their website: 
http://www.haskell.org/haskellwiki/Simple_STM_example


