[rust-dev] condition handling

Graydon Hoare graydon at mozilla.com
Fri Oct 19 16:48:37 PDT 2012


Hi,

If you're going to respond to this email, please read it in full, don't
just skim.

I've been sketching a module in core::condition this week that
implements a condition-handling system on top of the TLS feature
(task::local_data) that Ben Blum added over the summer.

This email is a very specific description of what I'm adding (many
people have asked) along with a very specific poll about how best to
structure the outermost part of the API.

I'd appreciate replies here be kept focused on those points --
clarifying questions + indication of preference on API -- as there are a
lot of new people on the list since we last discussed this, but we are
_not_ changing plan here. This is the same plan we've had for 6-ish
years, off-and-on, sketched numerous times, whenever it seems
implementable with the pieces lying around. I'm not interested in having
a big discussion of all the possible ways condition-systems can be
built, or (re-)exploring the design-space.

Some references to the lurking plan here:

https://mail.mozilla.org/pipermail/rust-dev/2012-May/001879.html
https://mail.mozilla.org/pipermail/rust-dev/2011-November/000999.html
https://github.com/mozilla/rust/issues/1945#issuecomment-4425610

Anyway, here is the system (again) in a nutshell:

  - Tasks have a task-local storage system (TLS). TLS maps keys to
    values that stick to the task until released. The keys are
    addresses, conventionally static addresses of items (this is a
    hack for producing 'unique' keys, but it works).

  - Handlers are stack-closures that get pushed and popped from TLS.
    They obey stack discipline and can capture their lexical environment
    since they'll be popped before the pushing frame exits.

  - Handlers are put in TLS keyed to a condition. A condition is a
    static, symbolic value used mostly for its TLS-keying and
    method-dispatching capabilities. Conditions don't store values,
    they just name situations that might occur and give-type-to any
    values passed-to and returned-from potential handlers of such
    situations.

  - Code can raise a condition whenever it likes. No static checking
    of which conditions might occur where; no relationship to function
    types. Raising of a condition in a module is indicated by the
    condition being present in the module, by convention, or by
    documentation.

  - Raising a condition can pass a value giving further details on
    the condition, beyond just its type/name. If the raise is handled,
    a value may return. The types of values raised-to and returned-from
    handlers are specified by the condition. That is, a condition is
    declared like "const OutOfKittens : Condition<T,U> = ..." where
    T is the type of values passed to an OutOfKittens handler, and
    U is the type of values returned from an OutOfKittens handler when
    it is returning a solution to the condition. U might well be a
    symbolic type such as an enum of reasonable recovery strategies.

  - There is no change to the fail/unwind system. Failure is still
    unrecoverable, results in unwinding, and is the only possible
    cause of unwinding. If a condition is handled, it doesn't unwind.
    If it's unhandled, or if the handler fails, the task fails and
    unwinds. This is not an unwind-rewind system or an unwind-catch
    system.

  - Condition.raise is a normal function and does something very simple:
    - look in TLS to see if there's a handler
    - if so, call it and return the result to the raiser
    - if not, fail

  - This means condition-handling happens _at site_ of raising. If
    the handler returns a useful value, processing continues as if
    nothing went wrong. It's _just_ a rendezvous mechanism for an
    outer frame to dynamically provide a handler closure to an inner
    frame where a condition occurs.
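
To make the raise path above concrete, here is a minimal sketch in
present-day Rust. It is not the core::condition code: it assumes a
thread-local stack standing in for the task-local slot, fixes T to a
u32 "shortfall" and U to a made-up Recovery enum, and requires handlers
to own whatever they capture (the real design lets stack closures borrow
their environment). Every name in it is hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical recovery strategies: the `U` of the condition.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Recovery {
    UseAardvarksInstead,
    WaitForKittens,
}

/// A handler receives the raised detail (the `T`, here a shortfall count).
type Handler = Rc<dyn Fn(u32) -> Recovery>;

thread_local! {
    // Assumption: one handler stack per thread plays the role of the
    // TLS entry keyed to the OutOfKittens condition.
    static OUT_OF_KITTENS: RefCell<Vec<Handler>> = RefCell::new(Vec::new());
}

/// Raise: look for a handler, call it at the raise site, else fail.
pub fn raise_out_of_kittens(shortfall: u32) -> Recovery {
    // Clone the innermost handler out so the RefCell borrow has ended
    // before the handler runs (a handler may itself raise).
    let innermost = OUT_OF_KITTENS.with(|stack| stack.borrow().last().cloned());
    match innermost {
        Some(handler) => handler(shortfall),
        None => panic!("unhandled condition: OutOfKittens({})", shortfall),
    }
}

/// Push a handler, run the body, pop the handler (stack discipline).
/// If the body panics the pop is skipped; in this sketch a panic is
/// treated as unrecoverable, so nothing observes the stale entry.
pub fn with_handler<R>(handler: Handler, body: impl FnOnce() -> R) -> R {
    OUT_OF_KITTENS.with(|stack| stack.borrow_mut().push(handler));
    let result = body();
    OUT_OF_KITTENS.with(|stack| {
        stack.borrow_mut().pop();
    });
    result
}

/// Nested handlers: the innermost one answers, and the outer one
/// answers again once the inner has been popped.
pub fn demo() -> (Recovery, Recovery) {
    let outer: Handler = Rc::new(|_shortfall: u32| Recovery::WaitForKittens);
    let inner: Handler = Rc::new(|_shortfall: u32| Recovery::UseAardvarksInstead);
    with_handler(outer, || {
        let nested = with_handler(inner, || raise_out_of_kittens(3));
        let after_pop = raise_out_of_kittens(3);
        (nested, after_pop)
    })
}
```

Nothing unwinds on the handled path: raise_out_of_kittens is an ordinary
call that returns the handler's answer to the raiser.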

Given that description, astute readers may note that there's not much to
this "system", mostly just convention and existing pieces. This is
intentional. It's supposed to be very minimal and only took about a day
to get working, beyond the miserable slog Ben already fought through
implementing TLS over the summer.

Nonetheless I want to get it _right_ since this is probably going to be
a convention that sticks and gets used extensively across our libraries.
Getting it nailed down is a priority for the 0.5 cycle, as we're trying
to knock out the top few remaining toe-stubs when writing libraries, and
this is one of them. We've grown an unfortunate dependence on using the
core::Result type for managing recoverable-problems in library code and
I would like to move away from that in as many cases as practical, since
at-site handling tends to be much cleaner for users. Since 'Result' is
so clunky we wind up using unrecoverable 'fail' in a lot of cases too,
sometimes in cases where having the option to recover would make sense.

So: API survey. Modulo a few residual bugs on consts and function
regions (currently hacked around in the implementation), I have 3
different APIs that all seem to work -- which I've given different names
for discussion's sake -- and I'm trying to decide between them. They
mostly differ in the number of pieces and the location and shape of the
boilerplate a user of the condition system has to write. My current
preference is #3 but I'd like a show of hands on others' preferences.

Here they are:

1. "Protect/handle", a.k.a. "make it look like try/catch": this version
pursues the arrangement with protected-code coming first and handlers
coming second. To do this, the protected-code has to be suspended in a
closure, which is then passed to a temporary struct, which is
responsible for both capturing the remaining handler-part and firing the
push/run/pop cycle together at the end. Uses of it look like this:

  let b = do OutOfKittens.protect {
      do_some_stuff();
      that_might_raise();
      out_of_kittens();
  };

  do b.handle |t| {
      UseAardvarksInstead
  }

It's not really an ideal use of our 'do' syntax, nor are many other
variants of this API I've tried. It might read slightly better (or
worse) as such:

  (do OutOfKittens.protect {
      do_some_stuff();
      that_might_raise();
      out_of_kittens();
  }).handle(|t|
      UseAardvarksInstead
  );
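
For readers who want to see the "temporary struct" spelled out, the
following is a rough sketch of the protect/handle shape in present-day
Rust, not the actual implementation. The thread-local stack, the
Recovery enum, the u32 payload and the free functions protect and raise
are all assumptions made for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical recovery strategy type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Recovery {
    UseAardvarksInstead,
}

type Handler = Rc<dyn Fn(u32) -> Recovery>;

thread_local! {
    // Assumption: a per-thread handler stack stands in for task-local data.
    static HANDLERS: RefCell<Vec<Handler>> = RefCell::new(Vec::new());
}

/// Call the innermost handler, or fail if there is none.
pub fn raise(shortfall: u32) -> Recovery {
    let innermost = HANDLERS.with(|stack| stack.borrow().last().cloned());
    match innermost {
        Some(handler) => handler(shortfall),
        None => panic!("unhandled condition: OutOfKittens({})", shortfall),
    }
}

/// The temporary value: it only holds the suspended protected code.
pub struct Protected<F> {
    body: F,
}

/// Step one: suspend the protected code in a closure. Nothing runs yet.
pub fn protect<F>(body: F) -> Protected<F> {
    Protected { body }
}

impl<F> Protected<F> {
    /// Step two: capture the handler and fire push/run/pop together.
    pub fn handle<R>(self, handler: impl Fn(u32) -> Recovery + 'static) -> R
    where
        F: FnOnce() -> R,
    {
        HANDLERS.with(|stack| stack.borrow_mut().push(Rc::new(handler)));
        let result = (self.body)();
        HANDLERS.with(|stack| {
            stack.borrow_mut().pop();
        });
        result
    }
}

/// Protected code first, handler second, as in the examples above.
pub fn demo() -> Recovery {
    protect(|| raise(3)).handle(|_shortfall| Recovery::UseAardvarksInstead)
}
```

One consequence visible in the sketch: if the caller never calls handle,
the protected code silently never runs.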


2. "Guard", a.k.a. "make it look like RAII": this version has only a
single "floating in space" handler-block that is assigned to a temporary
guard object in the current scope, and all subsequent code in that scope
is protected. Reads like this:

  {
      let _g = OutOfKittens.guard(|t| UseAardvarksInstead);
      do_some_stuff();
      that_might_raise();
      out_of_kittens();
  }

Downsides of this mechanism are that a user might try to move _g out of
the current frame or otherwise tamper with it -- we might need to do
some tricks with the region system to prevent that -- and it gives rise
to this somewhat artificial _g variable and otherwise unclear extra
block to contain it.
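
A guard of this kind can be sketched in present-day Rust with a
destructor that pops the handler. This is only an illustration under
assumptions: the thread-local stack, the Recovery enum, the u32 payload
and the function names are hypothetical, and handlers must own what
they capture.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical recovery strategy type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Recovery {
    UseAardvarksInstead,
}

type Handler = Rc<dyn Fn(u32) -> Recovery>;

thread_local! {
    // Assumption: a per-thread handler stack stands in for task-local data.
    static HANDLERS: RefCell<Vec<Handler>> = RefCell::new(Vec::new());
}

/// Call the innermost handler, or fail if there is none.
pub fn raise(shortfall: u32) -> Recovery {
    let innermost = HANDLERS.with(|stack| stack.borrow().last().cloned());
    match innermost {
        Some(handler) => handler(shortfall),
        None => panic!("unhandled condition: OutOfKittens({})", shortfall),
    }
}

/// Keeps its handler installed for as long as the value is alive.
pub struct Guard {
    _private: (),
}

/// Push the handler now; the returned guard pops it when dropped.
pub fn guard(handler: impl Fn(u32) -> Recovery + 'static) -> Guard {
    HANDLERS.with(|stack| stack.borrow_mut().push(Rc::new(handler)));
    Guard { _private: () }
}

impl Drop for Guard {
    fn drop(&mut self) {
        HANDLERS.with(|stack| {
            stack.borrow_mut().pop();
        });
    }
}

/// Returns the handler's answer and the stack depth after the scope ends.
pub fn demo() -> (Recovery, usize) {
    let recovery = {
        let _g = guard(|_shortfall| Recovery::UseAardvarksInstead);
        raise(3)
    }; // _g is dropped here, which pops the handler
    let depth = HANDLERS.with(|stack| stack.borrow().len());
    (recovery, depth)
}
```

The tampering concern shows up directly in the sketch: nothing stops a
caller from returning the Guard to an outer frame or passing it to
std::mem::forget, either of which breaks the stack discipline. In
present-day Rust, writing `let _ = guard(...)` instead of `let _g` drops
the guard immediately, so the named variable cannot be avoided.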

3. "Trap/in", a.k.a. "the system I like the look of best": this is the
version Patrick suggested yesterday. It sets up the handlers in the head
of a do-block and then invokes the protected code. It's order-inverted
from protect/handle -- the handlers come before the protected code --
but it still reads ok to my eyes, and seems to play nicely with our
existing syntax:

  do OutOfKittens.trap(|t| UseAardvarksInstead).in {
      do_some_stuff();
      that_might_raise();
      out_of_kittens();
  }
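
For comparison, a rough sketch of the trap/in shape in present-day Rust
follows. It is an illustration under assumptions, not the actual
implementation: the thread-local stack, the Recovery enum and the u32
payload are hypothetical, and the method is spelled in_ because `in` is
a reserved word in present-day Rust.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical recovery strategy type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Recovery {
    UseAardvarksInstead,
}

type Handler = Rc<dyn Fn(u32) -> Recovery>;

thread_local! {
    // Assumption: a per-thread handler stack stands in for task-local data.
    static HANDLERS: RefCell<Vec<Handler>> = RefCell::new(Vec::new());
}

/// Call the innermost handler, or fail if there is none.
pub fn raise(shortfall: u32) -> Recovery {
    let innermost = HANDLERS.with(|stack| stack.borrow().last().cloned());
    match innermost {
        Some(handler) => handler(shortfall),
        None => panic!("unhandled condition: OutOfKittens({})", shortfall),
    }
}

/// Holds the handler until the protected code is supplied.
pub struct Trap {
    handler: Handler,
}

/// Head of the block: record the handler. Nothing is installed yet.
pub fn trap(handler: impl Fn(u32) -> Recovery + 'static) -> Trap {
    Trap { handler: Rc::new(handler) }
}

impl Trap {
    /// Body of the block: push the handler, run the code, pop the handler.
    pub fn in_<R>(self, body: impl FnOnce() -> R) -> R {
        HANDLERS.with(|stack| stack.borrow_mut().push(self.handler));
        let result = body();
        HANDLERS.with(|stack| {
            stack.borrow_mut().pop();
        });
        result
    }
}

/// Handler first, protected code second.
pub fn demo() -> Recovery {
    trap(|_shortfall| Recovery::UseAardvarksInstead).in_(|| raise(3))
}
```

Because in_ consumes the Trap and does the push and pop itself, there is
no guard value left over for a caller to move or leak.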

Opinions? Clarifying questions?

Thanks,

-Graydon

