[rust-dev] Scheduler and I/O work items for the summer
banderson at mozilla.com
Thu May 30 19:42:00 PDT 2013
Allow me to present the status of our ongoing quest to rewrite the task
scheduler, along with the major work items remaining. The results so far
are encouraging but there is a very large amount of work left,
particularly regarding I/O. In addition to myself we'll have two interns
working on these areas this summer, but we could use more help still.
This is an especially good opportunity to influence the way I/O works in
Rust. I'm hoping that we will cut over to the new scheduler in June, but
expect that crucial I/O-related work will continue for most of the year.
At the moment we have a multithreaded task scheduler that integrates
non-blocking TCP built on top of libuv. So far it uses a very basic
scheduling strategy that employs several contended locks, but most of
the components of the full algorithm are in place, just waiting to be
filled in. It is expected that once we're done the entire scheduler will
be lock-free. Besides the aforementioned locking it also allocates far
too much at the moment but that's not a limitation of the design. As far
as the scheduler goes I have not run into any major surprises and still
expect it to be significantly more efficient than the current one. The
biggest concern about scheduling is that our requirements force our
scheduler to do more synchronization than specified by the work stealing
algorithm alone. Whereas the literature describes work stealing as only
synchronizing on the work stealing deque, we also have message passing
between schedulers and a mechanism to put individual schedulers that
can't find work to sleep and wake them later, both of which require
further synchronization. This expense can be mitigated in some cases,
but not all.
As I've been working on the scheduler I have begun separating the task
and its services from the coroutine task scheduler with the intent that
we can have Rust tasks that are not green threads but instead regular
threads with no userspace scheduling overhead at all. This has ripple
effects throughout the standard library, particularly with the
concurrency primitives, and I don't expect this to reach feature parity
with green thread tasks for a long time, but removing the green thread
requirement allows us to make a stronger case still for being a true
Most of my work on the I/O stack has been in specifying the main I/O
traits and building up the multi-layer interface between the
public-facing I/O library and libuv. I think I've sufficiently proven
the strategy of using the scheduler to convert async I/O to sync I/O,
but there's a whole lot more to implement and there are a number of
outstanding design problems to solve. We've previously discussed here
how I/O should do [error handling]. The feedback in that thread was
great, but it is not yet reflected in the current implementation. I have
though introduced a `read_error` condition specifically for the `read`
method and all extensions that build upon it, but it is not fleshed out.
What worries me the most about the entire endeavour is 'select'. We have
great need for some facility to wait on multiple types of events
(particularly I/O and ports) simultaneously, but the requirements can be
rather complex (detailed later). I am not sure that the old unix
'select' function (as we used in pipes) is the best abstraction for this
and feel we need to do further research on this topic. I would like to
start prototyping something here soon.
I've previously done two experiments with microbenchmarks of TCP [read
performance] and single-threaded [scheduling performance] and claimed
that the results were encouraging. Of course things will change a lot as
we implement multi-threading and move on to better benchmarks. I'm
maintaining a selection of comparative [benchmarks] in an external repo
that are currently a bit out of date.
I don't know that I recommend using the new scheduler yet for purposes
other than scheduler development, but it can be turned on by setting the
RUST_NEWRT environment variable. At the moment this will set up a
single-threaded scheduler only but I'll soon convert this to a
multi-threaded scheduler. For simple programs you shouldn't see any
difference in execution, but some library features are still busted.
Last I checked 95% of the run-pass tests succeeded with RUST_NEWRT set.
The [main issue] for the entire scheduler rewrite is #4419. Within that
one there is a description of the design and links to other related topics.
[main issue]: https://github.com/mozilla/rust/issues/4419
The remainder of this email describes the most significant remaining
## Add remaining implementations of I/O traits
core::rt::io defines several traits for synchronous I/O, including
Reader and Writer. We have a non-blocking TCP implementation in
core::rt::io::net::tcp but that's it. We need non-blocking
implementations for files, UDP, unix pipes, then also blocking
implementations of the same, based not on uv, but on plain file
descriptors and sockets.
## Design string encoding and decoding for Reader/Writer traits
How do we deal with string encoding of I/O? The existing implementation
uses extension methods on Readers and Writers, but this is not
sufficient because it doesn't maintain any state. Need a better
understanding of the requirements here, but these might involve new
## Design and implement some solution for select / async events
We need a way to efficiently wait on multiple types of events at once,
including port receives, I/O reads, socket accepts, timers. This has
some very complicated requirements to satisfy, as detailed in the linked
issue, and I'm not sure what the right abstractions are here. This is
super important and the biggest risk to the whole effort. If anybody has
opinions about this topic I would love to hear them.
## Make I/O threadsafe
I/O types must perform I/O on the scheduler on which they were created,
but they are also sendable. This means that when we perform I/O we must
check that we are on the correct scheduler, and if not then reschedule
the running task. This complexity also infects 'select' and could
conceivably lead to some untenable situations at runtime that can do
nothing but `fail!`.
Need to create non-blocking access to the global resources
stdin/stdout/stderr. Currently I'm thinking these will be Readers and
Writers backed by ports, with some protocol for obtaining exclusive access.
## Port existing core::io users to core::rt::io::native
In preparation for removing core::io we need to start porting existing
users to the blocking implementations (which don't yet exist) of the new
I/O API. This will involve identifying and porting missing features and
completing various other I/O tasks.
## Lock free data structures
There are several concurrent data structures used in the scheduler that
are currently implemented with locks and need to be reimplemented
without because they are heavily contended. The easiest of these are the
MessageQueue and the SleeperList. MessageQueue is a multiple-producer,
single-consumer unbounded queue used for sending messages between
schedulers. SleeperList is a multiple-producer, multiple-consumer
bounded stack used to track which schedulers are 'asleep'.
## Work stealing
Multithreading is currently not implemented using work stealing, but
instead using a shared work queue. Adding work stealing will require
converting WorkQueue to a deque and adding the 'thief' phase of the work
stealing algorithm to the scheduler. Locating work queues to steal from
will involve creating further lock-free data structures. Some ideas are
outlined on the issue tracker.
James Miller has an implementation of a lock free deque that we can use
for this. Multiple people have interest in this topic so let's make sure
## Implement stack growth
We need to make the new tasks support segmented stacks. For the most
part this will involve copying lots of fiddly bits from the previous
implementation, but I want to make the caching story in this
implementation simpler, with each scheduler having a single stack pool,
instead of having some of the stacks originate in the scheduler and some
in the task. This will be easier once fast_ffi is finished, but it will
likely require adding a new attribute to LLVM to suppress the segmented
stack function prologue.
## Remove the old scheduler
We can probably get this done relatively soon. There are some features
not implemented yet and some unimplemented features that we can just
drop, at least temporarily (pipes select). This can be done even before
finishing I/O, since the blocking core::io will continue working fine.
## Implement a simple HTTP client/server library
I really want to be able to demonstrate a fast and convenient HTTP library.
If you've read this far then thanks for your time. I'm giving you a
virtual high five!
More information about the Rust-dev