[rust-dev] Scheduler and I/O work items for the summer

Brian Anderson banderson at mozilla.com
Thu May 30 19:42:00 PDT 2013

Hi Rusties,

Allow me to present the status of our ongoing quest to rewrite the task 
scheduler, along with the major work items remaining. The results so far 
are encouraging but there is a very large amount of work left, 
particularly regarding I/O. In addition to myself we'll have two interns 
working on these areas this summer, but we could use more help still. 
This is an especially good opportunity to influence the way I/O works in 
Rust. I'm hoping that we will cut over to the new scheduler in June, but 
expect that crucial I/O-related work will continue for most of the year.

At the moment we have a multithreaded task scheduler that integrates 
non-blocking TCP built on top of libuv. So far it uses a very basic 
scheduling strategy that employs several contended locks, but most of 
the components of the full algorithm are in place, just waiting to be 
filled in. It is expected that once we're done the entire scheduler will 
be lock-free. Besides the aforementioned locking it also allocates far 
too much at the moment but that's not a limitation of the design. As far 
as the scheduler goes I have not run into any major surprises and still 
expect it to be significantly more efficient than the current one. The 
biggest concern about scheduling is that our requirements force our 
scheduler to do more synchronization than specified by the work stealing 
algorithm alone. Whereas the literature describes work stealing as only 
synchronizing on the work stealing deque, we also have message passing 
between schedulers and a mechanism to put individual schedulers that 
can't find work to sleep and wake them later, both of which require 
further synchronization. This expense can be mitigated in some cases, 
but not all.

As I've been working on the scheduler I have begun separating the task 
and its services from the coroutine task scheduler with the intent that 
we can have Rust tasks that are not green threads but instead regular 
threads with no userspace scheduling overhead at all. This has ripple 
effects throughout the standard library, particularly with the 
concurrency primitives, and I don't expect this to reach feature parity 
with green thread tasks for a long time, but removing the green thread 
requirement allows us to make a stronger case still for being a true 
'systems language'.

Most of my work on the I/O stack has been in specifying the main I/O 
traits and building up the multi-layer interface between the 
public-facing I/O library and libuv. I think I've sufficiently proven 
the strategy of using the scheduler to convert async I/O to sync I/O, 
but there's a whole lot more to implement and there are a number of 
outstanding design problems to solve. We've previously discussed here 
how I/O should do [error handling]. The feedback in that thread was 
great, but it is not yet reflected in the current implementation. I have 
though introduced a `read_error` condition specifically for the `read` 
method and all extensions that build upon it, but it is not fleshed out.

What worries me the most about the entire endeavour is 'select'. We have 
great need for some facility to wait on multiple types of events 
(particularly I/O and ports) simultaneously, but the requirements can be 
rather complex (detailed later). I am not sure that the old unix 
'select' function (as we used in pipes) is the best abstraction for this 
and feel we need to do further research on this topic. I would like to 
start prototyping something here soon.

I've previously done two experiments with microbenchmarks of TCP [read 
performance] and single-threaded [scheduling performance] and claimed 
that the results were encouraging. Of course things will change a lot as 
we implement multi-threading and move on to better benchmarks. I'm 
maintaining a selection of comparative [benchmarks] in an external repo 
that are currently a bit out of date.

I don't know that I recommend using the new scheduler yet for purposes 
other than scheduler development, but it can be turned on by setting the 
RUST_NEWRT environment variable. At the moment this will set up a 
single-threaded scheduler only but I'll soon convert this to a 
multi-threaded scheduler. For simple programs you shouldn't see any 
difference in execution, but some library features are still busted. 
Last I checked 95% of the run-pass tests succeeded with RUST_NEWRT set.

The [main issue] for the entire scheduler rewrite is #4419. Within that 
one there is a description of the design and links to other related topics.

[error handling]: 
[read performance]: 
[scheduling performance]: 
[benchmarks]: https://github.com/brson/rust-sched-bench
[main issue]: https://github.com/mozilla/rust/issues/4419

The remainder of this email describes the most significant remaining 
work items.

## Add remaining implementations of I/O traits

core::rt::io defines several traits for synchronous I/O, including 
Reader and Writer. We have a non-blocking TCP implementation in 
core::rt::io::net::tcp but that's it. We need non-blocking 
implementations for files, UDP, unix pipes, then also blocking 
implementations of the same, based not on uv, but on plain file 
descriptors and sockets.


## Design string encoding and decoding for Reader/Writer traits

How do we deal with string encoding of I/O? The existing implementation 
uses extension methods on Readers and Writers, but this is not 
sufficient because it doesn't maintain any state. Need a better 
understanding of the requirements here, but these might involve new 
decorator types.


## Design and implement some solution for select / async events

We need a way to efficiently wait on multiple types of events at once, 
including port receives, I/O reads, socket accepts, timers. This has 
some very complicated requirements to satisfy, as detailed in the linked 
issue, and I'm not sure what the right abstractions are here. This is 
super important and the biggest risk to the whole effort. If anybody has 
opinions about this topic I would love to hear them.


## Make I/O threadsafe

I/O types must perform I/O on the scheduler on which they were created, 
but they are also sendable. This means that when we perform I/O we must 
check that we are on the correct scheduler, and if not then reschedule 
the running task. This complexity also infects 'select' and could 
conceivably lead to some untenable situations at runtime that can do 
nothing but `fail!`.


## stdin/out/err

Need to create non-blocking access to the global resources 
stdin/stdout/stderr. Currently I'm thinking these will be Readers and 
Writers backed by ports, with some protocol for obtaining exclusive access.


## Port existing core::io users to core::rt::io::native

In preparation for removing core::io we need to start porting existing 
users to the blocking implementations (which don't yet exist) of the new 
I/O API. This will involve identifying and porting missing features and 
completing various other I/O tasks.


## Lock free data structures

There are several concurrent data structures used in the scheduler that 
are currently implemented with locks and need to be reimplemented 
without because they are heavily contended. The easiest of these are the 
MessageQueue and the SleeperList. MessageQueue is a multiple-producer, 
single-consumer unbounded queue used for sending messages between 
schedulers. SleeperList is a multiple-producer, multiple-consumer 
bounded stack used to track which schedulers are 'asleep'.


## Work stealing

Multithreading is currently not implemented using work stealing, but 
instead using a shared work queue. Adding work stealing will require 
converting WorkQueue to a deque and adding the 'thief' phase of the work 
stealing algorithm to the scheduler. Locating work queues to steal from 
will involve creating further lock-free data structures. Some ideas are 
outlined on the issue tracker.

James Miller has an implementation of a lock free deque that we can use 
for this. Multiple people have interest in this topic so let's make sure 
we coordinate.


## Implement stack growth

We need to make the new tasks support segmented stacks. For the most 
part this will involve copying lots of fiddly bits from the previous 
implementation, but I want to make the caching story in this 
implementation simpler, with each scheduler having a single stack pool, 
instead of having some of the stacks originate in the scheduler and some 
in the task. This will be easier once fast_ffi is finished, but it will 
likely require adding a new attribute to LLVM to suppress the segmented 
stack function prologue.


## Remove the old scheduler

We can probably get this done relatively soon. There are some features 
not implemented yet and some unimplemented features that we can just 
drop, at least temporarily (pipes select). This can be done even before 
finishing I/O, since the blocking core::io will continue working fine.


## Implement a simple HTTP client/server library

I really want to be able to demonstrate a fast and convenient HTTP library.


If you've read this far then thanks for your time. I'm giving you a 
virtual high five!


More information about the Rust-dev mailing list