[rust-dev] Concurrency and synchronous blocking

Brian Anderson banderson at mozilla.com
Fri Jul 13 15:34:30 PDT 2012


On 07/13/2012 11:58 AM, David Bruant wrote:
> On 11/07/2012 03:05, Brian Anderson wrote:
>> On 07/10/2012 12:13 PM, David Bruant wrote:
>>> (...)
>> Tasks do have a memory cost, but it is theoretically quite low. Rust
>> tasks on linux currently have somewhere around 4K of overhead, and
>> that's around 3K more than we would like and think is possible. Most
>> of that is dedicated to the call stack, and there are a lot of
>> potential future optimizations to minimize the cost of creating a task
>> (e.g. by reusing them).
> I haven't implemented a JavaScript event loop, but for comparison,
> that's more or less a list where each element is a function with some
> arguments. Very likely less than 1k (but certainly with other downsides).
>
>> The intent is that you should not have to think about whether using a
>> task is too expensive, because it is cheap.
>> I'm not familiar with how the JavaScript event loop works but I
>> imagine that it has similar responsibilities to the Rust scheduler.
> The idea is that a JavaScript "processing unit" (sorry, I don't know the
> correct term for this) has a stack and a message queue (the list I
> mentioned above). Each message is a function and some arguments. The
> function is called, and when that call (and the nested ones) completes,
> the next message is processed. There is no preemption (there are some
> weird not-very-standard cases in browsers that break that rule).
> The message queue would be the equivalent of the Rust scheduler. I
> realized after posting my message that goroutines and Rust tasks could
> actually be implemented with a message queue (I'm not asking for that, I
> just realized it), with the difference that if some code gets stuck in
> an infinite loop, everything else is blocked, and that may not be a good
> thing for a systems language. I'm not sure. Maybe "some" preemption and
> the ability to kill a message (or the message being processed) could
> compensate.

To be clear, Rust's scheduling is cooperative, so a badly-behaving Rust 
task can block and prevent others from making progress. This is mostly 
an issue to be aware of for native bindings.
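
The event loop David describes is easy to sketch. Here is a schematic 
version in Rust (illustrative only -- this is not the actual Rust 
scheduler, and the syntax is approximate):

     use std::collections::VecDeque;

     // A message queue of closures, run to completion with no preemption.
     fn main() {
         let mut queue: VecDeque<Box<dyn FnOnce()>> = VecDeque::new();
         queue.push_back(Box::new(|| println!("message 1")));
         queue.push_back(Box::new(|| println!("message 2")));
         while let Some(msg) = queue.pop_front() {
             // If this call never returns, every queued message after it
             // is starved -- the same hazard as a misbehaving cooperative
             // task.
             msg();
         }
     }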

>
>> Instead of calling callbacks in response to events, though, the Rust
>> scheduler resumes execution of tasks.
> I've been thinking about this particular point and about my experience
> with node.js.
> Node.js has been criticized a lot for not enabling parallelism by
> default. By default, your code runs in one system process (it's possible
> that some functions use several system threads under the hood, but you
> can't do that in JavaScript code), but you can fork/spawn other
> processes. There is also a way to know the number of CPUs in your
> machine: http://nodejs.org/api/os.html#os_os_cpus
> My experience is that it feels very right to spawn as many processes as
> you have CPUs (in case your program has a need for such a thing). In
> that case, you're not leaving some CPUs unused, but at the same time you
> don't have a lot of processes/threads that your hardware can't run and
> that cost memory and scheduling time.
> From what I understand, this level of control cannot be achieved with
> goroutines and Rust tasks. The only primitive means "create a new
> concurrency unit and let the system figure it out". The downside of that
> is that the Rust runtime needs to create a lot of stacks and do some
> scheduling itself. It sounds like it costs more than what can be done in
> the Node.js model.
> I don't have the perfect solution here, but there is certainly a middle
> ground to be found.

Rust does offer some more control than that. The runtime supports 
running multiple schedulers simultaneously, particularly so that code 
that wants to truly block an OS thread (like libuv) can live in its own 
world.

The scheduler that the main task (and any child tasks) runs in has by 
default the same number of threads as there are cores available. For 
more control you can create your own schedulers:

     let cores = get_num_cores(); // This function doesn't exist yet

     for cores.times {
         do spawn_sched(single_threaded) {
             run_program()
         }
     }

If run_program never spawns another task then each of these tasks will 
effectively have its own OS thread. They will occasionally hit yield 
checks and context switch back to the scheduler momentarily, but that 
context switch could conceivably be optimized out if there is only one 
task on the entire scheduler.

>
>
>>
>>> (...)
>>>
>>> So I have several questions regarding Rust:
>>> * Is synchronous blocking possible?
>>
>> I don't understand the term 'synchronous blocking' (as opposed to just
>> 'blocking').
> It's the same thing. I use both terms interchangeably. Sorry for the
> confusion.
>
>> Receiving a value from a Rust port does block a task. Sending on a
>> channel does not (whereas Go channels do block on send). We consider
>> channels to be asynchronous, based on the sending behavior (vs. Go's
>> synchronous channels).
>>
>>> * How does Rust deal with concurrent tasks synchronization?
>>
>> Channels are the primary synchronization primitive in Rust.
>>
>>> * How would you write the above example in Rust?
>>
>> I would basically write it like the Go example. If it didn't have to
>> also wait for the timeout then I would instead use a vector of futures.
> I would use promises (equivalent of futures) as well. Is there a
> future/promise library in Rust?

core::future exists, but could be better. In particular, futures are not 
sendable types, which severely limits how they can be composed.
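
For a sense of its current shape, usage looks roughly like this (a 
sketch only; the exact names may differ as the module evolves, and 
expensive_computation is a stand-in for whatever you want to run):

     // Sketch of core::future usage; API approximate.
     let f = future::spawn(|| expensive_computation());
     // ... do other work on this task ...
     let result = f.get(); // blocks this task until the value is ready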

>
>> Here is that Go code translated to current Rust.
>>
>> (...)
> Thanks for this example :-)
>
>>> * Do you think it's satisfying in terms of expressiveness?
>> No, but not for the reasons you suggest. Rust's split between ports
>> and channels causes a lot of boilerplate, and the lack of an N-ary
>> select function or control structure is a big omission. Rust's
>> libraries in general need to be designed better.
> Is that opinion shared by the Rust community? How would you move
> forward from that situation?

I think that opinion is shared, but most Rust developers are waiting 
until the language settles down before focusing on libraries.

As to the ergonomics of channels in Rust, I'm not sure what the solution 
is yet, but the existing channel implementation will likely be going 
away entirely. Eric Holk is working on a new primitive communication 
type called a 'pipe' that only does 1:1 communication, and can only send 
a single message. This alone is much more difficult to use than the 
current channels, but is also much faster.
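
Usage will be along these lines (purely a sketch -- the names and 
shapes here are invented, since the API is still being designed):

     // Hypothetical one-shot pipe: each endpoint is single-use.
     let (port, chan) = oneshot();
     do task::spawn {
         send_one(chan, 42); // consumes chan; a second send can't compile
     }
     let n = recv_one(port); // consumes port; blocks until a message lands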

On top of that he is building channel contracts that define a protocol 
between two pipe endpoints that is enforced by the type system. Channel 
contracts additionally allow bounded protocols (ones that don't just 
send forever without receiving) to be implemented with fixed size 
buffers so that sending a message is never forced to allocate. There 
will be default implementations of common protocols.
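
To give the flavor (in invented notation -- the real syntax is not 
settled), a contract for a strict ping/pong exchange might look like:

     // Each state names the messages that may be sent from it and the
     // state the protocol transitions to afterward. The compiler would
     // reject an endpoint that, say, sends two pings in a row.
     proto! ping_pong {
         ping: send {
             ping -> pong
         }
         pong: recv {
             pong -> ping
         }
     }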

On top of all that I'm hoping that we define a new channel type that 
does N:M communication, is sendable, and doesn't require the 
port/channel distinction (as in Go, you can send and receive on the same 
object). For simple things you would probably just use a channel, but
for performance you would use pipe protocols.
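
Something with roughly this shape, in other words (entirely 
speculative):

     // Speculative N:M channel: one sendable value used for both
     // sending and receiving, with no separate port type.
     let ch = channel::<int>();
     let ch2 = ch.clone(); // sendable; hand copies to as many tasks as needed
     do task::spawn {
         ch2.send(1);
     }
     let x = ch.recv(); // receive on the same object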

-Brian

