[rust-dev] net::tcp::TcpSocket slow?

Patrick Walton pwalton at mozilla.com
Thu Dec 20 20:17:42 PST 2012

I just profiled this. Some thoughts:

On 12/20/12 9:12 PM, Brian Anderson wrote:
> First, stack switching. Switching between Rust and C code has bad
> performance due to bad branch prediction. Some workloads can spend
> 10% of their time stalling in the stack switch.

This didn't seem too high, actually. It should only be ~20,000 stack 
switches (read and write) if we solve the following issue:

> Second, working with uv involves sending a bunch of little work units to
> a dedicated uv task. This is because callbacks from uv into Rust *must
> not fail* or the runtime will crash. Where typical uv code runs directly
> in the event callbacks, Rust dispatches most or all of that work to
> other tasks. This imposes significant context switching and locking
> overhead.

This is actually the problem. If you're using a nonblocking I/O library 
(libuv) for a fundamentally blocking workload (sending lots of requests 
to redis and blocking on the response for each one), *and* you're 
multiplexing userland green threads on top of it, then you're going to 
get significantly worse performance than you would if you had used a 
blocking I/O setup. We can make some of the performance differential up 
by switching uv over to pipes, and maybe we can play dirty tricks like 
having the main thread spin on the read lock so that we don't have to 
fall into the scheduler to punt it awake, but I still don't see any way 
we will make up the 10x performance difference for this particular use 
case without a fundamental change to the architecture. Work stealing 
doesn't seem to be a viable solution here since the uv task really needs 
to be one-task-per-thread.
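
To make the cost concrete, here is a rough sketch in present-day Rust,
not the runtime's actual code. It assumes std threads and channels as
stand-ins for tasks and pipes, and `handle`, `run_direct` and
`run_dispatched` are hypothetical names. The first function does each
request/response round trip in the caller; the second ships every
request to a dedicated I/O thread as a small work unit and blocks on
the reply, which is roughly the shape of the uv-task dispatch described
above.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one request/response round trip.
fn handle(req: u32) -> u32 {
    req + 1
}

/// Blocking style: the caller performs each round trip itself.
fn run_direct(n: u32) -> u32 {
    let mut acc = 0;
    for _ in 0..n {
        acc = handle(acc);
    }
    acc
}

/// Dispatch style: every request is sent to a dedicated I/O thread,
/// and the caller blocks until that thread sends the reply back.
fn run_dispatched(n: u32) -> u32 {
    let (req_tx, req_rx) = mpsc::channel::<(u32, mpsc::Sender<u32>)>();

    // The dedicated thread serves requests until the sender is dropped.
    let io_thread = thread::spawn(move || {
        for (req, reply_tx) in req_rx {
            // Ignore a send error: it only means the caller went away.
            let _ = reply_tx.send(handle(req));
        }
    });

    let mut acc = 0;
    for _ in 0..n {
        // One fresh reply channel per request, plus two thread hand-offs.
        let (reply_tx, reply_rx) = mpsc::channel();
        req_tx.send((acc, reply_tx)).unwrap();
        acc = reply_rx.recv().unwrap();
    }

    drop(req_tx); // closes the request channel so the I/O thread exits
    io_thread.join().unwrap();
    acc
}
```

Both functions compute the same result for a given `n`; the second
simply pays for a channel send, a wakeup, and a reply on every single
request, which is where a strictly serial workload loses its time.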

Maybe the best thing is just to make the choice of nonblocking versus 
blocking I/O a choice that tasks can make on an individual basis. It's a 
footgun to be sure; if you use blocking I/O you run the risk of starving 
other tasks on the same scheduler to death, so perhaps we should 
restrict this mode to schedulers with 1:1 scheduling. But this would be 
in line with the general principle that we've been following that the 
choice of 1:1 and M:N scheduling should be left to the user, because 
there are performance advantages and disadvantages to each mode.
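
As a minimal sketch of what that restriction could look like (all
names here are hypothetical, written in present-day Rust, and not an
existing or proposed API), a scheduler could reject the one combination
that risks starvation when a task is spawned:

```rust
/// Hypothetical scheduling modes: one OS thread per task, or many
/// tasks multiplexed onto a pool of threads.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SchedMode {
    OneToOne,
    ManyToN,
}

/// Hypothetical per-task I/O mode.
#[derive(Clone, Copy, Debug, PartialEq)]
enum IoMode {
    Blocking,
    NonBlocking,
}

/// Allow blocking I/O only where a blocked task cannot starve
/// other tasks sharing the same scheduler thread.
fn check_io_mode(sched: SchedMode, io: IoMode) -> Result<(), String> {
    match (sched, io) {
        (SchedMode::ManyToN, IoMode::Blocking) => Err(String::from(
            "blocking I/O would stall other tasks on an M:N scheduler",
        )),
        _ => Ok(()),
    }
}
```

Under this assumption, a task on a 1:1 scheduler may pick either mode,
while a task on an M:N scheduler is limited to nonblocking I/O.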

Once this sort of switch is implemented, I would expect the performance 
differential between Ruby and Rust to be much smaller.

