[rust-dev] Segmented stacks (was: IsRustSlimYet (IsRustFastYet v2))

Patrick Walton pwalton at mozilla.com
Thu Jul 4 15:33:03 PDT 2013

On 7/4/13 12:58 PM, Daniel Micay wrote:
> You can create many threads with fixed stacks, they just start off
> using 4K instead of however much smaller our segmented stacks will be.
> A scheduler will just be more expensive than a regular lightweight
> task.
> The 15-100% performance hit from segmented stacks pushes Rust well out
> of the systems language niche. I think it does have to change if Rust
> plans on ever fitting in the niche that C, C++ and D do.

I agree. The sole benefit of segmented stacks on modern OS's that lazily 
allocate pages is that, on 32-bit, you can avoid running out of address 
space with many tasks. This is counterbalanced by these disadvantages:

1. There is no way for the compiler or runtime to know ahead of time how 
much stack any given task will need, because this is based on dynamic 
control flow.

2. The consequence of overshooting (choosing a stack size that is too 
big) is that the benefit above is reduced.

3. The consequence of undershooting (choosing a stack size that is too 
small) is disastrous performance. In the limit, the performance degrades 
to something like what many Schemes and SML/NJ do, in that stack frames 
are malloc'd from the heap. Except that Scheme and SML/NJ have precise 
generational garbage collectors with bump allocators in the nursery, and 
we have C malloc(). Furthermore, stack segment allocation is the slow 
path in C malloc, because it's in a high storage class. So performance 
becomes abysmal in the slow path. Unlike systems like Erlang and Cilk, 
there is no way to relocate stack segments in Rust because of unmanaged 
interior pointers: Erlang could at least in theory correct its mistakes 
and keep stacks contiguous (although I don't know if the implementation 
does). So the best we can do is cache and hope for the best--but too 
much caching increases memory usage and decreases the benefits of stack 

4. The benefit above is significantly reduced when calling into C code, 
and all solutions to this either hurt the benefit more or significantly 
penalize the FFI.

I think that segmented stacks just don't work. *Relocatable* stacks may 
work, but not in Rust. From what I have read, Walter Bright and Rob Pike 

At this point I'd like to suggest just allowing the user to choose a 
stack size on a per-task basis, and failing if the stack size is 
exceeded. Basically `__morestack` would turn into `fail`.

Brian has pointed out to me that, currently, running out of stack has to 
abort the whole process, because the DWARF unwinder doesn't consider 
`__morestack` a "may-throw" position and as a result the arguments would 
leak. There are a number of ways we could fix this, ranging from 
principled to hacky. But for now I think aborting on stack exhaustion 
wouldn't be the end of the world (although others may disagree).


More information about the Rust-dev mailing list