[rust-dev] Segmented stacks (was: IsRustSlimYet (IsRustFastYet v2))
pwalton at mozilla.com
Thu Jul 4 15:33:03 PDT 2013
On 7/4/13 12:58 PM, Daniel Micay wrote:
> You can create many threads with fixed stacks, they just start off
> using 4K instead of however much smaller our segmented stacks will be.
> A scheduler will just be more expensive than a regular lightweight
> The 15-100% performance hit from segmented stacks pushes Rust well out
> of the systems language niche. I think it does have to change if Rust
> plans on ever fitting in the niche that C, C++ and D do.
I agree. The sole benefit of segmented stacks on modern OSes that lazily
allocate pages is that, on 32-bit, you can avoid running out of address
space with many tasks. This is counterbalanced by these disadvantages:
1. There is no way for the compiler or runtime to know ahead of time how
much stack any given task will need, because this is based on dynamic
2. The consequence of overshooting (choosing a stack size that is too
big) is that the benefit above is reduced.
3. The consequence of undershooting (choosing a stack size that is too
small) is disastrous performance. In the limit, the performance degrades
to something like what many Schemes and SML/NJ do, in that stack frames
are malloc'd from the heap. Except that Scheme and SML/NJ have precise
generational garbage collectors with bump allocators in the nursery, and
we have C malloc(). Furthermore, stack segment allocation is the slow
path in C malloc, because it's in a high storage class. So performance
becomes abysmal in the slow path. Unlike systems like Erlang and Cilk,
there is no way to relocate stack segments in Rust because of unmanaged
interior pointers: Erlang could at least in theory correct its mistakes
and keep stacks contiguous (although I don't know if the implementation
does). So the best we can do is cache and hope for the best--but too
much caching increases memory usage and decreases the benefits of stack
4. The benefit above is significantly reduced when calling into C code,
and all solutions to this either hurt the benefit more or significantly
penalize the FFI.
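To put rough numbers on the 32-bit address-space benefit mentioned above, here is a back-of-the-envelope sketch. The figures are assumptions for illustration only and do not come from this thread: about 3 GiB of usable user address space on 32-bit, about 128 TiB on 64-bit, and a fixed 2 MiB reservation per task. `max_tasks` is a hypothetical helper.

```rust
// Hypothetical helper: how many fixed-size stack reservations fit in a
// given amount of address space (ignoring everything else mapped there).
fn max_tasks(address_space_bytes: u64, stack_reservation_bytes: u64) -> u64 {
    address_space_bytes / stack_reservation_bytes
}

const MIB: u64 = 1 << 20;
const GIB: u64 = 1 << 30;

fn main() {
    // Assumed: ~3 GiB of user address space on a 32-bit OS.
    let usable_32 = 3 * GIB;
    // Assumed: ~128 TiB (2^47 bytes) of user address space on a 64-bit OS.
    let usable_64 = 1u64 << 47;
    // Assumed: each task reserves 2 MiB of contiguous stack.
    let reservation = 2 * MIB;

    println!("32-bit: {} tasks", max_tasks(usable_32, reservation));
    println!("64-bit: {} tasks", max_tasks(usable_64, reservation));
}
```

Under these assumptions the 32-bit limit is on the order of a thousand tasks, while the 64-bit limit is in the tens of millions. With lazily allocated pages, only the address space is reserved up front, not the physical memory.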
I think that segmented stacks just don't work. *Relocatable* stacks may
work, but not in Rust. From what I have read, Walter Bright and Rob Pike
At this point I'd like to suggest just allowing the user to choose a
stack size on a per-task basis, and failing if the stack size is
exceeded. Basically `__morestack` would turn into `fail`.
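As a minimal sketch of what choosing a stack size per task can look like, the following uses present-day Rust's `std::thread::Builder::stack_size`, not the 2013 task API under discussion here. `run_with_stack` and `depth` are hypothetical names introduced only for this example.

```rust
use std::thread;

// Hypothetical helper: run `work` on a new thread whose stack size is
// chosen explicitly by the caller, then return its result.
fn run_with_stack<F, T>(stack_bytes: usize, work: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes) // fixed size; the stack never grows
        .spawn(work)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

// Simple non-tail recursion so that stack use grows with `n`.
fn depth(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + depth(n - 1)
    }
}

fn main() {
    // 8 MiB is an arbitrary illustrative choice, not a recommendation.
    let d = run_with_stack(8 * 1024 * 1024, || depth(10_000));
    println!("recursion depth reached: {}", d);
}
```

In this model the caller picks the size up front, and recursion that exceeds it is a fatal error rather than a trigger for allocating another segment.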
Brian has pointed out to me that, currently, running out of stack has to
abort the whole process, because the DWARF unwinder doesn't consider
`__morestack` a "may-throw" position and as a result the arguments would
leak. There are a number of ways we could fix this, ranging from
principled to hacky. But for now I think aborting on stack exhaustion
wouldn't be the end of the world (although others may disagree).