[rust-dev] Idea: Memcpyed stacks for green tasks

Gábor Lehel glaebhoerl at gmail.com
Sun Jan 26 05:02:39 PST 2014


On Sat, Jan 25, 2014 at 9:26 PM, Vadim <vadimcn at gmail.com> wrote:

> Hi Bill,
> If I understand this right, what you are proposing is the approach of
> Python's greenlets library, which I already tried to pitch here:
> <http://mail.mozilla.org/pipermail/rust-dev/2013-November/006544.html>.
>
>
> I think that the biggest problem with this approach, in its raw form, is
> that greenlets are bound to a specific OS thread, which would cause all
> sorts of headaches.  For example:
> - It would be difficult to balance greenlet tasks between several OS
> threads,
> - If async I/O requests for two greenlets, bound to the same OS thread,
> were to complete simultaneously, you would not be able to resume them in
> parallel,
> - If a thread were to become blocked for a significant amount of time, all
> the greenlets bound to it would become un-resumable,
> and so on...
>
> Greenlets would be much more palatable if their stacks were
> position-independent.  This might be possible to achieve with a suitable
> LLVM transform <http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-January/069662.html>,
> though I am not yet sure how portable this would be.
>

Totally not an expert in this area, but I'm wondering two things:

 - Once you hoist all variables whose address is taken out into this
secondary stack frame, then if you have a pointer on the stack to an object
on the stack, and that pointer *also*, in turn, has its address taken, I
believe you would end up with a pointer in the secondary stack frame to an
object in the secondary stack frame, i.e. the secondary stack frame itself
would not be position independent (see the first sketch below). Does the
secondary stack frame always stay in place, so that we don't really care
about this?

 - Instead of heap-allocating a secondary stack frame, could another
potential solution be to reuse infrastructure similar to DWARF-based
unwinding and/or precise GC tracing of the stack? In other words, emit a
side table recording, for each stack frame, the positions of slots that hold
pointers into the stack, and then, when relocating the stack, consult this
side table and manually fix up all of the affected pointers (see the second
sketch below). In this case, aside from the presence of the side tables,
the cost is borne only when actually relocating the stack.
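
To make the first question concrete, here is a minimal sketch; the comments
describe what a hypothetical hoisting transform would do with each variable
(this is illustrative only, not an existing pass):

    // Both `x` and `p` have their addresses taken, so a transform that
    // hoists address-taken locals would move both into the heap-allocated
    // secondary frame. The hoisted `p` then stores an absolute address
    // *into that same secondary frame*, so the secondary frame is only
    // safe if it never moves.
    fn example() {
        let x: i32 = 42;
        let p: &i32 = &x;    // address of `x` taken: `x` gets hoisted
        let pp: &&i32 = &p;  // address of `p` taken too: `p` gets hoisted,
                             // and it holds a pointer into the hoisted frame
        println!("{}", **pp);
    }

And a rough sketch of the side-table idea from the second question. The
types and the relocation routine below are hypothetical stand-ins for what
the compiler and runtime would actually emit, much like precise-GC stack
maps:

    /// Per-function table: offsets (from the frame base) of slots that may
    /// contain pointers into the stack.
    struct FrameMap {
        frame_size: usize,               // used to walk from frame to frame
        stack_ptr_slots: &'static [usize],
    }

    /// After copying the stack bytes to `new_base`, patch every recorded
    /// slot whose old value pointed into [old_base, old_base + len).
    unsafe fn relocate_stack(
        new_base: *mut u8,
        old_base: usize,
        len: usize,
        frames: &[(usize /* frame offset */, &FrameMap)],
    ) {
        let delta = (new_base as usize).wrapping_sub(old_base);
        for &(frame_off, map) in frames {
            for &slot_off in map.stack_ptr_slots {
                let slot = new_base.add(frame_off + slot_off) as *mut usize;
                let val = *slot;
                if val >= old_base && val < old_base + len {
                    *slot = val.wrapping_add(delta);
                }
            }
        }
    }

Only the relocation path touches the tables; code that never blocks pays
nothing beyond carrying them.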



>
>
> cheers,
> Vadim
>
>
>
> On Sat, Jan 25, 2014 at 8:58 AM, Bill Myers <bill_myers at outlook.com> wrote:
>
>> Stack management for green tasks has been based in the past first on
>> segmented stacks and then on standard large stacks.
>>
>> However, I just realized that there is a third alternative which might
>> well be better than both of those.
>>
>> The idea is very simple: a green task would run on a large stack as it
>> does now, but when it is descheduled, a memory buffer the size of its used
>> stack space is allocated and its stack data is copied there; when it is
>> rescheduled, the stack data is copied back to the original address.
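
A minimal sketch of that copy-out/copy-in step (hypothetical names, not the
existing runtime API; assumes a downward-growing stack and that the task is
not running while its stack is saved):

    struct SavedStack {
        /// Address the bytes must be restored to before the task runs again.
        restore_at: *mut u8,
        /// The live stack contents, sized exactly to the used region.
        data: Vec<u8>,
    }

    /// Called when the task blocks: the live region is [sp, stack_hi).
    unsafe fn save_stack(stack_hi: *mut u8, sp: *mut u8) -> SavedStack {
        let used = stack_hi as usize - sp as usize;
        let mut data = Vec::with_capacity(used);
        std::ptr::copy_nonoverlapping(sp, data.as_mut_ptr(), used);
        data.set_len(used);
        SavedStack { restore_at: sp, data }
    }

    /// Called just before switching back to the task: restore the bytes to
    /// the same addresses, so pointers into the stack remain valid.
    unsafe fn restore_stack(saved: &SavedStack) {
        std::ptr::copy_nonoverlapping(saved.data.as_ptr(),
                                      saved.restore_at,
                                      saved.data.len());
    }
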
>>
>> The copying can be performed either by memcpy (perhaps with a specialized
>> memcpy that assumes alignment and always uses SIMD instructions to copy
>> with no checks), or perhaps by using a compression algorithm designed for
>> maximum speed, such as Snappy or maybe a new ad-hoc design.
>>
>> This gives the best possible memory efficiency and thus the maximum
>> possible number of tasks, even better than segmented stacks thanks to
>> precise sizing and optional compression, while adding no overhead to code
>> that does not block, either in terms of code generation complexity or
>> runtime cost.
>>
>> There is of course an additional runtime cost when blocking, but blocking
>> already often involves both system calls and context switching, and
>> workloads with lots of blocking green tasks are probably I/O bound anyway;
>> furthermore, stack data is probably in the CPU cache, so there shouldn't be
>> much penalty.
>>
>> The only significant drawback is that two green tasks running from the
>> same "large" stack (note that stacks cannot easily be switched, as that
>> would invalidate borrowed pointers on the stack that point into the stack)
>> cannot ever run concurrently; however, by making the number of large
>> stacks a large enough multiple of the number of CPU cores, the
>> probability of reduced parallelism due to this issue can be made as small
>> as desired.
>>
>> In general, one would use an adaptive approach, where this kind of
>> copying would start to kick in only once the number of green tasks becomes
>> large; when not copying, one would just optionally use
>> madvise(MADV_DONTNEED) to trim unused whole pages of the stack.
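
A sketch of that trimming step using the libc crate (Linux-specific; the
helper name and argument choices are hypothetical):

    /// Hand back to the kernel every whole page that lies entirely below
    /// the current stack pointer; the data there is dead, and the mapping
    /// itself stays in place so the task can grow back into it later.
    unsafe fn trim_dead_stack_pages(stack_lo: *mut u8, current_sp: *mut u8) {
        let page = libc::sysconf(libc::_SC_PAGESIZE) as usize;
        let first = stack_lo as usize;                   // assumed page-aligned
        let last = (current_sp as usize) & !(page - 1);  // round down to a page
        if last > first {
            libc::madvise(first as *mut libc::c_void,
                          last - first,
                          libc::MADV_DONTNEED);
        }
    }
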
>>
>> Also, if the data to be copied is large (several OS pages), one may use
>> mremap() or similar facilities to move the address space mappings instead
>> of the data itself.
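
For that large case, a rough sketch of the mremap() variant via the libc
crate (Linux-only; the helper name is hypothetical, and both addresses and
the length must be page-aligned):

    /// Move the stack's page mappings to `to` instead of copying the bytes;
    /// calling it again with the arguments swapped moves them back to the
    /// original address when the task is rescheduled.
    unsafe fn move_stack_pages(from: *mut u8, len: usize, to: *mut u8) -> *mut u8 {
        // MREMAP_FIXED unmaps whatever was at `to` and places the pages
        // there; the old range becomes unmapped until the pages are moved
        // back.
        let moved = libc::mremap(from as *mut libc::c_void,
                                 len,
                                 len,
                                 libc::MREMAP_MAYMOVE | libc::MREMAP_FIXED,
                                 to as *mut libc::c_void);
        assert!(moved != libc::MAP_FAILED);
        moved as *mut u8
    }
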
>>
>> Some unsafe code that assumes that tasks can access each other's stacks
>> will break (probably only the new Mutex implementation), but that can be
>> fixed by putting the data in the task structure instead of the stack.
>>
>> Overall, this should allow writing something like a parallel network
>> server or web crawler in Rust in a natural blocking style, while being
>> fully confident in its scalability, which is something that is not really
>> possible at the moment.
>>
>> Once this is in place, the best practices for Rust programs would become:
>> 1. Use native tasks for tasks whose number is bounded at compile time and
>> small
>> 2. Use green tasks for tasks whose number is unbounded or large
>>
>> One could even automate this to some extent: by default, spawn a native
>> task the first C times a given proc() code address is used to start a
>> task (where C = the number of CPU cores), and green tasks afterwards.
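
A toy sketch of that heuristic (entirely hypothetical names; the counter
would live in the scheduler, and spawn-native/spawn-green stand in for
whatever the runtime exposes):

    use std::collections::HashMap;

    /// How many times each proc() code address has been used to start a
    /// task so far.
    struct SpawnPolicy {
        cores: usize,
        seen: HashMap<usize, usize>, // code address -> spawn count
    }

    impl SpawnPolicy {
        /// The first `cores` spawns of a given code address go native;
        /// after that the address is assumed to be mass-produced and
        /// further spawns go green.
        fn should_go_native(&mut self, code_addr: usize) -> bool {
            let count = self.seen.entry(code_addr).or_insert(0);
            *count += 1;
            *count <= self.cores
        }
    }
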
>>
>> What do you think?