[rust-dev] Appeal for CORRECT, capable, future-proof math, pre-1.0
owen.shepherd at e43.eu
Sat Jan 11 08:54:41 PST 2014
On 11 January 2014 06:20, Daniel Micay <danielmicay at gmail.com> wrote:
> The branch on the overflow flag results in a very significant loss in
> performance. For example, I had to carefully write the vector `push`
> method for my `Vec<T>` type to only perform one overflow check. With
> two checks, it's over 5 times slower due to failed branch predictions.
What did the generated code look like? I suspect that LLVM wasn't
generating optimal code, perhaps because Rust wasn't giving it appropriate
hints or because of optimizer bugs. For reference, on AMD64 the code should
look something like the following hypothetical code:
MOV $SIZE, %eax
ADD $HEADER_SIZE, %rax
JC Lerror    // forward branch, taken only on overflow
MOV %rax, %rsi
// ... hot path continues here ...
Lerror:
// Code to raise error here
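For comparison, here is a minimal Rust sketch of the same kind of size computation using checked arithmetic. The names `alloc_size` and `HEADER_SIZE`, the 16-byte header, and the multiplication by an element size are assumptions made for illustration; they are not taken from any real `Vec<T>` implementation.

```rust
/// Hypothetical header size in bytes; stands in for HEADER_SIZE above.
const HEADER_SIZE: usize = 16;

/// Returns the total allocation size in bytes, or None if either the
/// multiplication or the addition overflows. The `?` returns None early,
/// so the success path is straight-line code.
fn alloc_size(len: usize, elem_size: usize) -> Option<usize> {
    len.checked_mul(elem_size)?.checked_add(HEADER_SIZE)
}

fn main() {
    // 4 elements of 8 bytes plus the 16-byte header.
    assert_eq!(alloc_size(4, 8), Some(48));
    // The multiplication overflows.
    assert_eq!(alloc_size(usize::MAX, 2), None);
    // The multiplication fits, but adding the header overflows.
    assert_eq!(alloc_size(usize::MAX - 8, 1), None);
    println!("{:?}", alloc_size(4, 8));
}
```

How the compiler lays out the two failure branches is up to the optimizer; the sketch only shows the checks themselves.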
Note that the ordering is EXTREMELY important! x86 doesn't give you any
separate branch hints (excluding two obsolete ones which only the Pentium
IV ever cared about) so your only clue to the branch predictor is the branch
direction.
I suspect your generated code had forward branches for the no overflow
case. That's absolutely no good (codegen inserting "islands" of failure case
code); it will screw up the branch predictor.
x86 defaults to predicting all (conditional) forward jumps not taken, all
conditional backwards jumps taken (Loops!). If the optimizer wasn't
informed correctly, it will probably not have obeyed that.
Since the overflow case should basically never be hit, there is no
reason for it to ever be loaded into the branch predictor, so that is good.
(P.S. If the rust compiler is really good it'll convince LLVM to put the
error case branch code in a separate section so it can all be packed
together far away from useful cache lines and TLB entries)
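One way to give the compiler that information in current Rust is to move the failure path into a separate function marked `#[cold]` and `#[inline(never)]`. This is a sketch under assumptions: `capacity_overflow` and `grow` are hypothetical names, and whether LLVM then lays the call out as a forward branch or places the function in a separate section depends on the target and compiler version.

```rust
/// Failure path kept out of line. `#[cold]` hints that this function is
/// rarely called; `#[inline(never)]` keeps its body out of the callers.
#[cold]
#[inline(never)]
fn capacity_overflow() -> ! {
    panic!("capacity overflow");
}

/// Hypothetical helper: doubles the capacity, with a minimum of 4.
/// The only overflow check diverts to the cold function above.
fn grow(cap: usize) -> usize {
    match cap.checked_mul(2) {
        Some(n) => n.max(4),
        None => capacity_overflow(),
    }
}

fn main() {
    assert_eq!(grow(0), 4);
    assert_eq!(grow(8), 16);
    println!("{}", grow(8));
}
```

The hot path then contains only the multiplication, the check, and a call that is expected to be almost never taken.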