[rust-dev] Appeal for CORRECT, capable, future-proof math, pre-1.0

Daniel Micay danielmicay at gmail.com
Sat Jan 11 11:33:02 PST 2014


On Sat, Jan 11, 2014 at 11:54 AM, Owen Shepherd <owen.shepherd at e43.eu> wrote:
> On 11 January 2014 06:20, Daniel Micay <danielmicay at gmail.com> wrote:
>>
>> The branch on the overflow flag results in a very significant loss in
>> performance. For example, I had to carefully write the vector `push`
>> method for my `Vec<T>` type to only perform one overflow check. With
>> two checks, it's over 5 times slower due to failed branch predictions.
>
>
> What did the generated code look like? I suspect that LLVM wasn't generating
> optimal code, perhaps because Rust wasn't giving it appropriate hints or
> because of optimizer bugs. For reference, on AMD64 the code should look
> something like the following hypothetical code:
>
> vec_allocate:
> MOV $SIZE, %eax
> MUL %rsi
> JC Lerror
> ADD $HEADER_SIZE, %rax
> JC Lerror
> MOV %rax, %rsi
> JMP malloc
> Lerror:
> // Code to raise error here
>
> Note that the ordering is EXTREMELY important! x86 doesn't give you any
> separate branch hints (excluding two obsolete ones which only the Pentium IV
> ever cared about) so your only clue to the optimizer is the branch
> direction.
>
> I suspect your generated code had forward branches for the no overflow case.
> That's absolutely no good (codegen inserting "islands" of failure case code);
> it will screw up the branch predictor.
>
> x86 defaults to predicting all (conditional) forward jumps not taken, all
> conditional backwards jumps taken (Loops!). If the optimizer wasn't informed
> correctly, it will probably not have obeyed that.
>
> Since the overflow case should basically never be hit, there is no reason
> for it to ever be loaded into the optimizer, so that is good.
>
> (P.S. If the rust compiler is really good it'll convince LLVM to put the
> error case branch code in a separate section so it can all be packed
> together far away from useful cache lines and TLB entries)

Rust directly exposes the checked overflow intrinsics, so these are
what were used. LLVM already considers branches calling a `noreturn`
function to be colder, so adding an explicit branch hint (which is
easy enough via `llvm.expect`) doesn't help. Feel free to implement it
yourself if you think you can do better. The compiler work is already
implemented. I doubt you'll get something performing in the same
ballpark as plain integers.
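
For anyone who wants to experiment, here is a minimal sketch of the size
computation from the quoted assembly, written against the checked arithmetic
methods in current Rust (`usize::checked_mul`, `usize::checked_add`). The
names `alloc_size`, `alloc_size_or_fail` and `capacity_overflow` are
hypothetical, chosen for illustration; this is not the code from the
`Vec<T>` mentioned above. The `#[cold]` attribute is an assumption about
what might help on top of the `noreturn` heuristic, and its effect on the
generated code is not measured here.

```rust
/// Hypothetical failure path. It never returns (`-> !`), so branches that
/// reach it are already treated as unlikely; `#[cold]` is an additional hint.
#[cold]
#[inline(never)]
fn capacity_overflow() -> ! {
    panic!("capacity overflow")
}

/// Bytes needed for `len` elements of `elem_size` bytes plus a header,
/// or `None` if either the multiply or the add overflows `usize`.
fn alloc_size(len: usize, elem_size: usize, header_size: usize) -> Option<usize> {
    len.checked_mul(elem_size)?.checked_add(header_size)
}

/// Same computation, but diverging into the cold failure path on overflow.
fn alloc_size_or_fail(len: usize, elem_size: usize, header_size: usize) -> usize {
    match alloc_size(len, elem_size, header_size) {
        Some(bytes) => bytes,
        None => capacity_overflow(),
    }
}
```

Calling `alloc_size_or_fail(len, size, header)` gives the byte count to pass
to the allocator, with the failure case kept out of line. This version still
performs two checks, like the assembly above; reducing it to one is the part
that needs care.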

