Efficient 64 bit arithmetic
Filip Pizlo
fpizlo at apple.com
Tue Jul 8 08:45:00 PDT 2014
I like this a lot.
It looks like it will have a fairly mellow performance cliff in cases where VM optimizations fail for whatever reason, since even in a lower-tier compiler incapable of sophisticated optimizations all you really have to do is make sure that these new math function calls bottom out in something efficient.
This would be even better if there was a story for div and mod. Is there one?
-Filip
> On Jul 8, 2014, at 7:59 AM, Fabrice Bellard <fabrice at bellard.org> wrote:
>
> Hi all,
>
> Regarding 64 bit arithmetic, there is a simpler solution compared to what was proposed in [1]. It is already possible to get the low 32 bit part of the 64 bit operations for addition, subtraction and multiplication. So it is enough to add a few more functions to get the high 32 bit part of 64 bit additions, subtractions and multiplications. Hence I propose to add the following functions:
>
> Math.iaddh(lo0, hi0, lo1, hi1) : return the high 32 bit part of the 64 bit addition of (hi0, lo0) and (hi1, lo1).
>
> Math.isubh(lo0, hi0, lo1, hi1) : return the high 32 bit part of the 64 bit subtraction of (hi0, lo0) and (hi1, lo1).
>
> Math.imulh(a, b) : return the high 32 bit part of the signed 64 bit product of the 32 bit numbers a and b.
>
> Math.imuluh(a, b) : return the high 32 bit part of the unsigned 64 bit product of the 32 bit numbers a and b.
>
> All these functions convert their argument to 32 bit integers. They return a signed 32 bit integer.
>
> With these functions, the 64 bit operations are easy to implement :
>
> - (hi_res, lo_res) = (hi0, lo0) + (hi1, lo1) (64 bit addition):
>
> lo_res = (lo0 + lo1) | 0;
> hi_res = Math.iaddh(lo0, hi0, lo1, hi1);
>
> - (hi_res, lo_res) = (hi0, lo0) - (hi1, lo1) (64 bit subtraction):
>
> lo_res = (lo0 - lo1) | 0;
> hi_res = Math.isubh(lo0, hi0, lo1, hi1);
>
> - (hi_res, lo_res) = a * b (signed 64 bit product of 32 bit integers):
>
> lo_res = Math.imul(a, b);
> hi_res = Math.imulh(a, b);
>
> - (hi_res, lo_res) = a * b (unsigned 64 bit product of 32 bit integers):
>
> lo_res = Math.imul(a, b);
> hi_res = Math.imuluh(a, b);
>
> - (hi_res, lo_res) = (hi0, lo0) * (hi1, lo1) (signed 64 bit product of 64 bit integers):
>
> lo_res = Math.imul(lo0, lo1);
> hi_res = (Math.imulh(lo0, lo1) + Math.imul(lo0, hi1)) | 0;
> hi_res = (hi_res + Math.imul(lo1, hi0)) | 0;
>
> - (hi_res, lo_res) = (hi0, lo0) * (hi1, lo1) (unsigned 64 bit product of 64 bit integers):
>
> lo_res = Math.imul(lo0, lo1);
> hi_res = (Math.imuluh(lo0, lo1) + Math.imul(lo0, hi1)) | 0;
> hi_res = (hi_res + Math.imul(lo1, hi0)) | 0;
>
> It is easy for the compiler to optimize the code because only 32 bit integers are used and because the functions have no side effect. Even if the compiler does not remove the duplicate operation for the low 32 bit parts, the overhead is very small on a 32 bit CPU (1 more instruction than the optimal code).
>
> Fabrice.
>
> [1] https://www.mail-archive.com/es-discuss%40mozilla.org/msg27237.html
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
More information about the es-discuss
mailing list