Mark S. Miller
erights at google.com
Thu Jul 10 10:19:26 PDT 2014
On Thu, Jul 10, 2014 at 9:54 AM, JF Bastien <jfb at chromium.org> wrote:
> Here are a few thoughts about denorms (a.k.a. subnormals as of 2008) from
> a discussion a few months ago with John McCutchan, Dave Herman, Luke Wagner
> and Dan Gohman.
> A few facts to start off with:
> - The current SIMD proposal doesn't specify how denormals behave.
>    - ECMA-262 specifies denormal behavior in 8.5: "The remaining
> 9007199254740990 (that is, 2^53-2) values are denormalised, having the form
> s × m × 2^e where s is +1 or -1, m is a positive integer less than 2^52,
> and e is -1074."
>    - Most CPUs support denormals, but they're often slow (think 10x–100x
> slower than a single FP instruction).
> - Other hardware like GPUs don't support denormals.
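As a concrete reference point (my own example, not from the thread), scalar JS already exposes denormal doubles today:

```javascript
// Denormal (subnormal) doubles in today's scalar JS, per ECMA-262 8.5.
// Number.MIN_VALUE is the smallest denormal, 2^-1074 (about 5e-324);
// the smallest *normal* double is 2^-1022.
const minDenormal = Number.MIN_VALUE;
const minNormal = Math.pow(2, -1022);

console.log(minDenormal > 0);       // true: gradual underflow goes below 2^-1022
console.log(minDenormal / 2);       // 0: nothing representable below 2^-1074
console.log(minNormal / 4 > 0);     // true, but only because denormals exist
```

With flush-to-zero semantics, that last expression would be false.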
This triggered an odd thought. Rather than proceeding under the assumption
that
a) 64-bit with denormals
b) 64-bit without denormals
c) 32-bit with denormals
d) 32-bit without denormals
are all needed, what if we assume that only #a and #d are needed enough to
bother with. These are the two extremes:
a) gimme all the precision that IEEE double precision was specced to
provide, and that normal non-GPU hardware was built to provide
d) Damn the precision; full speed ahead. Gimme the full speed that all
stock FPUs can provide, whether on CPUs or GPUs, at whatever cost in
precision that lowest common denominator demands.
This doesn't address the question of how we should meet these needs. But it
might simplify the question. So: Are there compelling enough use cases for
#b and #c that we should care about them?
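To make the two extremes concrete, here is a sketch (my own illustration; `flushToZero` is a hypothetical helper modeling option #d for float32 magnitudes, not a real API):

```javascript
// Option #a keeps IEEE gradual underflow; option #d flushes denormals to zero.
// FLT_MIN is 2^-126, the smallest normal float32.
const FLT_MIN = Math.pow(2, -126);

// Hypothetical model of flush-to-zero: any subnormal magnitude becomes 0.
function flushToZero(x) {
  return Math.abs(x) < FLT_MIN ? 0 : x;
}

const x = 1.5 * FLT_MIN;
const y = FLT_MIN;

console.log(x !== y && x - y !== 0);    // true under #a: the difference survives
console.log(flushToZero(x - y) === 0);  // true under #d: the difference vanishes
```

The second line is the classic cost of FTZ: two unequal values whose difference compares equal to zero.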
>    - ARM NEON doesn't support denormals, and instead flushes them to zero,
> which means that SIMD operations flush denormals whereas scalar
> operations support them.
> - A64 does support denormals, but they're not necessarily fast.
> - Most CPUs allow setting denormals-are-zero and/or flush-to-zero as a
> floating-point state, affecting SIMD too, but that instruction is often
> slow or serializes the FP pipeline.
> My opinion is that denormals aren't something people are asking for.
> They're actually quite a surprise to most people, who learn about them the
> hard way when their application slows to a crawl, and then they learn about
> DAZ/FTZ and everything is fine again.
> I think the current state of hardware makes it prohibitive to mandate that
> denormals be supported outright for scalars, and makes it impossible to
> mandate denormals for SIMD since ARMv7 is a major CPU ISA that doesn't
> support denormals for SIMD (pure scalar would therefore have to be used).
> There's a further issue with adding SIMD in JS where the temporary
> polyfills will use scalar instructions, so if denormals exist in scalar but
> not SIMD then the polyfills are slightly wrong. I think it's also quite
> prohibitive to change the CPU's FP state to have DAZ/FTZ on/off between
> scalar and SIMD, and allowing the user to annotate their source with
> FTZ-on/FTZ-off will lead to surprising performance pitfalls on some
> hardware, and will lead to weird coding where scalar FP and SIMD can't be
> mixed.
> My conclusion is therefore that denormals should be left as unspecified
> for *both* scalar and SIMD. Change the spec for scalars, and leave as-is
> for SIMD.
> Realistically, implementations will set DAZ/FTZ for the foreseeable future,
> and if people ever clamor for them then denormals can be brought back,
> assuming the hardware landscape is better in the future.
> FWIW there's a similar problem with NaNs, which AFAIK are left as
> unspecified.
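The scalar-polyfill mismatch above can be made concrete with a sketch (my own illustration; since JS cannot toggle the hardware FTZ flag, the hardware path is modeled with an explicit threshold):

```javascript
// A scalar polyfill of a float32 SIMD multiply follows IEEE gradual
// underflow, while FTZ hardware would flush the same result to zero.
const FLT_MIN = Math.pow(2, -126);              // smallest normal float32

const polyfilled = Math.fround(FLT_MIN * 0.5);  // 2^-127: a nonzero denormal
const ftzModeled = Math.abs(polyfilled) < FLT_MIN ? 0 : polyfilled;

console.log(polyfilled !== 0);   // true: the polyfill keeps the denormal
console.log(ftzModeled === 0);   // true: FTZ hardware would observably differ
```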
es-discuss mailing list
es-discuss at mozilla.org