jfb at chromium.org
Thu Jul 10 13:03:11 PDT 2014
> So: Are there compelling enough use cases for #b and #c that we should
> care about them?
I don't think so, but Jens seems to disagree. Simulations and
root-finding problems in my experience use smaller precision numbers
(e.g. i16, f16 or f32) to hill-climb close to the solution faster, and
then use higher-precision numbers (f64) to nail down the actual
solution. Using denormals here kind of defeats the purpose.
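To make that pattern concrete, here's a minimal Python sketch of the workflow I mean (the helpers `to_f32` and `newton_sqrt` are my own names, not any library's): a Newton iteration run coarsely with every intermediate rounded to f32, whose result then seeds a full-precision f64 pass.

```python
import struct

def to_f32(x):
    # Round a Python float (f64) down to the nearest f32, the way a
    # single-precision SIMD lane would.
    return struct.unpack('f', struct.pack('f', x))[0]

def newton_sqrt(a, x, step, tol):
    # Newton's iteration for sqrt(a); `step` controls working precision.
    while abs(x * x - a) > tol:
        x = step(0.5 * (x + a / x))
    return x

# Coarse pass in simulated f32, then refine that estimate in f64.
coarse = newton_sqrt(2.0, 1.0, to_f32, 1e-6)
exact  = newton_sqrt(2.0, coarse, lambda v: v, 1e-15)
```

The refinement pass starts so close to the answer that it needs only a step or two; none of the values involved ever get anywhere near the denormal range.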
> Not true, AMD and nVidia GPUs support denormals. Anything that supports
> double-precision in OpenCL will support denormals.
You're correct, I wasn't precise here: denormal support is new to
GPUs, and is also fraught with peril. It's a pretty similar situation
to the one on ARM.
> I think programmers are surprised when denormals are slow (but only DSP-type
> code tends to produce these issues.)
My experience, and what I've heard from other people, is that
denormals show up unexpectedly far more often than just in DSP code.
> On the other hand, I think most programmers would be surprised if x - y = 0
> didn’t mean x == y too.
This is floating point. People will also be surprised when (x / y) * y
!= x. I think this is a strawman argument: the transcendentals' lack
of precision is much more surprising than this IMHO.
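For what it's worth, both behaviors are easy to demonstrate from any language with IEEE doubles; here's a small Python sketch (plain CPython, which keeps gradual underflow on):

```python
import sys

# (x / y) * y need not round-trip in binary floating point:
x, y = 1.0, 49.0
assert (x / y) * y != x  # two roundings, one in each operation

# Gradual underflow: the difference of two distinct tiny numbers is a
# nonzero subnormal, so x - y == 0 does imply x == y. A flush-to-zero
# mode would return exactly 0 here and break that implication.
a = sys.float_info.min        # smallest normal double, ~2.2e-308
b = sys.float_info.min * 1.5
diff = b - a                  # subnormal: nonzero, below the normal range
assert diff != 0.0 and diff < sys.float_info.min
```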
> Sure, allow flush-to-zero for SIMD. It doesn’t break old code.
The case I'm making is that mandating FTZ for SIMD while keeping
denormal support for scalars is a bad idea: it implies toggling FP
state back and forth, which has significant performance implications
and dissuades folks from writing intuitive code that mixes SIMD and
scalar FP operations. It's also not much of an option when SIMD is
polyfilled.
More information about the es-discuss mailing list