use fma for tolerant comparisons

Question

use fma for tolerant comparisons

moon-chilled opened this issue 2 years ago · 10 comments

The tolerant compare uses a calculation of the form a < b*c. This can be done faster and with more precision by looking at the sign bit of fma(b,c,-a). But it breaks with negative zero.

Answer 1 · 2022-10-03T09:52:27.000Z

err, that should be fma(b,-c,a). fma(b,c,-a) is for <=. Because exact 0 or positive underflow will be positive 0; negative 0 can only be from negative underflow. (Assuming no negative 0 inputs.)

Answer 2 · 2022-10-03T13:15:13.000Z

I think negative 0 is not an issue since it will occur on the fringe of the tolerant range.

You need to verify that this will work for infinities, which may produce NaN intermediate values.

It is vital that TEQ, which is used for singletons, be updated to track the code used in eqDD.

Answer 3 · 2022-10-04T01:10:20.000Z

Tolerant a=b is

a>cct*b = b>cct*a

I want to replace it with:

signbit(cct*b-a) = signbit(cct*a-b)

But suppose a is _0, b is 0. Then we have:

signbit(cct*0-_0) = signbit(cct*_0-0)

signbit(0-_0) = signbit(_0-0)

signbit(0) = signbit(_0)

Which is false, but we want _0=0.

Good point about nan; looks like ieee doesn't specify the sign of a nan result. We want signbit(_+__) to be different from signbit(__+_); if we're lucky, x86 will spec it so the nan sign always comes from the right addend (or left; w/e). Will look later.

Can do something similar for the tolerant zero problem: replace x=y with (x<.y) = (x>.y). On x86, x>.y and x<.y are both y if x=y, so that ensures that, if both are zero, the arguments to the comparison will be the same zero. It's the same number of operations as the current implementation (replaces compares with max/min). (On arm, however, <. and >. consider 0 to be greater than _0, so this won't work there; bully for arm.)

Answer 4 · 2022-10-04T01:18:35.000Z

That's fatal to your idea if true. Are you sure it's true? Normally the result of addition is never -0. hhr

…

On Mon, Oct 3, 2022, 9:10 PM moon-chilled ***@***.***> wrote: Tolerant a=b is a>cct*b = b>cct*a I want to replace it with: signbit(cct*b-a) = signbit(cct*a-b) But suppose a is _0, b is 0. Then we have: signbit(cct*0-_0) = signbit(cct*_0-0) signbit(0-_0) = signbit(_0-0) signbit(0) = signbit(_0) Which is false, but we want _0=0. Good point about nan; looks like ieee doesn't specify the sign of a nan result. We want signbit(*+) to be different from signbit(+*); if we're lucky, x86 will spec it so the nan sign always comes from the right addend (or left; w/e). Will look later. Can do something similar for the tolerant zero problem: replace x=y with (x<.y) = (x>.y). On x86, x>.y and x<.y are both y if x=y, so that ensures that, if both are zero, the arguments to the comparison will be the *same* zero. It's the same number of operations as the current implementation (replaces compares with max/min). (On arm, however, <. and >. consider 0 to be greater than _0, so this won't work there; bully for arm.) — Reply to this email directly, view it on GitHub <#156 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AEKVAJ3VZPQ632QUVDLRHLDWBN7YNANCNFSM6AAAAAAQ3KPDHY> . You are receiving this because you commented.Message ID: ***@***.***>

Answer 5 · 2022-10-04T01:24:43.000Z

if what's true?

_0 - 0 and _0 + _0 are both _0.

Answer 6 · 2022-10-04T01:29:18.000Z

I was mistaken about that. hhr

…

On Mon, Oct 3, 2022, 9:24 PM moon-chilled ***@***.***> wrote: if what's true? _0 - 0 and _0 + _0 are both _0. — Reply to this email directly, view it on GitHub <#156 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AEKVAJ344XZSEGY27OEG3ADWBOBOPANCNFSM6AAAAAAQ3KPDHY> . You are receiving this because you commented.Message ID: ***@***.***>

Answer 7 · 2022-10-05T00:37:45.000Z

Seems x86 gives you -nan for both _+__ and __+_. I guess they want + to be commutative. Annoying.

Answer 8 · 2023-03-06T11:03:28.000Z

Actually, it might be to count ulps. I think this is basically what tolerant comparison does, but need to review the definition and think it over.

Answer 9 · 2023-03-06T11:44:44.000Z

nevermind, not ulp measure because it uses magnitude, not just exponent

Answer 10 · 2023-03-07T06:35:47.000Z

But we can determine an upper and lower ulp bound (tolerant hashing basically derives a slightly coarse upper bound). If most comparands are either below the lower bound (definitely equal) or above the upper bound (definitely unequal), then it may be a win to branch to see if they're in between.

This just works (except for -0?) for comparands of opposite sign, but not for inf. Grumble.