linbox-team/fflas-ffpack

greater method of simd structs for unsigned integral types is wrong when comparing to 0

Closed this issue · 4 comments

For unsigned integral types, the method greater of simd struct return wrong results when comparing to 0.

For example, the following code, when compiled on a machine supporting AVX2 returns a wrong result

#include "fflas-ffpack/fflas-ffpack-config.h"
#include "fflas-ffpack/fflas-ffpack.h"

using S = Simd256<uint64_t>;
using scalar_t = S::scalar_t;

void
pprint (const typename S::vect_t v, const char *varname)
{
  scalar_t array[S::vect_size];
  S::storeu (array, v);
  printf ("%8s =", varname);
  for (unsigned int i = 0; i < S::vect_size;i++)
    printf (" 0x%016lx", array[i]);
  printf ("\n");
}

int
main (int argc, char *argv[])
{
  typename S::vect_t a, b, t;
  a = S::set (0, 1, 0x8000000000000001, 0x7fffffffffffffff);
  b = S::set1 (0);
  pprint (a, "a");
  pprint (b, "b");
  t = S::greater (a, b);
  pprint (t, "a > b ?");

  return 0;
}

It outputs

       a = 0x0000000000000000 0x0000000000000001 0x8000000000000001 0x7fffffffffffffff
       b = 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000
 a > b ? = 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000

instead of

       a = 0x0000000000000000 0x0000000000000001 0x8000000000000001 0x7fffffffffffffff
       b = 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000
 a > b ? = 0x0000000000000000 0xffffffffffffffff 0xffffffffffffffff 0xffffffffffffffff

The problem comes from the implementation of greater in simd*_int*.inl files. For unsigned type (for which an intrisinc does not exist), it is implemented as

    static INLINE CONST vect_t greater(vect_t a, vect_t b) {
        vect_t x;
        x = set1(-(static_cast<scalar_t>(1) << (sizeof(scalar_t) * 8 - 1)));
        a = sub(x, a);
        b = sub(x, b);
        return _mm256_cmpgt_epi64(b,a);
    }
P1K commented

For that exact example, changing the function to the (more reliable?) variant

static INLINE CONST vect_t greater(vect_t a, vect_t b) {
        vect_t x;
        x = set1((static_cast<scalar_t>(1) << (sizeof(scalar_t) * 8 - 1)));
        a = vxor(x, a);
        b = vxor(x, b);
        return _mm256_cmpgt_epi64(a,b);
    }

fixes the problem
(By the way, does it really make sense not to have set1(1ULL << 63) since there is a hard-coded _mm256_cmpgt_epi64 anyways?)

Thanks ! I will create a PR with this change for all simd files.

P1K commented

Welp, I hope it does not introduce other bugs tho; I haven't tested it. (But I found it in the Hacker's delight, so it's not something out of my hat)

Welp, I hope it does not introduce other bugs tho; I haven't tested it. (But I found it in the Hacker's delight, so it's not something out of my hat)

I added tests for the special case of the comparison with zero (they were already tests for the generic case)