intel/x86-simd-sort

Improve vector FP16 comparison function


I suspect that the vector FP16 comparison function

static opmask_t ge(zmm_t x, zmm_t y)

can be implemented with fewer operations. See NumPy's npysort_common.h for reference:
https://github.com/numpy/numpy/blob/0bd56e7ec12f8ceeb8d082340e71e60b873d5c57/numpy/core/src/npysort/npysort_common.h#L153
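As a starting point for discussion, here is a minimal scalar sketch of one way to compare FP16 values directly on their raw 16-bit patterns. It is not the current implementation in this repository, nor a copy of the NumPy code. The helper names `half_to_ordered_key` and `half_ge_sketch` are hypothetical. The sketch assumes NaNs have already been filtered out, and it orders -0.0 below +0.0, which differs from IEEE 754 equality but is usually acceptable for sorting. A vector version would presumably apply the same mapping per lane and finish with a single unsigned 16-bit compare; whether that is actually fewer operations than the current `ge` would need to be measured.

```cpp
#include <cstdint>

// Hypothetical helper: map raw IEEE 754 binary16 bits to an unsigned key
// whose integer order matches the floating-point order (NaNs excluded).
static inline uint16_t half_to_ordered_key(uint16_t h)
{
    // Negative values: flip all bits so larger magnitudes get smaller keys.
    // Non-negative values: set the sign bit so they sort above all negatives.
    return (h & 0x8000u) ? static_cast<uint16_t>(~h)
                         : static_cast<uint16_t>(h | 0x8000u);
}

// Hypothetical scalar analogue of ge(x, y): true when x >= y.
// Assumption: neither input is NaN. Note that -0.0 is treated as < +0.0.
static inline bool half_ge_sketch(uint16_t x, uint16_t y)
{
    return half_to_ordered_key(x) >= half_to_ordered_key(y);
}
```

For example, with 1.0 encoded as 0x3C00 and -2.0 as 0xC000, `half_ge_sketch(0x3C00, 0xC000)` is true, while `half_ge_sketch(0xC000, 0x3C00)` is false.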