[FEATURE] shuffle_v2 - see if we can rely on the compiler for `mask`, `maskz`
At the moment, the effort to support the maskz versions of operations is:
a) duplicated:
eve/include/eve/detail/shuffle_v2/simd/x86/shuffle_l2.hpp
Lines 518 to 557 in 6f2421b
b) untested (I only concerned myself with explicit names)
c) incomplete: the mask with register cases are not addressed at all.
=====================
I suspect the compiler can merge a non-masked operation followed by a blend into a single masked operation.
So this needs to be checked for SVE and AVX-512,
and bugs filed where it does not happen.
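To make the check concrete, here is a minimal sketch of the two forms for one AVX-512 shuffle (a full reverse of 16 x 32-bit lanes). The function names are made up for illustration and are not part of eve. It only verifies that the two forms return the same values; whether the compiler actually folds the first form into one masked instruction still has to be checked in the generated assembly. The sketch assumes GCC or clang on x86-64 and skips the check at runtime if the CPU has no AVX-512F.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  include <immintrin.h>
#  define SKETCH_X86 1
#else
#  define SKETCH_X86 0
#endif

#if SKETCH_X86
// Lane i of the result reads lane 15 - i of the input (a full reverse).
__attribute__((target("avx512f")))
static __m512i reverse_idx() {
  return _mm512_setr_epi32(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
}

// Form 1: unmasked shuffle, then a blend with zero under mask k.
// This is what a driver without per-shuffle maskz code would emit.
__attribute__((target("avx512f")))
static __m512i shuffle_then_blend(__mmask16 k, __m512i x) {
  __m512i shuffled = _mm512_permutexvar_epi32(reverse_idx(), x);
  return _mm512_mask_blend_epi32(k, _mm512_setzero_si512(), shuffled);
}

// Form 2: the hand-written fused maskz shuffle.
__attribute__((target("avx512f")))
static __m512i fused_maskz(__mmask16 k, __m512i x) {
  return _mm512_maskz_permutexvar_epi32(k, reverse_idx(), x);
}

// Runs both forms on the same input and compares the stored lanes.
__attribute__((target("avx512f")))
static bool forms_agree_avx512(std::uint16_t k) {
  std::int32_t in[16], a[16], b[16];
  for (int i = 0; i < 16; ++i) in[i] = 100 + i;
  __m512i x = _mm512_loadu_si512(in);
  _mm512_storeu_si512(a, shuffle_then_blend(k, x));
  _mm512_storeu_si512(b, fused_maskz(k, x));
  return std::memcmp(a, b, sizeof a) == 0;
}
#endif

// Returns true when both forms agree, or when the check cannot run here
// (non-x86 target, or a CPU without AVX-512F).
bool forms_agree(std::uint16_t k) {
#if SKETCH_X86
  if (__builtin_cpu_supports("avx512f")) return forms_agree_avx512(k);
#endif
  (void)k;
  return true;
}

// Spot-check a few masks: none, all, one lane, and a mixed pattern.
void check_sample_masks() {
  const std::uint16_t masks[] = {0x0000, 0xFFFF, 0x0001, 0xA5C3};
  for (std::uint16_t k : masks) assert(forms_agree(k));
}
```

For the codegen question, the interesting part is the body of `shuffle_then_blend` compiled with AVX-512 enabled: if it becomes a single zero-masked permute, the fused form does not need to be written by hand.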
Then the mask(z) logic can be moved into shuffle_driver,
and all the zero handling removed.
Probably somewhere after this function: shuffle_v2_driver_multiple_registers
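As a rough scalar model of what "mask(z) logic in the driver" means: the backend shuffle knows nothing about zeroes or masks, and the driver applies the mask once on top of whatever the backend returned. Every name below is hypothetical and uses `std::array` instead of eve types, so this is only a sketch of the shape, not the actual driver code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t lanes = 4;
using wide_t    = std::array<int, lanes>;
using logical_t = std::array<bool, lanes>;
using pattern_t = std::array<std::size_t, lanes>;

// Hypothetical stand-in for a backend shuffle: no zeroing, no masking.
inline wide_t plain_shuffle(const wide_t& x, const pattern_t& idx) {
  wide_t r{};
  for (std::size_t i = 0; i < lanes; ++i) {
    assert(idx[i] < lanes);  // the pattern must only name existing lanes
    r[i] = x[idx[i]];
  }
  return r;
}

// Hypothetical driver-level maskz: shuffle first, then zero inactive lanes.
inline wide_t driver_maskz(const logical_t& m, const wide_t& x,
                           const pattern_t& idx) {
  wide_t r = plain_shuffle(x, idx);
  for (std::size_t i = 0; i < lanes; ++i)
    if (!m[i]) r[i] = 0;
  return r;
}

// Hypothetical driver-level mask: inactive lanes come from a source register.
inline wide_t driver_mask(const wide_t& src, const logical_t& m,
                          const wide_t& x, const pattern_t& idx) {
  wide_t r = plain_shuffle(x, idx);
  for (std::size_t i = 0; i < lanes; ++i)
    if (!m[i]) r[i] = src[i];
  return r;
}
```

With this split the backends only ever implement `plain_shuffle`, and both the zeroing and the register-source variants come from the same place in the driver.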
The tests are split into two files:
test/unit/api/regular/shuffle_v2/shuffle_v2_driver.cpp
test/unit/api/regular/shuffle_v2/shuffle_v2_driver_intergration.cpp
I'm not sure which one to add to at the moment.
You will also need to clean up some uses of P::has_zeroes in:
include/eve/detail/shuffle_v2/simd/x86/shuffle_l2.hpp
include/eve/detail/shuffle_v2/simd/arm/sve/shuffle_l2.hpp
It seems that both clang and gcc can do it, at least in some cases: https://godbolt.org/z/h68WxonaT