Improve argsort for 32-bit
Opened this issue · 0 comments
r-devulap commented
32-bit argsort currently uses ymm registers. We can switch to zmm registers (using 2x i64gather instructions per vector of keys) and add new bitonic networks for the wider vectors.
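To illustrate the "2x i64gather" idea, here is a minimal sketch, not the library's code. It assumes the argsort indices are stored as 64-bit integers, so one zmm register holds only 8 indices and a single `_mm512_i64gather_epi32` returns only 8 keys in a ymm register. Filling a full zmm register with 16 32-bit keys therefore takes two gathers whose results are joined. The helper name `gather16_keys` is hypothetical, and the scalar fallback is only there so the sketch also builds without AVX-512.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// Hypothetical helper (not part of the library): gather 16 32-bit keys
// addressed by 16 64-bit indices into one 16-lane result.
inline std::array<int32_t, 16> gather16_keys(const int32_t *keys,
                                             const int64_t *idx)
{
    std::array<int32_t, 16> out {};
#if defined(__AVX512F__)
    // One zmm holds 8 x 64-bit indices, so 16 keys need two index vectors.
    __m512i idx_lo = _mm512_loadu_si512(idx);
    __m512i idx_hi = _mm512_loadu_si512(idx + 8);
    // Each i64gather of 32-bit elements yields 8 keys in a ymm register.
    // The scale is 4 because the keys are 4 bytes wide.
    __m256i keys_lo = _mm512_i64gather_epi32(idx_lo, keys, 4);
    __m256i keys_hi = _mm512_i64gather_epi32(idx_hi, keys, 4);
    // Join the two ymm halves into a single zmm of 16 keys.
    __m512i all = _mm512_inserti64x4(
            _mm512_castsi256_si512(keys_lo), keys_hi, 1);
    _mm512_storeu_si512(out.data(), all);
#else
    // Scalar model of the same operation for builds without AVX-512.
    for (std::size_t i = 0; i < 16; ++i) {
        out[i] = keys[idx[i]];
    }
#endif
    return out;
}
```

With 16 keys per register instead of 8, the existing 8-lane sorting networks no longer cover a full vector, which is presumably why the issue also asks for new bitonic networks.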