C++ template library for high performance SIMD based sorting routines for
16-bit, 32-bit and 64-bit data types. The sorting routines are accelerated
using AVX-512/AVX2 when available. The library auto picks the best version
depending on the processor it is run on. If you are looking for the AVX-512 or
AVX2 specific implementations, please see
README file under
src/
directory. The following routines are currently supported:
x86simdsort::qsort(T* arr, size_t size, bool hasnan);
x86simdsort::qselect(T* arr, size_t k, size_t size, bool hasnan);
x86simdsort::partial_qsort(T* arr, size_t k, size_t size, bool hasnan);
Supported datatypes: T
[_Float16, uint16_t, int16_t, float, uint32_t, int32_t, double, uint64_t, int64_t]
x86simdsort::keyvalue_qsort(T1* key, T2* val, size_t size, bool hasnan);
Supported datatypes: T1
, T2
[float, uint32_t, int32_t, double, uint64_t, int64_t]
Note that keyvalue sort is not yet supported for 16-bit
data types.
std::vector<size_t> arg = x86simdsort::argsort(T* arr, size_t size, bool hasnan);
std::vector<size_t> arg = x86simdsort::argselect(T* arr, size_t k, size_t size, bool hasnan);
Supported datatypes: T
[_Float16, uint16_t, int16_t, float, uint32_t, int32_t, double, uint64_t, int64_t]
meson is the used build system. Command to build and install the library:
meson setup --buildtype release builddir && cd builddir
meson compile
sudo meson install
Once installed, you can use pkg-config --cflags --libs x86simdsortcpp
to
populate the right cflags and ldflags to compile and link your C++ program.
This repository also contains a test suite and benchmarking suite which are
written using googletest and google
benchmark frameworks respectively. You
can configure meson to build them both by using -Dbuild_tests=true
and
-Dbuild_benchmarks=true
.
#include "x86simdsort.h"
int main() {
std::vector<float> arr{1000};
x86simdsort::qsort(arr, 1000, true);
return 0;
}
x86simdsort::qsort
is equivalent toqsort
in C orstd::sort
in C++.x86simdsort::qselect
is equivalent tostd::nth_element
in C++ ornp.partition
in NumPy.x86simdsort::partial_qsort
is equivalent tostd::partial_sort
in C++.x86simdsort::argsort
is equivalent tonp.argsort
in NumPy.x86simdsort::argselect
is equivalent tonp.argpartition
in NumPy.
Supported datatypes: uint16_t, int16_t, _Float16, uint32_t, int32_t, float, uint64_t, int64_t, double
. Note that _Float16
will require building this
library with g++ >= 12.x. All the functions have an optional argument bool hasnan
set to false
by default (these are relevant to floating point data
types only). If your array has NAN's, the the behaviour of the sorting routine
is undefined. If hasnan
is set to true, NAN's are always sorted to the end of
the array. In addition to that, qsort will replace all your NAN's with
std::numeric_limits<T>::quiet_NaN
. The original bit-exact NaNs in
the input are not preserved. Also note that the arg methods (argsort and
argselect) will not use the SIMD based algorithms if they detect NAN's in the
array. You can read details of all the implementations
here.