Research level implementations of some SIMD algorithms.
Implementation based on the algorithm presented in this blog post
The original paper is here.
std::vector<int32_t> values
= { 0, 4, 6, 20, 40, 60, 90, 155,
1188, 2002, 2244, 2296, 3124, 3226, 3334, 4443,
10, 24, 46, 120, 140, 260, 390, 455,
2188, 3002, 4244, 5296, 6124, 6226, 6334, 6443 };
floki::sort(begin(values),end(values));
Clang 3.4 on Linux
There are some known issues with g++ and Boost SIMD. See the comments about aliasing on this issue. Until this is resolved, it won't compile without setting the following flags.
-fno-strict-aliasing -DBOOST_SIMD_NO_STRICT_ALIASING
Optional unit tests require the Bandit
Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz
compiled with clang 3.4 and flags -O3 -mavx
65536 random elements in std::vector
data type | std::sort | floki::sort |
---|---|---|
int32_t | 2.99ms | 0.92ms |
float | 3.33ms | 1.10ms |
There is a CMake build file included that builds the benchmark and unit tests.
I generate the make file using clang as follows
mkdir build-floki
cd build-floki
CXX=/usr/bin/clang++-3.4 cmake ~/source/floki -DCMAKE_CXX_FLAGS="-march=native"
make
If Boost SIMD is installed to a path other than /usr/local, then set BoostSIMD_INCLUDE_DIR accordingly.