WojciechMula/simd-sort

Another AVX2 version

simonlindholm opened this issue · 4 comments

In case you're interested: https://gist.github.com/simonlindholm/7f15e1c4d61813eeaa99ff71f1ada1a0

In my tests (random data, n=1e7) it runs 2.5 times faster than avx2-quicksort.cpp, albeit at the cost of allocating external memory. (Apologies for the unreadableness, it's written in relation to a competitive programming course.)

@simonlindholm That's cool, congratulations! Did you test it against ascending, descending data and a small number of distinct values? We learnt (@lemire and I) that for these special cases vectorized algorithms perform worse than scalar ones.

These are the sort of times I see:

Input std::sort mine avx2-quicksort.cpp
random 0m20.209s 0m7.631s 0m19.476s
increasing 0m4.548s 0m4.509s 0m2.937s
decreasing 0m3.633s 0m4.550s 0m3.843s
rand()%10 0m6.654s 0m1.431s >10 min
i%(N/10) 0m10.227s 0m4.275s 0m13.742s

(For n = 1e7, sums of 20 sortings, on an i5-4200U, compiled with g++-5.4 -std=c++11 -O3 -march=native.)

@simonlindholm These are interesting results.

@simonlindholm Great results!