Faster sorting algorithm
ashvardanian opened this issue · 0 comments
ashvardanian commented
The original sorting algorithm in StringZilla had two passes - Radix and QuickSort. The first one sorted strings by the first 32 bits, and the second operated on chunks with equivalent prefixes. For QuickSort, we use the qsort
function, which has different signatures on different Operating Systems.
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
qsort_s(sequence->order, split, sizeof(sz_size_t), qsort_comparator, (void *)sequence);
qsort_s(sequence->order + split,
sequence->count - split,
sizeof(sz_size_t),
qsort_comparator,
(void *)sequence);
#elif __APPLE__
qsort_r(sequence->order, split, sizeof(sz_size_t), (void *)sequence, qsort_comparator);
qsort_r(sequence->order + split,
sequence->count - split,
sizeof(sz_size_t),
(void *)sequence,
qsort_comparator);
#else
qsort_r(sequence->order, split, sizeof(sz_size_t), qsort_comparator, (void *)sequence);
qsort_r(sequence->order + split,
sequence->count - split,
sizeof(sz_size_t),
qsort_comparator,
(void *)sequence);
#endif
It's ugly and slow, especially compared to the std::sort
in the C++ Standard Templates Library. Let's replace that with an excellent specialized algorithm to achieve higher performance! After some initial testing, I know that doubling the speed is possible, but there may be even more room for improvement!