/sorting_network_cpp

Generate sorting networks at compile time

Primary LanguageC++GNU General Public License v3.0GPL-3.0

sorting_network_cpp

sorting_network_cpp offers templated implementations of various types of sorting networks which can drastically improve performance for small problem sizes of simple data compared to regular sorting algorithms (e.g. std::sort).

Following listing shows the execution time for different algorithms to sort one million arrays of each 32 random float values on an AMD Ryzen 7 2700x @3.70 GHz (compiled with clang 12.0.0 -O3):

algorithm time (ms) speedup vs std::sort
Batcher Odd Even Merge Sort 74.10 8.4
Bitonic Merge Sort 86.53 7.2
Bose Nelson Sort 73.05 8.6
Bubble Sort 185.58 3.3
Insertion Sort 189.93 3.3
std::sort 628.74 1

An interactive overview of performance measurements for the sorting networks on different data types and problem sizes can be found here.

Usage

A single class sorting_network is available which encapsulates the implementation of the different sorting networks.

Following listing gives a few examples:

#include <sorting_network_cpp/sorting_network.h>

#include <array>
#include <cstdint>

void example()
{
  using namespace quxflux::sorting_net;

  static constexpr std::size_t N = 16;

  {
    std::array<int, N> my_array;
    // fill array...

    sorting_network<N>{}(my_array.begin());
    // my_array is now sorted
  }

  {
    int data[N];
    // fill array...

    // raw pointers work as well...
    sorting_network<N>{}(data);
    // data is now sorted
  }

  {
    int data[N];

    // by default std::less<T> is used as comparator, custom comparators
    // may be passed as follows:

    // custom comparators
    sorting_network<N>{}(data, compare_and_swap<int, std::less<>>{});

    // function objects
    struct predicate
    {
      constexpr bool operator()(const int a, const int b) const { return a < b; }
    };

    sorting_network<N>{}(data, compare_and_swap<int, predicate>{});
  }
}

When no type is explicitly specified as template argument for sorting_network, the type::bose_nelson_sort is used.

Available network types ATM are:

  • insertion_sort
  • bubble_sort
  • bose_nelson_sort
  • batcher_odd_even_merge_sort
  • bitonic_merge_sort
  • size_optimized_sort

Following example shows how to specify a different type than the default one:

quxflux::sorting_net::sorting_network<N, quxflux::sorting_net::type::bitonic_merge_sort>()(std::begin(data_to_be_sorted));

Using custom compare and swap implementations

The compare and swap operation is the fundamental element a sorting network is composed of. The default implementation works well on scalar types, but if you want to specify a custom implementation (e.g. when hardware intrinsics should be used) you may do this by providing a compare and swap functor to the sorting_network::operator() as in following example:

#include <sorting_network_cpp/sorting_network.h>

#include <array>

void example(std::array<float,3>& arr)
{
  quxflux::sorting_net::sorting_network<3>{}(arr.begin(), [](float& a, float& b){
    const auto b_cpy = b;
    b = std::max(a, b);
    a = std::min(a, b_cpy);
  });
}

Single header implementation

A single header implementation is available which allows experimenting with the sorting networks on godbolt.

Requirements

A compiler with C++17 support

Dependencies

For the bare sorting functionality no dependencies are required. If you want to run the tests, the needed dependencies will be installed via conan.

Limitations

  • Depending on the implementation of the comparator the performance advantage of a sorting net compared to a regular sorting algorithm (e.g. std::sort) may diminish or even result in worse performance. This can be seen in the interactive benchmark results overview for the data type Vec2i Z-order which causes in most cases all variants of sorting networks being outperformed by std::sort (see src/benchmark.cpp for the implementation of the aforementioned data type).
  • msvc will fail compiling larger insertion_sort networks in certain (unknown) configurations with fatal error C1202: recursive type or function dependency context too complex

Development notes

  • A pre-push githook is available in .githooks which will automatically create the single header implementation. To enable the hook execute git config --local core.hooksPath .githooks in the repository directory (python required)

References / Acknowledgements

License

GPLv3