/Multi-Pivot-Introsort

A highly-optimized Introsort.

Primary LanguageC++MIT LicenseMIT

Multi Pivot Introsort

A highly-optimized (un-tuned) Introsort assembled from research papers and open-source libraries, ported into the C++ language.

This sort uses a recursive pattern to re-arrange data in ascending order. Whenever the size of the current interval falls below the threshold of 32 elements, insertion sort is used. On return, the data in the current interval will be in sorted order.

Here is the original Dual-Pivot Quicksort by Vladimir Yaroslavskiy.

Here is the three pivot partition algorithm by Kushagra, Lopez-Ortiz, Munro, and Qiao.

An optimized Dual-Pivot Quicksort by Bloch and Bently can be found in any jdk after version 6.

This sort is derived from the resources listed above.

Optimizations

  1. Use Smart Pivot Selection

    Pivots are carefully selected from a set of five candidates. These candidates are the leftmost element, the element at the first tercile, the middle element, the element at the second tercile, and the rightmost element. All five of these candidates are sorted in place. If none are equal, three pivots are chosen from the middle. If an outside pair is equal, two pivots are selected from the tercile indices. However, if a middle pair is also equal, then a single pivot will be selected from the middle and traditional Quick Sort will be used. This process helps to select pivots that divide the data as evenly as possible, mitigating the possibility of the worst-case O(N2) time complexity. This process also helps to ensure that the partitioning algorithm used is optimal for the sub-array sampled.

  2. Use Four-Way or Three-Way Partitioning When Possible

    Multi-way partitioning has been shown to yeild a higher number of cache hits than traditional two-way partitioning. Multi-way partitioning also helps to trim the height of the sort tree.

  3. Keep the Middle Portion(s) as Small as Possible In Multi-Pivot Partitioning.

    The middle portions may contain elements equal to the pivots in multi-pivot partitioning. Therefore, after partitioning with more than one pivot, if the middle portion comprises more than 2/3 of the current interval, we swap all elements equal to the pivots out of the way before recursively sorting. This helps to boost performance on arrays with many equal elements.

  4. Use Insertion Sort On Little Sub-Arrays

    Insertion sort performs better than O(nlogn) sorts on small arrays where asymptotic complexity matters less and instruction overhead matters more.

  5. Use Jon Bentley's Pair Insertion Sort On Small, Non-Leftmost Partitions.

    Pair insertion sort-- which ignores the lower bound check-- is possible on non-leftmost partitions. Pair insertion sort inserts two elements for each iteration of its outer loop.

  6. Use Heap Sort to prevent Quicksort's Worst-Case Time Complexity.

    This sort is an introspective sort-- a sort that switches to a guaranteed O(nlogn) algorithm whenever time complexity is becoming quadratic. Here, Heap Sort is used to avoid the overhead of auxilliary storage.

Performance

Credits

  • Alejandro Lopez-Ortiz
  • Aurick Qiao
  • David Musser
  • Ellie Moore
  • J. Ian Munro
  • Jon Bently
  • Josh Bloch
  • Shrinu Kushagra
  • Tony Hoare

A Compiler's View