Super Scalar Sample Sort

7 years 11 months ago
Super Scalar Sample Sort
Sample sort, a generalization of quicksort that partitions the input into many pieces, is known as the best practical comparison based sorting algorithm for distributed memory parallel computers. We show that sample sort is also useful on a single processor. The main algorithmic insight is that element comparisons can be decoupled from expensive conditional branching using predicated instructions. This transformation facilitates optimizations like loop unrolling and software pipelining. The final implementation, albeit cache efficient, is limited by a linear number of memory accesses rather than the O(n log n) comparisons. On an Itanium 2 machine, we obtain a speedup of up to 2 over std::sort from the GCC STL library, which is known as one of the fastest available quicksort implementations.
Peter Sanders, Sebastian Winkel
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where ESA
Authors Peter Sanders, Sebastian Winkel
Comments (0)