Skip to content

11. Parallel, Distributed, and GPU Sorting

High-throughput sorting: parallel merge and quicksort, GPU bitonic and radix sort, SIMD sorting networks, and distributed partition-based sorts.

indexslugname
311parallel-merge-sortParallel Merge Sort
312parallel-quicksortParallel Quicksort
313parallel-sample-sortParallel Sample Sort
314parallel-radix-sortParallel Radix Sort
315parallel-bitonic-sortParallel Bitonic Sort
316parallel-odd-even-sortParallel Odd Even Sort
317parallel-counting-sortParallel Counting Sort
318gpu-bitonic-sortGPU Bitonic Sort
319gpu-radix-sortGPU Radix Sort
320gpu-merge-sortGPU Merge Sort
321gpu-sample-sortGPU Sample Sort
322cuda-warp-sortCUDA Warp Sort
323cuda-block-sortCUDA Block Sort
324simd-bitonic-sortSIMD Bitonic Sort
325simd-sorting-networkSIMD Sorting Network
326bitonic-sorting-networkBitonic Sorting Network
327odd-even-sorting-networkOdd Even Sorting Network
328bitonic-merge-networkBitonic Merge Network
329mapreduce-sortMapReduce Sort
330terasortTeraSort
331distributed-sample-sortDistributed Sample Sort
332distributed-range-partition-sortDistributed Range Partition Sort
333psrsParallel Sorting by Regular Sampling
334bitonic-sort-on-hypercubeBitonic Sort on Hypercube
335shear-sortShear Sort
Parallel Merge SortDivide the input, sort subarrays in parallel, then merge them using parallel merge procedures.
3 min
Parallel QuicksortPartition the array around a pivot, then sort partitions concurrently using parallel recursion.
3 min
Parallel Sample SortChoose splitters from samples, partition the input into buckets, sort buckets independently, then concatenate the sorted buckets.
3 min
Parallel Radix SortSort integer keys by processing fixed width digit groups in parallel using counting and prefix sums.
4 min
Parallel Bitonic SortSort data by building and merging bitonic sequences through a regular compare and exchange network.
4 min
Parallel Odd Even SortSort an array by repeatedly comparing odd indexed and even indexed adjacent pairs in parallel.
3 min
Parallel Counting SortCount key frequencies in parallel, compute prefix sums, then scatter elements into sorted positions.
3 min
GPU Bitonic SortSort data on a GPU using a regular bitonic compare and exchange network.
3 min
GPU Radix SortSort fixed width keys on a GPU using digit passes, parallel histograms, prefix sums, and scatter operations.
4 min
GPU Merge SortSort data on a GPU by recursively merging sorted runs using parallel merge primitives.
3 min
GPU Sample SortSort data on a GPU by selecting splitters, partitioning into buckets in parallel, then sorting buckets independently.
3 min
CUDA Warp SortSort a small set of values inside one CUDA warp using shuffle operations and compare exchange steps.
3 min
CUDA Block SortSort a block sized tile on the GPU using shared memory and cooperative threads.
3 min
SIMD Bitonic SortSort small vectors using SIMD registers with a fixed bitonic compare exchange network.
3 min
SIMD Sorting NetworkSort a fixed size vector by mapping a data independent sorting network to SIMD compare, shuffle, and blend operations.
3 min
Bitonic Sorting NetworkSort a fixed size sequence with a bitonic compare and exchange network whose schedule is independent of input values.
4 min
Odd Even Sorting NetworkSort a fixed size sequence using a data independent network of odd even compare exchange stages.
3 min
Bitonic Merge NetworkMerge a bitonic sequence into a sorted sequence using a fixed comparator network.
3 min
MapReduce SortSort large distributed datasets by partitioning records by key range, sorting partitions locally, and writing ordered output shards.
3 min
TeraSortSort very large datasets with range partitioning, distributed shuffle, and reducer local sorting.
3 min
Distributed Sample SortSort data across machines by sampling keys, choosing splitters, redistributing records into ordered buckets, and sorting buckets locally.
3 min
Distributed Range Partition SortSort distributed data by assigning fixed key ranges to workers, routing records by range, then sorting locally.
3 min
PSRSSort data in parallel by local sorting, regular sampling, global splitter selection, redistribution, and final local merge.
3 min
Bitonic Sort on HypercubeSort distributed keys across hypercube processors using dimension based compare and exchange stages.
4 min
Shear SortSort a two dimensional mesh by alternating row sorts and column sorts until the grid is globally ordered.
4 min