Distributed Sample Sort
Sort data across machines by sampling keys, choosing splitters, redistributing records into ordered buckets, and sorting buckets locally.
6 notes
Sort data across machines by sampling keys, choosing splitters, redistributing records into ordered buckets, and sorting buckets locally.
Sort data on a GPU by selecting splitters, partitioning into buckets in parallel, then sorting buckets independently.
Choose splitters from samples, partition the input into buckets, sort buckets independently, then concatenate the sorted buckets.
External-memory sorting algorithm that uses sampling to partition data into balanced buckets, then sorts each bucket independently.
Sample sort specialized for integer keys using range-based partitioning and optional radix-style bucket classification.
Divide-and-conquer sorting algorithm that uses sampling to partition data into balanced buckets.
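
The steps these notes share (sample keys, choose splitters, partition into ordered buckets, sort each bucket, concatenate) can be sketched in a single process. This is only an illustration under stated assumptions: each bucket stands in for one machine, no network exchange is modeled, and the names sample_sort, num_buckets, oversample, and seed are hypothetical and not taken from any of the notes.

```python
import bisect
import random


def sample_sort(records, num_buckets=4, oversample=8, seed=0):
    """Single-process sketch of sample sort; each bucket stands in for one machine."""
    # Tiny inputs are not worth partitioning; fall back to a plain sort.
    if num_buckets < 2 or len(records) <= num_buckets:
        return sorted(records)

    rng = random.Random(seed)

    # 1. Sample keys. Oversampling (several samples per bucket) is an
    #    assumption made here to keep bucket sizes roughly balanced.
    sample_size = min(len(records), num_buckets * oversample)
    sample = sorted(rng.sample(records, sample_size))

    # 2. Choose num_buckets - 1 splitters at evenly spaced sample positions.
    step = len(sample) / num_buckets
    splitters = [sample[int(i * step)] for i in range(1, num_buckets)]

    # 3. Redistribute: binary search finds the bucket whose key range
    #    contains the record. Bucket i only holds keys <= those in bucket i + 1.
    buckets = [[] for _ in range(num_buckets)]
    for record in records:
        buckets[bisect.bisect_right(splitters, record)].append(record)

    # 4. Sort each bucket locally, then concatenate in bucket order.
    result = []
    for bucket in buckets:
        bucket.sort()
        result.extend(bucket)
    return result
```

In a real distributed setting, step 3 would be an all-to-all exchange between machines and step 4 would run in parallel on each one; the sketch keeps only the ordering logic, which is why concatenating the buckets yields a fully sorted result.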