
Chapter 6. Sorting

Sorting algorithms from fundamental contracts through comparison-based, linear-time, and specialized variants, with selection, external sorting, and correctness analysis.

Sorting is treated here as a family of transformations with precise contracts, rather than as isolated algorithms. The input is a sequence together with a comparison relation. The output is a permutation of that sequence arranged so that the relation holds between consecutive elements. Two properties dominate every design: correctness of ordering and preservation of data through permutation.

The chapter begins by fixing the contract. A sorting procedure must produce a sequence that is ordered and that contains exactly the same multiset of elements as the input. Stability is introduced as an additional constraint: equal keys preserve their relative order. This constraint affects algorithm choice and implementation details, especially when records carry secondary fields.
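The two halves of this contract can be checked mechanically. The following minimal sketch (the helper name `is_valid_sort` is ours, not from the text) verifies local ordering and multiset equality between input and output:

```python
from collections import Counter

def is_valid_sort(original, output):
    """Check both halves of the sorting contract:
    local ordering and multiset (permutation) equality."""
    ordered = all(output[i] <= output[i + 1] for i in range(len(output) - 1))
    same_multiset = Counter(original) == Counter(output)
    return ordered and same_multiset
```

Note that checking ordering alone is not enough: an implementation that overwrites elements can produce an ordered output that is not a permutation of the input, which is exactly the failure the multiset check catches.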

Elementary algorithms such as insertion sort and selection sort are used to establish invariants and reasoning patterns. They provide clear loop invariants and expose the mechanics of comparison and movement. These forms are not competitive at scale, but they define the baseline for correctness arguments and for small-input optimization inside hybrid algorithms.
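Insertion sort illustrates the invariant style of reasoning well. A minimal sketch, with the loop invariant stated as a comment:

```python
def insertion_sort(a):
    # Invariant: at the start of each iteration, a[:i] is sorted and
    # holds the same elements it began with (shifted, never lost).
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]   # shift larger elements one slot right
            j -= 1
        a[j + 1] = key        # drop the key into its correct slot
    return a
```

Because elements strictly greater than the key are shifted while equal elements stay put, this form is stable, which is one reason it serves as the small-input fallback inside hybrid sorts.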

Divide and conquer appears through merge sort and quick sort. Merge sort provides a stable, predictable structure with guaranteed O(n log n) time and linear auxiliary space. Quick sort provides in-place behavior and strong practical performance, with sensitivity to pivot selection and input distribution. The analysis emphasizes partition invariants, recursion structure, and worst-case avoidance strategies.
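A minimal merge sort sketch makes the structure concrete; the `<=` in the merge step is what preserves stability, since ties are resolved in favor of the left (earlier) half:

```python
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge two sorted halves; <= keeps equal keys in original order.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])   # at most one of these
    out.extend(right[j:])  # extends is non-empty
    return out
```

The linear auxiliary space mentioned above is visible here as the `out` buffer built during each merge.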

Heap-based methods introduce an alternative perspective. Heap sort derives ordering from a priority structure and achieves O(n log n) time with constant auxiliary space. The tradeoff is reduced locality and loss of stability. This section focuses on the transformation between array representation and heap invariants, and on the cost model of repeated extraction.
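The array-to-heap transformation and repeated extraction can be sketched as follows; the bottom-up build and the shrinking heap boundary are the two ideas the section develops:

```python
def heap_sort(a):
    n = len(a)

    def sift_down(root, end):
        # Restore the max-heap property for the subtree at `root`,
        # within the active heap region a[0..end].
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and a[child] < a[child + 1]:
                child += 1  # pick the larger child
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    # Build a max-heap in place, bottom-up, in O(n).
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)
    # Repeatedly move the max to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)
    return a
```

Swapping the root with the last active slot is what makes the sort in place, and it is also the move that destroys stability: equal keys can cross each other during those swaps.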

Non-comparison sorting methods, including counting sort and radix sort, exploit structure in the key domain. When keys lie in bounded ranges or can be decomposed into digits, these methods achieve linear time under explicit assumptions. The chapter isolates those assumptions and shows how violations degrade correctness or performance.
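Counting sort makes those assumptions explicit in code. A minimal sketch for integer keys in a bounded range `[0, k)`; the range bound `k` is the assumption that buys linear time:

```python
def counting_sort(keys, k):
    """Sort integers in range [0, k) in O(n + k) time.
    Raises IndexError if a key violates the range assumption."""
    counts = [0] * k
    for x in keys:
        counts[x] += 1          # tally occurrences of each key
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c) # reconstruct output from tallies
    return out
```

If `k` grows much larger than `n`, the `O(n + k)` bound degrades toward `O(k)`, which is one of the violations the chapter examines.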

Partial sorting and selection form a separate class of problems. In many systems, only the smallest k elements or the median is required. Algorithms such as quickselect and heap-based selection reduce work by avoiding full ordering. The analysis highlights how selection modifies partition logic and how guarantees differ from full sorting.
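Quickselect shows how the partition step is reused but recursion descends into only one side. A minimal sketch using a three-way partition (this list-building form is chosen for clarity over the in-place version):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (0-indexed), expected O(n)."""
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    if k < len(less):                # rank lies in the left partition
        return quickselect(less, k)
    if k < len(less) + len(equal):   # rank falls on the pivot value
        return pivot
    # Rank lies in the right partition; adjust k accordingly.
    return quickselect(greater, k - len(less) - len(equal))
```

The expected-linear guarantee depends on randomized pivots; unlike full quicksort, only one recursive call survives each level, so the expected work forms a geometric series rather than n log n.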

External sorting addresses data that does not fit in memory. The model shifts from CPU cost to I/O cost. Multi-way merging, run generation, and buffer management determine performance. The treatment focuses on controlling disk access patterns and minimizing passes over data.
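The run-generation-then-merge structure can be sketched in miniature. Here `stream` stands in for data too large for memory and `run_size` models the memory budget; a real implementation would spill each sorted run to disk and merge buffered file iterators, but the control flow is the same:

```python
import heapq

def external_sort(stream, run_size):
    """Sketch: generate sorted runs of bounded size, then do one
    multi-way merge pass over all runs."""
    runs, buf = [], []
    for x in stream:
        buf.append(x)
        if len(buf) == run_size:
            runs.append(sorted(buf))  # in reality: write run to disk
            buf = []
    if buf:
        runs.append(sorted(buf))
    # heapq.merge streams the k-way merge with O(k) live elements.
    return list(heapq.merge(*runs))
```

The number of merge passes, not comparisons, dominates cost in the I/O model, which is why increasing merge fan-out (more runs merged at once) reduces total disk traffic.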

Throughout the chapter, comparator design is treated as part of the algorithm. Comparators must define a strict weak ordering. Violations lead to undefined behavior or non-termination. When sorting composite records, comparator cost can dominate runtime, so caching and key extraction become relevant.
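Both points, strict weak ordering and key extraction, can be illustrated briefly. The comparator below (an illustrative example, not from the text) returns negative, zero, or positive consistently; the key-based form avoids paying comparator cost on every comparison by extracting each key once:

```python
from functools import cmp_to_key

def by_length_then_value(a, b):
    # A valid strict weak ordering: never reports both a < b and b < a,
    # and ties on length fall through to a consistent tiebreak.
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)

words = ["bb", "a", "ab", "c"]
words.sort(key=cmp_to_key(by_length_then_value))

# Equivalent, and cheaper: extract a composite key once per element.
words_by_key = sorted(["bb", "a", "ab", "c"], key=lambda w: (len(w), w))
```

A comparator that, say, returns `len(a) - len(b)` in one branch but compares reversed strings in another can report contradictory orderings for some triple of inputs; depending on the language runtime, that yields garbage output, an exception, or non-termination.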

Testing and validation are framed in terms of invariants. A sorted output must satisfy local ordering checks and global permutation checks. Adversarial inputs, such as already sorted sequences or sequences with many duplicates, are used to expose instability and worst-case paths.
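A small harness can combine those ideas: fixed adversarial cases (empty, single, all-equal, already sorted, reverse sorted, duplicate-heavy) plus randomized inputs checked against a trusted reference. The helper name `check_sort` and the specific case mix are illustrative:

```python
import random

def check_sort(sort_fn, trials=200):
    """Test a sort function on adversarial and random inputs,
    comparing against the built-in sorted() as the oracle."""
    cases = [[], [1], [1, 1, 1], list(range(10)), list(range(10, 0, -1))]
    for _ in range(trials):
        n = random.randint(0, 30)
        # Small value range forces many duplicates.
        cases.append([random.randint(0, 5) for _ in range(n)])
    for case in cases:
        out = sort_fn(list(case))  # copy: protect against in-place sorts
        assert out == sorted(case), f"failed on {case}"
    return True
```

Comparing against an oracle checks ordering and permutation simultaneously; the duplicate-heavy random cases are the ones most likely to expose broken partition logic and comparator inconsistencies.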

By the end of the chapter, you should be able to select an appropriate sorting strategy based on constraints, implement it with correct invariants, and reason about its time, space, and stability properties in a way that remains valid when integrated into larger systems.

6.1 Sorting Contracts (3 min): A sorting algorithm is correct only when its output is both ordered and a permutation of the input — two properties every implementation must preserve.
6.2 Selection Sort (4 min): Sort by repeatedly selecting the minimum element from the unsorted suffix and placing it into the next output position.
6.3 Insertion Sort (5 min): Build a sorted prefix one element at a time by inserting each new element into its correct position within the already-sorted portion.
6.4 Merge Sort (6 min): Divide the input into halves, recursively sort each half, then merge them — combining local order into global order in O(n log n) time.
6.5 Quick Sort (6 min): Partition around a pivot so smaller elements go left and larger go right, then recursively sort each partition in expected O(n log n) time.
6.6 Heap Sort (4 min): Build a max-heap in place, then repeatedly extract the maximum to produce a sorted array in O(n log n) worst-case time.
6.7 Counting Sort (4 min): Sort integer keys from a small range in linear time by counting occurrences and reconstructing the output from those counts.
6.8 Radix Sort (5 min): Sort keys digit by digit using a stable subroutine, achieving linear time for fixed-width integers without any key comparisons.
6.9 Bucket Sort (4 min): Distribute elements into buckets by value range, sort each bucket, then concatenate — achieving linear expected time on uniformly distributed input.
6.10 Stable Sorting (4 min): A stable sort preserves the original relative order of equal keys — an extra guarantee required when sorting by secondary fields or compound criteria.
6.11 Partial Sorting (4 min): Produce only the smallest k elements in sorted order rather than sorting the entire array, reducing unnecessary work when the full order is not needed.
6.12 Top k Selection (5 min): Find the largest or smallest k elements without sorting the full input, using a heap or partition-based approach.
6.13 Quickselect (5 min): Find the element at a given rank using quicksort's partition step but recursing into only one side, achieving expected linear time.
6.14 Median Selection (6 min): Find the middle value of a collection in linear time using selection algorithms, without the overhead of a full sort.
6.15 External Sorting (4 min): Sort datasets that exceed main memory by organizing the algorithm around sequential disk access, merge passes, and minimizing I/O operations.
6.16 Nearly Sorted Data (3 min): Exploit near-sorted structure in inputs like append-only logs or incremental updates to sort in linear or near-linear time.
6.17 Custom Comparators (4 min): Define correct comparison relations for user-defined types and non-trivial orderings — consistency requirements that sorting correctness depends on.
6.18 Sorting Records (4 min): Sort structured values by one or more fields while moving the full record, with attention to key extraction, stability, and multi-key ordering.
6.19 Coordinate Compression (4 min): Replace large or sparse keys with small dense ranks that preserve order, making range-based and indexed structures practical on wide-valued data.
6.20 Inversion Counting (5 min): Count pairs of elements in the wrong relative order to measure how far an array is from sorted, using a modified merge sort in O(n log n) time.
6.21 Lower Bounds for Sorting (3 min): Prove that any comparison-based sorting algorithm requires Ω(n log n) comparisons in the worst case using a decision tree argument.
6.22 Parallel Sorting (4 min): Distribute sorting work across multiple processors to reduce wall-clock time, with analysis of total work, span, communication, and synchronization.
6.23 Testing Sort Correctness (3 min): Verify both the ordering and permutation properties of sorting implementations using randomized, edge-case, and adversarial test strategies.
6.24 Common Bugs (5 min): Identify boundary errors, broken invariants, and comparator mistakes that cause sorting implementations to fail on edge cases or duplicate-heavy inputs.
6.25 Choosing the Right Sort (3 min): Select the appropriate sorting algorithm based on input size, data characteristics, memory constraints, and required guarantees such as stability or worst-case bounds.