Find the element at a given rank using quicksort's partition step but recursing into only one side, achieving expected linear time.
Quickselect finds the element that would appear at a given index if the array were sorted. It uses the same partitioning idea as quick sort, but it recurses into only one side. This makes it useful for medians, percentiles, and top k boundaries.
Problem
You have an array A[0..n-1] and an index k. You want the element with rank k, using zero-based indexing, without fully sorting the array.
After quickselect finishes, A[k] is the same value that would appear at index k in the fully sorted array.
Solution
Partition the array around a pivot. If the pivot lands at index k, return it. If the pivot lands before k, continue in the right partition. If it lands after k, continue in the left partition.
quickselect(A, lo, hi, k):
while hi - lo > 1:
p = partition(A, lo, hi)
if p == k:
return A[p]
else if p < k:
lo = p + 1
else:
hi = p
return A[lo]The range [lo, hi) is half open. The index k is an absolute index into A, not an offset relative to lo.
Partition Contract
The partition procedure must place one pivot element into its final sorted position p and satisfy:
A[lo..p-1] contains elements <= A[p]
A[p+1..hi-1] contains elements >= A[p]The elements on each side do not need to be sorted. Quickselect only needs to know which side contains the desired rank.
Example
Let:
A = [7, 1, 5, 9, 2, 8, 3]
k = 3The value at sorted index 3 is the fourth-smallest element.
The fully sorted array would be:
[1, 2, 3, 5, 7, 8, 9]so the answer should be 5.
Quickselect does not build this full array. It partitions once, checks where the pivot landed, and then discards the half that cannot contain index 3.
For example, if the pivot is 3, partition may produce:
[1, 2, 3, 9, 7, 8, 5]The pivot index is 2. Since 2 < 3, the desired element is in the right range.
Now continue on:
[9, 7, 8, 5]If the next pivot lands at index 3, the algorithm returns 5.
Invariant
At the start of each loop iteration, the element with global rank k lies inside the active range A[lo..hi-1].
Initially, the active range is the whole array, so the invariant holds.
After partitioning, the pivot is in its final sorted position p.
If p = k, the pivot is the answer.
If p < k, then all elements at positions lo..p are too small to have rank k, so the answer must lie in A[p+1..hi-1].
If p > k, then all elements at positions p..hi-1 are too large to have rank k, so the answer must lie in A[lo..p-1].
Each step preserves the invariant while shrinking the active range.
Correctness
Quickselect is correct because partitioning gives rank information. The algorithm does not need full order; it only needs to know whether the desired rank lies left of the pivot, at the pivot, or right of the pivot.
The permutation condition holds because partitioning uses swaps. Every recursive or iterative step rearranges the input without changing its multiset of elements.
When the loop terminates with a single active element, the invariant says that the rank k element lies in that one-element range. Therefore that element is the answer.
Complexity
A partition pass over a range of length m costs:
O(m)If pivots split the input reasonably well, the total expected work is linear:
O(n)The worst case occurs when the pivot is always the smallest or largest remaining element. Then the active range shrinks by only one element per pass:
O(n^2)Randomized pivot selection gives expected linear time by making consistently bad pivots unlikely.
Median Selection
For an odd number of elements, the median is the element at:
k = floor(n / 2)For an even number of elements, there are two common conventions:
lower median: k = n/2 - 1
upper median: k = n/2
average median: average of both middle valuesThe average median requires selecting two values, or selecting one and then finding the neighbor.
Top k Boundary
Quickselect is often used to find the top k boundary.
For the smallest k elements, select index k or k - 1, depending on the desired partition convention.
smallest_k(A, k):
quickselect(A, 0, length(A), k)
return A[0..k-1]After selection, the prefix contains the smallest k elements, but not in sorted order. Sort the prefix if ordered output is required.
For the largest k elements, select index n - k:
largest_k(A, k):
target = length(A) - k
quickselect(A, 0, length(A), target)
return A[target..n-1]Duplicates
Duplicates require careful interpretation. If many elements equal the pivot, a two-way partition may do unnecessary work. Three-way partitioning improves this by producing:
elements < pivot
elements = pivot
elements > pivotThen quickselect can stop immediately when k falls inside the equal region.
three_way_select(A, lo, hi, k):
while hi - lo > 1:
lt, gt = three_way_partition(A, lo, hi)
if k < lt:
hi = lt
else if k >= gt:
lo = gt
else:
return A[k]
return A[lo]This is especially useful when the array contains many repeated keys.
Implementation Notes
Use random pivoting or median-of-three pivoting. The simplest version with the last element as pivot is correct but fragile.
Prefer the iterative version to avoid recursion-depth concerns.
Validate k before selection:
if k < 0 or k >= length(A):
error "rank out of range"Remember that quickselect mutates the array. If the caller needs the original order preserved, copy the input first or use a heap-based method.
Common Bugs
A common bug is treating k as relative to the active range after updating lo. Keep k as an absolute index unless the whole implementation is explicitly written with relative ranks.
Another bug is returning the prefix after quickselect and assuming it is sorted. Quickselect partitions; it does not sort.
A third bug is recursing into a range that still includes the pivot. If p is the final pivot index, the next range should exclude p.
A fourth bug is failing on duplicate-heavy input because partitioning does not shrink the problem enough. Three-way partitioning avoids this.
Takeaway
Quickselect finds a rank without sorting the whole array. It is the right primitive for medians, percentiles, and top k boundaries when mutation is acceptable and expected linear time is desired.