Produce only the smallest k elements in sorted order rather than sorting the entire array, reducing unnecessary work when the full order is not needed.
Partial sorting orders only the part of the array that the caller needs. Instead of fully sorting all n elements, the algorithm produces the smallest k elements in sorted order, or arranges the array so that the first k elements are the desired subset.
This is useful when a full order is wasteful. Search results, leaderboards, nearest-neighbor candidates, and reporting queries often need only the top few entries.
Problem
You have an array A[0..n-1] and an integer k. You want the smallest k elements, preferably faster than sorting the whole array.
Solution
There are two common forms.
The first form returns the smallest k elements sorted:
partial_sort(A, k):
find the smallest k elements
sort those k elements
return themThe second form partitions the array so that the first k elements are the smallest k, but not necessarily sorted:
nth_partition(A, k):
rearrange A so that:
A[0..k-1] contains the k smallest elements
A[k..n-1] contains the remaining elementsThe first form is useful for presentation. The second form is useful as an internal step before later processing.
Heap-Based Partial Sort
A practical method uses a max-heap of size k.
top_k_smallest(A, k):
heap = empty max-heap
for each x in A:
if size(heap) < k:
push x into heap
else if x < max(heap):
pop max(heap)
push x into heap
return elements of heap sorted increasinglyThe heap stores the best k candidates seen so far. The largest element in the heap is the current cutoff. If a new value is greater than or equal to that cutoff, it cannot belong to the smallest k elements.
Invariant
After processing the first i elements, the heap contains the smallest min(i, k) elements among those i elements.
Initially the heap is empty, so the invariant holds.
When fewer than k elements have been processed, the next element must be inserted because all seen elements belong to the current candidate set.
When the heap already has k elements, compare the new element x with the heap maximum. If x is not smaller, the heap remains the smallest k elements. If x is smaller, the current maximum cannot remain in the smallest k, so it is removed and replaced by x.
At the end, the heap contains exactly the smallest k elements of the full array.
Complexity
The heap has size at most k. Each insertion or replacement costs:
O(log k)Processing n elements costs:
O(n log k)Sorting the final heap costs:
O(k log k)Total time:
O(n log k + k log k)When k is much smaller than n, this is better than full sorting:
O(n log n)The extra space usage is:
O(k)Partition-Based Partial Sort
Another method uses quickselect-style partitioning to place the kth smallest element into its final position.
partial_sort_by_partition(A, k):
quickselect(A, 0, length(A), k)
sort A[0..k-1]
return A[0..k-1]After quickselect, every element in A[0..k-1] is less than or equal to every element in A[k..n-1], but the first part may not be sorted. Sorting only that prefix gives the smallest k elements in order.
Expected time with randomized partitioning:
O(n + k log k)Worst-case time without safeguards:
O(n^2)Correctness
For the heap method, correctness follows from the candidate invariant: after every step, the heap stores the best k elements seen so far. Once all elements have been scanned, “seen so far” means the entire array.
For the partition method, quickselect guarantees that no element after index k - 1 is smaller than an element before index k. Sorting the first k elements therefore returns exactly the smallest k elements in sorted order.
Both methods preserve the permutation condition if implemented by moves or swaps. The returned subset contains elements from the input, not newly created values.
Choosing a Method
Use the heap method when data arrives as a stream or when you cannot mutate the input. It keeps only k elements and can emit a result after one scan.
Use the partition method when the input is stored in an array and mutation is allowed. It is usually faster when k is not extremely small.
Use full sorting when k is close to n, because the overhead of special handling may no longer pay for itself.
Common Bugs
A common bug is using a min-heap for the smallest k elements. A min-heap exposes the smallest candidate, but the algorithm needs to compare new elements against the largest accepted candidate. For smallest k, use a max-heap of size k.
Another bug is forgetting that partitioning does not sort the prefix. After quickselect, A[0..k-1] contains the right elements but may appear in arbitrary order.
A third bug is mishandling k = 0 or k > n. A robust implementation should define these boundary cases explicitly.
Takeaway
Partial sorting avoids unnecessary work when only a prefix of the sorted order is needed. Heap-based methods favor streaming and immutability. Partition-based methods favor in-place array processing and lower expected runtime.