Skip to content

Top K by Quickselect

Find the k largest elements using partition-based selection in linear time.

Top K by Quickselect finds the kk largest elements by partitioning the array so that the top kk elements occupy one side. It uses the same idea as Quickselect, but returns a set instead of a single element.

This method is faster than heap-based approaches when the data is in memory and only a single pass with partitioning is needed.

Problem

Given an array AA of length nn and an integer kk, return the kk largest elements of AA.

The output does not need to be sorted unless required.

Algorithm

Find the (nk)(n - k)-th smallest element using Quickselect, then take all elements greater than or equal to it.

top_k_by_quickselect(A, k):
    n = length(A)
    target = n - k

    nth_element(A, target)

    return A[target..n-1]

If sorted output is required, sort the resulting segment.

Example

Let

A=[7,2,9,4,3] A = [7, 2, 9, 4, 3]

and

k=3 k = 3

We compute:

target=53=2 target = 5 - 3 = 2

After partition:

[2,3,4,9,7] [2, 3, 4, 9, 7]

Take:

[4,9,7] [4, 9, 7]

These are the top 3 elements.

Correctness

After applying Nth Element at index nkn - k, all elements to the right are greater than or equal to A[nk]A[n-k]. Therefore they represent the kk largest elements.

The partition property ensures correctness without full sorting.

Complexity

casetime
averageO(n)O(n)
worstO(n2)O(n^2)

With randomized pivot or introselect fallback:

O(n) O(n)

Sorting the final kk elements adds:

O(klogk) O(k \log k)

Space:

O(1) O(1)

When to Use

Use this method when:

  • data fits in memory
  • you want optimal average performance
  • ordering of the top kk is not required or can be added later

For streaming data or strict memory bounds, heap-based methods are more appropriate.

Implementation

import random

def top_k_by_quickselect(a, k):
    def partition(left, right, pivot_index):
        pivot = a[pivot_index]
        a[pivot_index], a[right] = a[right], a[pivot_index]
        store = left
        for i in range(left, right):
            if a[i] < pivot:
                a[store], a[i] = a[i], a[store]
                store += 1
        a[right], a[store] = a[store], a[right]
        return store

    def nth_element(left, right, k):
        while True:
            if left == right:
                return

            pivot_index = random.randint(left, right)
            pivot_index = partition(left, right, pivot_index)

            if pivot_index == k:
                return
            elif k < pivot_index:
                right = pivot_index - 1
            else:
                left = pivot_index + 1

    n = len(a)
    target = n - k
    nth_element(0, n - 1, target)

    return a[target:]
import "math/rand"

func TopKByQuickselect(a []int, k int) []int {
	var partition func([]int, int, int, int) int

	partition = func(a []int, left, right, pivotIndex int) int {
		pivot := a[pivotIndex]
		a[pivotIndex], a[right] = a[right], a[pivotIndex]

		store := left
		for i := left; i < right; i++ {
			if a[i] < pivot {
				a[store], a[i] = a[i], a[store]
				store++
			}
		}
		a[right], a[store] = a[store], a[right]
		return store
	}

	var nthElement func(int, int, int)

	nthElement = func(left, right, k int) {
		for {
			if left == right {
				return
			}

			pivotIndex := left + rand.Intn(right-left+1)
			pivotIndex = partition(a, left, right, pivotIndex)

			if pivotIndex == k {
				return
			} else if k < pivotIndex {
				right = pivotIndex - 1
			} else {
				left = pivotIndex + 1
			}
		}
	}

	n := len(a)
	target := n - k
	nthElement(0, n-1, target)

	return a[target:]
}