Skip to content

Order Statistic Tree Select

Select the element with a given rank from a balanced search tree augmented with subtree sizes.

Order Statistic Tree Select finds the element with rank kk in a balanced binary search tree. The tree stores one extra field at each node: the size of its subtree.

With subtree sizes, selection becomes a guided walk from the root. At each node, compare kk with the size of the left subtree.

Problem

Given a binary search tree node rootroot and an integer kk, return the element with zero-based rank kk in sorted order.

The rank condition is:

0k<size(root) 0 \le k < size(root)

Data Structure

Each node stores:

node:
    key
    left
    right
    size

where:

size(node) = 1 + size(node.left) + size(node.right)

For an empty child, size is 00.

Algorithm

Let leftSizeleftSize be the size of the left subtree.

select(node, k):
    leftSize = size(node.left)

    if k < leftSize:
        return select(node.left, k)
    else if k == leftSize:
        return node.key
    else:
        return select(node.right, k - leftSize - 1)

Example

Consider the sorted order:

[2,3,4,7,9] [2, 3, 4, 7, 9]

If k=2k = 2, the selected element is:

4 4

The tree search reaches this value by counting how many elements lie to the left of each visited node.

Correctness

For any node, all keys in the left subtree appear before the node key, and all keys in the right subtree appear after it.

If k<leftSizek < leftSize, the answer lies in the left subtree. If k=leftSizek = leftSize, the current node has exactly kk smaller elements, so it is the answer. Otherwise, the answer lies in the right subtree with adjusted rank kleftSize1k - leftSize - 1.

This preserves the rank target at each recursive step.

Complexity

tree typetime
balanced treeO(logn)O(\log n)
unbalanced treeO(n)O(n)

Space:

O(h) O(h)

for recursion depth, where hh is tree height. An iterative implementation uses O(1)O(1) extra space.

When to Use

Use Order Statistic Tree Select when:

  • the set changes over time
  • you need repeated rank selection
  • sorted-array selection would require expensive updates
  • both search and rank operations are needed

It is commonly implemented on top of red black trees, AVL trees, treaps, or other balanced search trees.

Implementation

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1

def size(node):
    return node.size if node else 0

def select(root, k):
    node = root

    while node:
        left_size = size(node.left)

        if k < left_size:
            node = node.left
        elif k == left_size:
            return node.key
        else:
            k -= left_size + 1
            node = node.right

    raise IndexError("rank out of range")
type OSTNode struct {
	Key         int
	Left, Right *OSTNode
	Size        int
}

func nodeSize(x *OSTNode) int {
	if x == nil {
		return 0
	}
	return x.Size
}

func OrderStatisticSelect(root *OSTNode, k int) (int, bool) {
	x := root

	for x != nil {
		leftSize := nodeSize(x.Left)

		if k < leftSize {
			x = x.Left
		} else if k == leftSize {
			return x.Key, true
		} else {
			k -= leftSize + 1
			x = x.Right
		}
	}

	return 0, false
}