# Order Statistic Tree Select

# Order Statistic Tree Select

Order Statistic Tree Select finds the element with rank $k$ in a balanced binary search tree. The tree stores one extra field at each node: the size of its subtree.

With subtree sizes, selection becomes a guided walk from the root. At each node, compare $k$ with the size of the left subtree.

## Problem

Given a binary search tree node $root$ and an integer $k$, return the element with zero-based rank $k$ in sorted order.

The rank condition is:

$$
0 \le k < size(root)
$$

## Data Structure

Each node stores:

```id="osts1"
node:
    key
    left
    right
    size
```

where:

```id="osts2"
size(node) = 1 + size(node.left) + size(node.right)
```

For an empty child, size is $0$.

## Algorithm

Let $leftSize$ be the size of the left subtree.

```id="osts3"
select(node, k):
    leftSize = size(node.left)

    if k < leftSize:
        return select(node.left, k)
    else if k == leftSize:
        return node.key
    else:
        return select(node.right, k - leftSize - 1)
```

## Example

Consider the sorted order:

$$
[2, 3, 4, 7, 9]
$$

If $k = 2$, the selected element is:

$$
4
$$

The tree search reaches this value by counting how many elements lie to the left of each visited node.

## Correctness

For any node, all keys in the left subtree appear before the node key, and all keys in the right subtree appear after it.

If $k < leftSize$, the answer lies in the left subtree. If $k = leftSize$, the current node has exactly $k$ smaller elements, so it is the answer. Otherwise, the answer lies in the right subtree with adjusted rank $k - leftSize - 1$.

This preserves the rank target at each recursive step.

## Complexity

| tree type       |        time |
| --------------- | ----------: |
| balanced tree   | $O(\log n)$ |
| unbalanced tree |      $O(n)$ |

Space:

$$
O(h)
$$

for recursion depth, where $h$ is tree height. An iterative implementation uses $O(1)$ extra space.

## When to Use

Use Order Statistic Tree Select when:

* the set changes over time
* you need repeated rank selection
* sorted-array selection would require expensive updates
* both search and rank operations are needed

It is commonly implemented on top of red black trees, AVL trees, treaps, or other balanced search trees.

## Implementation

```id="osts4"
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1

def size(node):
    return node.size if node else 0

def select(root, k):
    node = root

    while node:
        left_size = size(node.left)

        if k < left_size:
            node = node.left
        elif k == left_size:
            return node.key
        else:
            k -= left_size + 1
            node = node.right

    raise IndexError("rank out of range")
```

```id="osts5"
type OSTNode struct {
	Key         int
	Left, Right *OSTNode
	Size        int
}

func nodeSize(x *OSTNode) int {
	if x == nil {
		return 0
	}
	return x.Size
}

func OrderStatisticSelect(root *OSTNode, k int) (int, bool) {
	x := root

	for x != nil {
		leftSize := nodeSize(x.Left)

		if k < leftSize {
			x = x.Left
		} else if k == leftSize {
			return x.Key, true
		} else {
			k -= leftSize + 1
			x = x.Right
		}
	}

	return 0, false
}
```

