# VP Tree Search

# VP Tree Search

VP tree search (vantage point tree) organizes points in a metric space by selecting a pivot point and partitioning others based on distance from that pivot.

It is designed for efficient nearest neighbor search using triangle inequality based pruning.

## Problem

Given a set of points $S$ in a metric space and a query point $q$, find:

$$
\arg\min_{p \in S} d(p, q)
$$

## Structure

Each node stores:

| field         | meaning                             |
| ------------- | ----------------------------------- |
| vantage point | pivot point                         |
| threshold     | median distance to partition points |
| left          | points closer than threshold        |
| right         | points farther than threshold       |

Partition rule:

$$
d(p, vp) \le \mu \Rightarrow p \in left
$$

$$
d(p, vp) > \mu \Rightarrow p \in right
$$

## Algorithm

Search maintains the best distance found so far and prunes subtrees using triangle inequality.

```text id="vp0k7q"
vp_search(node, q, best):
    if node is null:
        return best

    d = distance(q, node.vp)
    best = min(best, d)

    if node.is_leaf:
        for p in node.points:
            best = min(best, distance(q, p))
        return best

    if d < node.threshold:
        best = vp_search(node.left, q, best)

        if d + best >= node.threshold:
            best = vp_search(node.right, q, best)
    else:
        best = vp_search(node.right, q, best)

        if d - best <= node.threshold:
            best = vp_search(node.left, q, best)

    return best
```

## Example

Points:

| point |
| ----- |
| (1,1) |
| (2,2) |
| (8,8) |
| (9,9) |

Choose vantage point:

$$
vp = (2,2)
$$

Threshold splits into:

| region | points       |
| ------ | ------------ |
| left   | (1,1), (2,2) |
| right  | (8,8), (9,9) |

Query:

$$
q = (2,3)
$$

Nearest neighbor:

$$
(2,2)
$$

## Correctness

VP trees rely on the triangle inequality:

$$
d(a,c) \le d(a,b) + d(b,c)
$$

This allows pruning. If the minimum possible distance from the query point to any point in a subtree exceeds the best known distance, that subtree cannot contain a closer point.

Since all points are either examined directly or excluded only when provably farther, the algorithm returns the true nearest neighbor.

## Complexity

Let $n$ be number of points.

| case                     | time        |
| ------------------------ | ----------- |
| average nearest neighbor | $O(\log n)$ |
| worst case               | $O(n)$      |

Space complexity:

$$
O(n)
$$

## When to Use

VP trees are useful when:

* data lies in a metric space
* distance function is expensive
* nearest neighbor queries are frequent
* triangle inequality holds strongly

Compared to k-d trees, VP trees do not depend on coordinate axes and work better in non-Euclidean spaces.

Compared to ball trees, VP trees often use simpler partition logic.

## Implementation

```python id="vp1m8v"
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class Node:
    def __init__(self, vp, threshold):
        self.vp = vp
        self.threshold = threshold
        self.left = None
        self.right = None
        self.points = None

def vp_search(node, q, best=float("inf")):
    if node is None:
        return best

    d = dist(q, node.vp)
    best = min(best, d)

    if node.points is not None:
        for p in node.points:
            best = min(best, dist(q, p))
        return best

    if node.left and node.right:
        if d < node.threshold:
            best = vp_search(node.left, q, best)
            if d + best >= node.threshold:
                best = vp_search(node.right, q, best)
        else:
            best = vp_search(node.right, q, best)
            if d - best <= node.threshold:
                best = vp_search(node.left, q, best)

    return best
```

```go id="vp2n7x"
import "math"

type Point []float64

func dist(a, b Point) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

type Node struct {
	VP        Point
	Threshold float64
	Left      *Node
	Right     *Node
	Points    []Point
}

func VPSearch(node *Node, q Point, best float64) float64 {
	if node == nil {
		return best
	}

	d := dist(q, node.VP)
	if d < best {
		best = d
	}

	if node.Points != nil {
		for _, p := range node.Points {
			d2 := dist(q, p)
			if d2 < best {
				best = d2
			}
		}
		return best
	}

	if node.Left != nil && node.Right != nil {
		if d < node.Threshold {
			best = VPSearch(node.Left, q, best)
			if d+best >= node.Threshold {
				best = VPSearch(node.Right, q, best)
			}
		} else {
			best = VPSearch(node.Right, q, best)
			if d-best <= node.Threshold {
				best = VPSearch(node.Left, q, best)
			}
		}
	}

	return best
}
```

