Skip to content

VP Tree Search

Nearest neighbor search in metric space using vantage point partitioning.

VP tree search (vantage point tree) organizes points in a metric space by selecting a pivot point and partitioning others based on distance from that pivot.

It is designed for efficient nearest neighbor search using triangle inequality based pruning.

Problem

Given a set of points SS in a metric space and a query point qq, find:

argminpSd(p,q) \arg\min_{p \in S} d(p, q)

Structure

Each node stores:

fieldmeaning
vantage pointpivot point
thresholdmedian distance to partition points
leftpoints closer than threshold
rightpoints farther than threshold

Partition rule:

d(p,vp)μpleft d(p, vp) \le \mu \Rightarrow p \in left d(p,vp)>μpright d(p, vp) > \mu \Rightarrow p \in right

Algorithm

Search maintains the best distance found so far and prunes subtrees using triangle inequality.

vp_search(node, q, best):
    if node is null:
        return best

    d = distance(q, node.vp)
    best = min(best, d)

    if node.is_leaf:
        for p in node.points:
            best = min(best, distance(q, p))
        return best

    if d < node.threshold:
        best = vp_search(node.left, q, best)

        if d + best >= node.threshold:
            best = vp_search(node.right, q, best)
    else:
        best = vp_search(node.right, q, best)

        if d - best <= node.threshold:
            best = vp_search(node.left, q, best)

    return best

Example

Points:

point
(1,1)
(2,2)
(8,8)
(9,9)

Choose vantage point:

vp=(2,2) vp = (2,2)

Threshold splits into:

regionpoints
left(1,1), (2,2)
right(8,8), (9,9)

Query:

q=(2,3) q = (2,3)

Nearest neighbor:

(2,2) (2,2)

Correctness

VP trees rely on the triangle inequality:

d(a,c)d(a,b)+d(b,c) d(a,c) \le d(a,b) + d(b,c)

This allows pruning. If the minimum possible distance from the query point to any point in a subtree exceeds the best known distance, that subtree cannot contain a closer point.

Since all points are either examined directly or excluded only when provably farther, the algorithm returns the true nearest neighbor.

Complexity

Let nn be number of points.

casetime
average nearest neighborO(logn)O(\log n)
worst caseO(n)O(n)

Space complexity:

O(n) O(n)

When to Use

VP trees are useful when:

  • data lies in a metric space
  • distance function is expensive
  • nearest neighbor queries are frequent
  • triangle inequality holds strongly

Compared to k-d trees, VP trees do not depend on coordinate axes and work better in non-Euclidean spaces.

Compared to ball trees, VP trees often use simpler partition logic.

Implementation

import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class Node:
    def __init__(self, vp, threshold):
        self.vp = vp
        self.threshold = threshold
        self.left = None
        self.right = None
        self.points = None

def vp_search(node, q, best=float("inf")):
    if node is None:
        return best

    d = dist(q, node.vp)
    best = min(best, d)

    if node.points is not None:
        for p in node.points:
            best = min(best, dist(q, p))
        return best

    if node.left and node.right:
        if d < node.threshold:
            best = vp_search(node.left, q, best)
            if d + best >= node.threshold:
                best = vp_search(node.right, q, best)
        else:
            best = vp_search(node.right, q, best)
            if d - best <= node.threshold:
                best = vp_search(node.left, q, best)

    return best
import "math"

type Point []float64

func dist(a, b Point) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

type Node struct {
	VP        Point
	Threshold float64
	Left      *Node
	Right     *Node
	Points    []Point
}

func VPSearch(node *Node, q Point, best float64) float64 {
	if node == nil {
		return best
	}

	d := dist(q, node.VP)
	if d < best {
		best = d
	}

	if node.Points != nil {
		for _, p := range node.Points {
			d2 := dist(q, p)
			if d2 < best {
				best = d2
			}
		}
		return best
	}

	if node.Left != nil && node.Right != nil {
		if d < node.Threshold {
			best = VPSearch(node.Left, q, best)
			if d+best >= node.Threshold {
				best = VPSearch(node.Right, q, best)
			}
		} else {
			best = VPSearch(node.Right, q, best)
			if d-best <= node.Threshold {
				best = VPSearch(node.Left, q, best)
			}
		}
	}

	return best
}