Skip to content

Cover Tree Search

Nearest neighbor search in metric space using hierarchical covers with exponential scale separation.

Cover tree search is a metric space search structure designed for nearest neighbor queries. It builds a hierarchy of points where each level covers the level below with exponentially shrinking distance bounds.

It provides strong theoretical guarantees in spaces with bounded intrinsic dimension.

Problem

Given a set of points SS in a metric space and a query point qq, find:

argminpSd(p,q) \arg\min_{p \in S} d(p, q)

Structure

Each level ii of the tree has a scale:

2i 2^i

Each node satisfies:

propertymeaning
nestingevery point appears in all lower levels
coveringchildren are within bounded distance
separationnodes at same level are well separated

Each node stores a point and pointers to children at the next finer level.

Algorithm

Search maintains a candidate best distance and prunes using scale bounds.

cover_tree_search(node, q, best):
    if node is null:
        return best

    d = distance(q, node.point)
    best = min(best, d)

    if node.is_leaf:
        return best

    for child in node.children:
        if distance(q, child.point) <= best + scale(child.level):
            best = cover_tree_search(child, q, best)

    return best

Example

Points in 2D:

point
(1,1)
(2,2)
(4,4)
(8,8)

Hierarchy:

levelscale
38
24
12
01

Query:

q=(3,3) q = (3,3)

Nearest neighbor:

(2,2) (2,2)

Correctness

The cover tree enforces two invariants:

  • Covering: every point at level ii is within distance 2i2^i of its parent
  • Separation: distinct nodes at level ii are at least 2i2^i apart

These properties guarantee that any point closer than the current best candidate must lie within a reachable subtree under the pruning condition.

Since all candidates that can improve the result are explored, and all pruned subtrees are provably too far to contain a closer point, the algorithm returns the true nearest neighbor.

Complexity

Let nn be number of points and intrinsic dimension be bounded.

casetime
average nearest neighborO(logn)O(\log n)
worst caseO(n)O(n)

Space complexity:

O(n) O(n)

When to Use

Cover trees are useful when:

  • data lies in general metric spaces
  • intrinsic dimension is low or moderate
  • nearest neighbor queries dominate workload
  • triangle inequality holds but coordinates are not available

Compared to k-d trees and ball trees, cover trees provide stronger theoretical guarantees in high dimensional metric spaces.

Implementation

import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class Node:
    def __init__(self, point, level):
        self.point = point
        self.level = level
        self.children = []

def scale(level):
    return 2 ** level

def cover_tree_search(node, q, best=float("inf")):
    if node is None:
        return best

    d = dist(q, node.point)
    if d < best:
        best = d

    for child in node.children:
        if dist(q, child.point) <= best + scale(child.level):
            best = cover_tree_search(child, q, best)

    return best
import "math"

type Point []float64

func dist(a, b Point) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

type Node struct {
	Point    Point
	Level    int
	Children []*Node
}

func scale(level int) float64 {
	return math.Pow(2, float64(level))
}

func CoverTreeSearch(node *Node, q Point, best float64) float64 {
	if node == nil {
		return best
	}

	d := dist(q, node.Point)
	if d < best {
		best = d
	}

	for _, child := range node.Children {
		if dist(q, child.Point) <= best+scale(child.Level) {
			best = CoverTreeSearch(child, q, best)
		}
	}

	return best
}