15.5 Closest Pair of Points
Given \(n\) points on a two-dimensional plane, find the pair of points whose Euclidean distance is minimal.
15.5 Closest Pair of Points
Problem
Given (n) points on a two-dimensional plane, find the pair of points whose Euclidean distance is minimal.
For example:
| Point | Coordinates |
|---|---|
| A | (2, 3) |
| B | (12, 30) |
| C | (40, 50) |
| D | (5, 1) |
| E | (12, 10) |
| F | (3, 4) |
The closest pair is:
A(2,3)
F(3,4)
with distance:
$$ \sqrt{(3-2)^2+(4-3)^2} $$
A straightforward solution compares every pair of points.
With (n) points:
$$ \binom{n}{2} = \frac{n(n-1)}{2} $$
comparisons are required.
Time complexity:
$$ O(n^2) $$
For millions of points, this becomes prohibitively expensive.
Can divide and conquer do better?
Solution
Yes.
The closest-pair problem can be solved in:
$$ O(n\log n) $$
using divide and conquer.
The key observation is that after solving the left half and right half independently, only a small number of cross-boundary candidates must be examined.
This dramatically reduces the search space.
Brute Force Baseline
The naive approach:
func ClosestPair(points []Point) float64 {
best := math.Inf(1)
for i := 0; i < len(points); i++ {
for j := i + 1; j < len(points); j++ {
best = min(best,
distance(points[i], points[j]))
}
}
return best
}
Complexity:
$$ O(n^2) $$
This serves as the baseline.
Our divide-and-conquer algorithm must improve upon it.
Divide Step
Sort points by x-coordinate:
(1,8)
(2,4)
(3,7)
(4,2)
(6,6)
(8,1)
(9,5)
(10,3)
Split at the median:
Left:
(1,8)
(2,4)
(3,7)
(4,2)
Right:
(6,6)
(8,1)
(9,5)
(10,3)
Each side contains approximately:
$$ n/2 $$
points.
Recursive Step
Recursively compute:
dL = closest pair in left half
dR = closest pair in right half
Let:
$$ d=\min(d_L,d_R) $$
At this point we know:
- No pair inside the left side is closer than (d).
- No pair inside the right side is closer than (d).
Only one possibility remains:
A closer pair may cross the dividing line.
The Critical Observation
Suppose:
$$ d=5 $$
Any point farther than 5 units from the dividing line cannot participate in a better cross-boundary solution.
Therefore we only need to examine points inside a vertical strip:
| median
|
<---- d ---->
The strip width is:
$$ 2d $$
All candidate points must lie inside this region.
This reduces the search space dramatically.
Strip Construction
Collect all points satisfying:
$$ |x - x_{mid}| < d $$
Example:
xmid
|
|
* |
| *
* |
|
* *
Only these points require additional consideration.
Every other point is guaranteed to be too far away.
Sort Strip by Y Coordinate
Inside the strip, sort by y-coordinate:
(4,2)
(10,3)
(2,4)
(9,5)
(6,6)
(3,7)
(1,8)
This ordering creates a powerful geometric constraint.
Why Only Seven Comparisons?
This is the heart of the algorithm.
Suppose we inspect a point:
P
and look upward in y-order.
A geometric packing argument shows:
A point can have at most seven relevant successors.
Checking beyond seven points cannot produce a smaller distance.
Therefore:
for each point
compare with next 7 points
is sufficient.
Not:
compare with all remaining points
This converts the strip processing cost from quadratic to linear.
Strip Processing
Pseudo-code:
best = d
for each point i
for next 7 points j
best = min(best,
distance(i,j))
Complexity:
$$ O(n) $$
for the entire strip.
This is the key breakthrough.
Complete Recurrence
The algorithm performs:
- Divide into two halves.
- Solve left half.
- Solve right half.
- Process strip.
Recurrence:
$$ T(n)=2T(n/2)+O(n) $$
From the Master Theorem:
$$ T(n)=O(n\log n) $$
This is a dramatic improvement over:
$$ O(n^2) $$
High-Level Implementation
func Closest(points []Point) float64 {
sortByX(points)
return solve(points)
}
Recursive core:
func solve(points []Point) float64 {
n := len(points)
if n <= 3 {
return bruteForce(points)
}
mid := n / 2
left := solve(points[:mid])
right := solve(points[mid:])
d := min(left, right)
strip := buildStrip(points, d)
return min(d,
closestInStrip(strip, d))
}
Production implementations maintain both x-sorted and y-sorted arrays to avoid repeated sorting.
Correctness Intuition
Assume the true closest pair exists.
Three possibilities exist:
Both Points in Left Half
Found by recursive solution.
Both Points in Right Half
Found by recursive solution.
One Point on Each Side
Must lie inside the strip.
Found during strip processing.
No other possibility exists.
Therefore the algorithm always finds the global optimum.
Geometric Insight
Many divide-and-conquer algorithms spend most of their effort combining solutions.
Closest-pair succeeds because geometry dramatically limits what can happen across the boundary.
The strip appears to contain many points, but geometric packing prevents dense clustering beyond a small constant factor.
This hidden geometric structure is what transforms a quadratic problem into an (O(n \log n)) algorithm.
Complexity Analysis
| Operation | Cost |
|---|---|
| Divide | (O(1)) |
| Recursive calls | (2T(n/2)) |
| Build strip | (O(n)) |
| Strip scan | (O(n)) |
Recurrence:
$$ T(n)=2T(n/2)+O(n) $$
Result:
$$ T(n)=O(n\log n) $$
Space:
$$ O(n) $$
Common Mistakes
Re-sorting at Every Level
Sorting inside every recursive call produces:
$$ O(n\log^2 n) $$
instead of:
$$ O(n\log n) $$
Maintain sorted structures throughout recursion.
Comparing Too Many Strip Points
The geometric proof allows only a constant number of comparisons.
Comparing every pair inside the strip destroys performance.
Incorrect Median Handling
Points near the dividing line must be included carefully.
Off-by-one errors frequently miss valid candidates.
Forgetting Small Inputs
For:
n ≤ 3
brute force is simpler and often faster.
Discussion
The closest-pair algorithm is one of the most elegant examples of divide and conquer. Unlike sorting algorithms, where the benefits are immediately visible, the improvement here comes from exploiting geometric structure. The recursive decomposition alone is insufficient; the crucial insight is that only a tiny subset of cross-boundary pairs can possibly improve the solution.
This pattern appears repeatedly in advanced algorithms. A problem that initially seems to require examining all pairwise interactions often contains hidden constraints that drastically reduce the number of meaningful candidates. Recognizing and exploiting those constraints is one of the defining skills of algorithm design.
The next sections continue this theme by applying divide-and-conquer techniques to arithmetic itself, beginning with Karatsuba's multiplication algorithm, which breaks the long-standing quadratic barrier for multiplying large integers.