Search a sorted numeric array by estimating the likely position of the target from its value.
Interpolation search is a search algorithm for sorted numeric arrays. Like binary search, it repeatedly narrows the search interval. Unlike binary search, it does not always choose the middle position. It estimates where the target should be based on the values at the ends of the interval.
This works well when values are close to uniformly distributed. For example, if the target is near the value at the upper end of the interval, the algorithm probes near the upper end instead of the middle.
Problem
Given a sorted numeric array of length and a value , find an index such that
If no such index exists, return .
Key Idea
Suppose the current interval is . If values are roughly evenly spaced, the target position can be estimated by linear interpolation:
The algorithm probes index and then discards the side that cannot contain .
Algorithm
interpolation_search(A, x):
l = 0
r = length(A) - 1
while l <= r and A[l] <= x and x <= A[r]:
if A[l] == A[r]:
if A[l] == x:
return l
return -1
m = l + ((x - A[l]) * (r - l)) // (A[r] - A[l])
if A[m] == x:
return m
else if A[m] < x:
l = m + 1
else:
r = m - 1
return -1Example
Let
and
The first estimate is:
Then:
Return index .
Correctness
At each step, the algorithm keeps an interval that may contain . Since the array is sorted, if , then every index at or before has value less than , so the algorithm moves to .
If , then every index at or after has value greater than , so the algorithm moves to .
If , the returned index is correct. If the loop ends, then either the interval is empty or lies outside the value range of the interval, so no match exists.
Complexity
| case | time |
|---|---|
| best | |
| average, near uniform data | |
| worst |
Space:
The worst case occurs when the value distribution is highly uneven, causing the probe position to make little progress.
Comparison with Binary Search
| method | probe choice | average on uniform data | worst case |
|---|---|---|---|
| binary search | middle index | ||
| interpolation search | estimated value position |
Binary search is more robust. Interpolation search can be faster on large, sorted, uniformly distributed numeric data.
When to Use
Use interpolation search when:
- the array is sorted
- values are numeric
- values are roughly uniformly distributed
- random access is available
Avoid it when values are clustered, skewed, or nonnumeric.
Implementation
def interpolation_search(a, x):
l, r = 0, len(a) - 1
while l <= r and a[l] <= x <= a[r]:
if a[l] == a[r]:
return l if a[l] == x else -1
m = l + ((x - a[l]) * (r - l)) // (a[r] - a[l])
if a[m] == x:
return m
elif a[m] < x:
l = m + 1
else:
r = m - 1
return -1func InterpolationSearch(a []int, x int) int {
l, r := 0, len(a)-1
for l <= r && a[l] <= x && x <= a[r] {
if a[l] == a[r] {
if a[l] == x {
return l
}
return -1
}
m := l + ((x-a[l])*(r-l))/(a[r]-a[l])
if a[m] == x {
return m
} else if a[m] < x {
l = m + 1
} else {
r = m - 1
}
}
return -1
}