# Interpolation Search

# Interpolation Search

Interpolation search is a search algorithm for sorted numeric arrays. Like binary search, it repeatedly narrows the search interval. Unlike binary search, it does not always choose the middle position. It estimates where the target should be based on the values at the ends of the interval.

This works well when values are close to uniformly distributed. For example, if the target is near the value at the upper end of the interval, the algorithm probes near the upper end instead of the middle.

## Problem

Given a sorted numeric array $A$ of length $n$ and a value $x$, find an index $i$ such that

$$
A[i] = x
$$

If no such index exists, return $-1$.

## Key Idea

Suppose the current interval is $[l, r]$. If values are roughly evenly spaced, the target position can be estimated by linear interpolation:

$$
m = l + \frac{(x - A[l])(r - l)}{A[r] - A[l]}
$$

The algorithm probes index $m$ and then discards the side that cannot contain $x$.

## Algorithm

```id="intp-1"
interpolation_search(A, x):
    l = 0
    r = length(A) - 1

    while l <= r and A[l] <= x and x <= A[r]:
        if A[l] == A[r]:
            if A[l] == x:
                return l
            return -1

        m = l + ((x - A[l]) * (r - l)) // (A[r] - A[l])

        if A[m] == x:
            return m
        else if A[m] < x:
            l = m + 1
        else:
            r = m - 1

    return -1
```

## Example

Let

$$
A = [10, 20, 30, 40, 50, 60, 70, 80]
$$

and

$$
x = 70
$$

The first estimate is:

$$
m = 0 + \frac{(70 - 10)(7 - 0)}{80 - 10} = 6
$$

Then:

$$
A[6] = 70
$$

Return index $6$.

## Correctness

At each step, the algorithm keeps an interval $[l, r]$ that may contain $x$. Since the array is sorted, if $A[m] < x$, then every index at or before $m$ has value less than $x$, so the algorithm moves to $[m + 1, r]$.

If $A[m] > x$, then every index at or after $m$ has value greater than $x$, so the algorithm moves to $[l, m - 1]$.

If $A[m] = x$, the returned index is correct. If the loop ends, then either the interval is empty or $x$ lies outside the value range of the interval, so no match exists.

## Complexity

| case                       | time             |
| -------------------------- | ---------------- |
| best                       | $O(1)$           |
| average, near uniform data | $O(\log \log n)$ |
| worst                      | $O(n)$           |

Space:

$$
O(1)
$$

The worst case occurs when the value distribution is highly uneven, causing the probe position to make little progress.

## Comparison with Binary Search

| method               | probe choice             | average on uniform data | worst case  |
| -------------------- | ------------------------ | ----------------------- | ----------- |
| binary search        | middle index             | $O(\log n)$             | $O(\log n)$ |
| interpolation search | estimated value position | $O(\log \log n)$        | $O(n)$      |

Binary search is more robust. Interpolation search can be faster on large, sorted, uniformly distributed numeric data.

## When to Use

Use interpolation search when:

* the array is sorted
* values are numeric
* values are roughly uniformly distributed
* random access is available

Avoid it when values are clustered, skewed, or nonnumeric.

## Implementation

```id="py-intp"
def interpolation_search(a, x):
    l, r = 0, len(a) - 1

    while l <= r and a[l] <= x <= a[r]:
        if a[l] == a[r]:
            return l if a[l] == x else -1

        m = l + ((x - a[l]) * (r - l)) // (a[r] - a[l])

        if a[m] == x:
            return m
        elif a[m] < x:
            l = m + 1
        else:
            r = m - 1

    return -1
```

```id="go-intp"
func InterpolationSearch(a []int, x int) int {
    l, r := 0, len(a)-1

    for l <= r && a[l] <= x && x <= a[r] {
        if a[l] == a[r] {
            if a[l] == x {
                return l
            }
            return -1
        }

        m := l + ((x-a[l])*(r-l))/(a[r]-a[l])

        if a[m] == x {
            return m
        } else if a[m] < x {
            l = m + 1
        } else {
            r = m - 1
        }
    }

    return -1
}
```