Double Hashing Search

Double hashing reduces clustering by using a second hash function to determine the step size between probes. Each key generates its own probe sequence, which distributes entries more uniformly than linear or quadratic probing.

You search by following the same probe sequence defined by both hash functions.

Problem

Given a hash table using double hashing and a key $k$ , return the associated value if it exists.

Model

Table $T$ has size $m$
Primary hash function $h_1(k)$
Secondary hash function $h_2(k)$

Probe sequence:

i_t = \big(h_1(k) + t \cdot h_2(k)\big) \bmod m

with $t = 0, 1, 2, \dots$

Constraint:

h_2(k) \neq 0 \quad \text{and} \quad \gcd(h_2(k), m) = 1

to ensure full coverage of the table.

Algorithm

double_hashing_search(T, k):
    i0 = h1(k)
    step = h2(k)
    for t from 0 to m - 1:
        i = (i0 + t * step) mod m
        if T[i] is empty:
            return NOT_FOUND
        if T[i].key == k:
            return T[i].value
    return NOT_FOUND

Example

Let:

$m = 7$
$h_1(k) = k \bmod 7$
$h_2(k) = 1 + (k \bmod 5)$

Insert:

[10, 17, 24]

For $k = 10$ :

$h_1(10) = 3$
$h_2(10) = 1 + (10 \bmod 5) = 1$

For $k = 17$ :

$h_1(17) = 3$
$h_2(17) = 1 + (17 \bmod 5) = 3$

Probe sequence for 17:

index 3 → occupied
index 6 → placed

Search for $k = 17$ :

check index 3 → no
check index 6 → match

Correctness

Each key generates a unique probe sequence determined by $h_1$ and $h_2$ . The insertion and lookup processes follow identical sequences, so any inserted key will be found.

If an empty slot appears, the key was not inserted.

Complexity

Let $\alpha = n/m$ .

case	time
average	$O(1)$
worst	$O(n)$

Double hashing minimizes clustering, so performance remains stable at higher load factors compared to other open addressing methods.

Space

O(m)

When to Use

Double hashing is appropriate when:

clustering must be minimized
high load factors are expected
uniform distribution is important

It offers better performance than linear and quadratic probing in most practical scenarios.

Implementation

class HashTable:
    def __init__(self, m=8):
        self.m = m
        self.table = [None] * m

    def _h1(self, key):
        return hash(key) % self.m

    def _h2(self, key):
        return 1 + (hash(key) % (self.m - 1))

    def get(self, key):
        i0 = self._h1(key)
        step = self._h2(key)

        for t in range(self.m):
            i = (i0 + t * step) % self.m
            if self.table[i] is None:
                return None
            k, v = self.table[i]
            if k == key:
                return v

        return None

type Entry[K comparable, V any] struct {
	Key   K
	Value V
	Used  bool
}

type HashTable[K comparable, V any] struct {
	Table []Entry[K, V]
	M     int
}

func (h *HashTable[K, V]) h1(key K) int {
	return int(uintptr(any(key))) % h.M
}

func (h *HashTable[K, V]) h2(key K) int {
	return 1 + int(uintptr(any(key)))%(h.M-1)
}

func (h *HashTable[K, V]) Get(key K) (V, bool) {
	i0 := h.h1(key)
	step := h.h2(key)

	for t := 0; t < h.M; t++ {
		i := (i0 + t*step) % h.M
		if !h.Table[i].Used {
			break
		}
		if h.Table[i].Key == key {
			return h.Table[i].Value, true
		}
	}

	var zero V
	return zero, false
}