Skip to content

Double Hashing Search

Resolve hash collisions using a second hash function to define probe steps.

Double hashing reduces clustering by using a second hash function to determine the step size between probes. Each key generates its own probe sequence, which distributes entries more uniformly than linear or quadratic probing.

You search by following the same probe sequence defined by both hash functions.

Problem

Given a hash table using double hashing and a key kk, return the associated value if it exists.

Model

  • Table TT has size mm
  • Primary hash function h1(k)h_1(k)
  • Secondary hash function h2(k)h_2(k)

Probe sequence:

it=(h1(k)+th2(k))modm i_t = \big(h_1(k) + t \cdot h_2(k)\big) \bmod m

with t=0,1,2,t = 0, 1, 2, \dots

Constraint:

h2(k)0andgcd(h2(k),m)=1 h_2(k) \neq 0 \quad \text{and} \quad \gcd(h_2(k), m) = 1

to ensure full coverage of the table.

Algorithm

double_hashing_search(T, k):
    i0 = h1(k)
    step = h2(k)
    for t from 0 to m - 1:
        i = (i0 + t * step) mod m
        if T[i] is empty:
            return NOT_FOUND
        if T[i].key == k:
            return T[i].value
    return NOT_FOUND

Example

Let:

  • m=7m = 7
  • h1(k)=kmod7h_1(k) = k \bmod 7
  • h2(k)=1+(kmod5)h_2(k) = 1 + (k \bmod 5)

Insert:

[10,17,24] [10, 17, 24]

For k=10k = 10:

  • h1(10)=3h_1(10) = 3
  • h2(10)=1+(10mod5)=1h_2(10) = 1 + (10 \bmod 5) = 1

For k=17k = 17:

  • h1(17)=3h_1(17) = 3
  • h2(17)=1+(17mod5)=3h_2(17) = 1 + (17 \bmod 5) = 3

Probe sequence for 17:

  • index 3 → occupied
  • index 6 → placed

Search for k=17k = 17:

  • check index 3 → no
  • check index 6 → match

Correctness

Each key generates a unique probe sequence determined by h1h_1 and h2h_2. The insertion and lookup processes follow identical sequences, so any inserted key will be found.

If an empty slot appears, the key was not inserted.

Complexity

Let α=n/m\alpha = n/m.

casetime
averageO(1)O(1)
worstO(n)O(n)

Double hashing minimizes clustering, so performance remains stable at higher load factors compared to other open addressing methods.

Space

O(m) O(m)

When to Use

Double hashing is appropriate when:

  • clustering must be minimized
  • high load factors are expected
  • uniform distribution is important

It offers better performance than linear and quadratic probing in most practical scenarios.

Implementation

class HashTable:
    def __init__(self, m=8):
        self.m = m
        self.table = [None] * m

    def _h1(self, key):
        return hash(key) % self.m

    def _h2(self, key):
        return 1 + (hash(key) % (self.m - 1))

    def get(self, key):
        i0 = self._h1(key)
        step = self._h2(key)

        for t in range(self.m):
            i = (i0 + t * step) % self.m
            if self.table[i] is None:
                return None
            k, v = self.table[i]
            if k == key:
                return v

        return None
type Entry[K comparable, V any] struct {
	Key   K
	Value V
	Used  bool
}

type HashTable[K comparable, V any] struct {
	Table []Entry[K, V]
	M     int
}

func (h *HashTable[K, V]) h1(key K) int {
	return int(uintptr(any(key))) % h.M
}

func (h *HashTable[K, V]) h2(key K) int {
	return 1 + int(uintptr(any(key)))%(h.M-1)
}

func (h *HashTable[K, V]) Get(key K) (V, bool) {
	i0 := h.h1(key)
	step := h.h2(key)

	for t := 0; t < h.M; t++ {
		i := (i0 + t*step) % h.M
		if !h.Table[i].Used {
			break
		}
		if h.Table[i].Key == key {
			return h.Table[i].Value, true
		}
	}

	var zero V
	return zero, false
}