Search in an open addressing hash table that keeps probe distances balanced across keys.
Robin Hood hashing is an open addressing method that tries to reduce variance in probe lengths. During insertion, a key with a larger probe distance may displace a key with a smaller probe distance. This rule gives longer-searching keys priority and keeps lookup times more uniform.
Lookup follows the same linear probe sequence as ordinary linear probing, but it can stop early when the current entry has a smaller probe distance than the query key.
Problem
Given a Robin Hood hash table and a key , return the associated value if it exists.
Model
- Table has size
- Hash function maps keys to home buckets
- Probe distance of a key at index is:
During lookup, the query also has a current probe distance. If the current entry’s distance is smaller than the query distance, the key cannot appear later in the cluster.
Algorithm
robin_hood_lookup(T, k):
home = h(k)
for d from 0 to m - 1:
i = (home + d) mod m
if T[i] is empty:
return NOT_FOUND
if T[i].key == k:
return T[i].value
entry_home = h(T[i].key)
entry_distance = (i - entry_home) mod m
if entry_distance < d:
return NOT_FOUND
return NOT_FOUNDExample
Let:
Suppose a table contains:
| index | key | home | distance |
|---|---|---|---|
| 3 | 11 | 3 | 0 |
| 4 | 19 | 3 | 1 |
| 5 | 27 | 3 | 2 |
| 6 | 14 | 6 | 0 |
Search for key , whose home bucket is:
The lookup checks:
| probe distance | index | key | entry distance |
|---|---|---|---|
| 0 | 3 | 11 | 0 |
| 1 | 4 | 19 | 1 |
| 2 | 5 | 27 | 2 |
| 3 | 6 | 14 | 0 |
At index 6, the stored entry has distance 0, which is less than the query distance 3. Therefore key 35 cannot appear later, and the lookup stops.
Correctness
Robin Hood insertion keeps entries ordered so that probe distances in a cluster do not decrease before all keys with larger search distance have been considered. If lookup reaches an empty slot, the key was never placed in the table.
If lookup reaches an entry whose probe distance is smaller than the query’s current distance, insertion would have placed the query key before that entry. Therefore the queried key cannot appear later in the same cluster.
Complexity
| case | time |
|---|---|
| average | |
| worst |
Robin Hood hashing does not improve the theoretical worst case, but it reduces the spread of probe lengths in practice.
Space
The table stores entries directly. Some implementations also store cached hashes or probe distances to avoid recomputing them during lookup.
When to Use
Robin Hood hashing is appropriate when:
- lookup latency should be predictable
- cache locality matters
- open addressing is preferred
- high load factors are expected
It is a strong default for practical hash tables when deletion and resizing are implemented carefully.
Implementation
class RobinHoodHashTable:
def __init__(self, m=16):
self.m = m
self.table = [None] * m
def _hash(self, key):
return hash(key) % self.m
def get(self, key):
home = self._hash(key)
for d in range(self.m):
i = (home + d) % self.m
entry = self.table[i]
if entry is None:
return None
k, v = entry
if k == key:
return v
entry_home = self._hash(k)
entry_distance = (i - entry_home) % self.m
if entry_distance < d:
return None
return Nonetype Entry[K comparable, V any] struct {
Key K
Value V
Used bool
}
type RobinHoodHashTable[K comparable, V any] struct {
Table []Entry[K, V]
M int
}
func (h *RobinHoodHashTable[K, V]) hash(key K) int {
return int(uintptr(any(key))) % h.M
}
func (h *RobinHoodHashTable[K, V]) Get(key K) (V, bool) {
home := h.hash(key)
for d := 0; d < h.M; d++ {
i := (home + d) % h.M
if !h.Table[i].Used {
break
}
if h.Table[i].Key == key {
return h.Table[i].Value, true
}
entryHome := h.hash(h.Table[i].Key)
entryDistance := (i - entryHome + h.M) % h.M
if entryDistance < d {
break
}
}
var zero V
return zero, false
}