Hash Index Probe

Hash index probe retrieves records from a table using a hash index. The index maps key values to locations of rows, enabling fast equality lookups.

You use it in databases and storage systems when queries involve exact key matches.

Problem

Given a table with a hash index on attribute $k$, and a query value $x$, retrieve all rows such that:

$$ row.k = x $$

Model

A hash index consists of:

a hash function $h(k)$
an index table mapping hash buckets to row references

The index stores:

$$ I[h(k)] \rightarrow {\text{row identifiers}} $$

To query a key $x$, compute:

$$ i = h(x) $$

Then retrieve all row identifiers stored in bucket $I[i]$.

Algorithm

hash_index_probe(I, x):
    i = h(x)
    bucket = I[i]

    result = empty list
    for each rid in bucket:
        row = fetch(rid)
        if row.key == x:
            append row to result

    return result

The equality check is required because different keys may hash to the same bucket.

Example

Table:

id	name
1	Ada
2	Linus
3	Grace
4	Ada

Hash index on name:

hash bucket	row ids
h("Ada")	[1, 4]
h("Linus")	[2]
h("Grace")	[3]

Query:

$$ x = \text{"Ada"} $$

Steps:

compute $h("Ada")$
retrieve bucket → [1, 4]
fetch rows 1 and 4
both match

Result:

id	name
1	Ada
4	Ada

Correctness

Every row with key value $k$ is inserted into the bucket corresponding to $h(k)$. Therefore, all rows with $k = x$ appear in bucket $h(x)$.

The probe scans that bucket and filters rows by exact comparison, ensuring that all matching rows are returned and no incorrect rows are included.

Complexity

Let $b$ be the bucket size.

operation	time
lookup	$O(1 + b)$

Expected constant time when the hash function distributes keys uniformly and buckets remain small.

Worst case:

$$ O(n) $$

when all keys collide into one bucket.

Space

$$ O(n) $$

The index stores references to all rows.

When to Use

Hash index probe is appropriate when:

queries use equality conditions
fast point lookups are required
data is not sorted
range queries are not needed

It is widely used in database engines, key value stores, and indexing systems.

Implementation

from collections import defaultdict

class HashIndex:
    def __init__(self):
        self.index = defaultdict(list)
        self.rows = {}

    def insert(self, rid, row, key):
        self.rows[rid] = row
        self.index[row[key]].append(rid)

    def probe(self, key, value):
        result = []
        for rid in self.index.get(value, []):
            row = self.rows[rid]
            if row[key] == value:
                result.append(row)
        return result

type Row map[string]string

type HashIndex struct {
	Index map[string][]int
	Rows  map[int]Row
}

func (h *HashIndex) Insert(rid int, row Row, key string) {
	if h.Index == nil {
		h.Index = make(map[string][]int)
		h.Rows = make(map[int]Row)
	}
	h.Rows[rid] = row
	v := row[key]
	h.Index[v] = append(h.Index[v], rid)
}

func (h *HashIndex) Probe(key string, value string) []Row {
	var result []Row

	for _, rid := range h.Index[value] {
		row := h.Rows[rid]
		if row[key] == value {
			result = append(result, row)
		}
	}

	return result
}