Skip to content

Hash Index Probe

Retrieve rows by probing a hash-based index structure using an exact key match.

Hash index probe retrieves records from a table using a hash index. The index maps key values to locations of rows, enabling fast equality lookups.

You use it in databases and storage systems when queries involve exact key matches.

Problem

Given a table with a hash index on attribute kk, and a query value xx, retrieve all rows such that:

row.k=x row.k = x

Model

A hash index consists of:

  • a hash function h(k)h(k)
  • an index table mapping hash buckets to row references

The index stores:

I[h(k)]row identifiers I[h(k)] \rightarrow {\text{row identifiers}}

To query a key xx, compute:

i=h(x) i = h(x)

Then retrieve all row identifiers stored in bucket I[i]I[i].

Algorithm

hash_index_probe(I, x):
    i = h(x)
    bucket = I[i]

    result = empty list
    for each rid in bucket:
        row = fetch(rid)
        if row.key == x:
            append row to result

    return result

The equality check is required because different keys may hash to the same bucket.

Example

Table:

idname
1Ada
2Linus
3Grace
4Ada

Hash index on name:

hash bucketrow ids
h(“Ada”)[1, 4]
h(“Linus”)[2]
h(“Grace”)[3]

Query:

x="Ada" x = \text{"Ada"}

Steps:

  • compute h("Ada")h("Ada")
  • retrieve bucket → [1, 4]
  • fetch rows 1 and 4
  • both match

Result:

idname
1Ada
4Ada

Correctness

Every row with key value kk is inserted into the bucket corresponding to h(k)h(k). Therefore, all rows with k=xk = x appear in bucket h(x)h(x).

The probe scans that bucket and filters rows by exact comparison, ensuring that all matching rows are returned and no incorrect rows are included.

Complexity

Let bb be the bucket size.

operationtime
lookupO(1+b)O(1 + b)

Expected constant time when the hash function distributes keys uniformly and buckets remain small.

Worst case:

O(n) O(n)

when all keys collide into one bucket.

Space

O(n) O(n)

The index stores references to all rows.

When to Use

Hash index probe is appropriate when:

  • queries use equality conditions
  • fast point lookups are required
  • data is not sorted
  • range queries are not needed

It is widely used in database engines, key value stores, and indexing systems.

Implementation

from collections import defaultdict

class HashIndex:
    def __init__(self):
        self.index = defaultdict(list)
        self.rows = {}

    def insert(self, rid, row, key):
        self.rows[rid] = row
        self.index[row[key]].append(rid)

    def probe(self, key, value):
        result = []
        for rid in self.index.get(value, []):
            row = self.rows[rid]
            if row[key] == value:
                result.append(row)
        return result
type Row map[string]string

type HashIndex struct {
	Index map[string][]int
	Rows  map[int]Row
}

func (h *HashIndex) Insert(rid int, row Row, key string) {
	if h.Index == nil {
		h.Index = make(map[string][]int)
		h.Rows = make(map[int]Row)
	}
	h.Rows[rid] = row
	v := row[key]
	h.Index[v] = append(h.Index[v], rid)
}

func (h *HashIndex) Probe(key string, value string) []Row {
	var result []Row

	for _, rid := range h.Index[value] {
		row := h.Rows[rid]
		if row[key] == value {
			result = append(result, row)
		}
	}

	return result
}