# Hash Index Probe

# Hash Index Probe

Hash index probe retrieves records from a table using a hash index. The index maps key values to locations of rows, enabling fast equality lookups.

You use it in databases and storage systems when queries involve exact key matches.

## Problem

Given a table with a hash index on attribute $k$, and a query value $x$, retrieve all rows such that:

$$
row.k = x
$$

## Model

A hash index consists of:

* a hash function $h(k)$
* an index table mapping hash buckets to row references

The index stores:

$$
I[h(k)] \rightarrow {\text{row identifiers}}
$$

To query a key $x$, compute:

$$
i = h(x)
$$

Then retrieve all row identifiers stored in bucket $I[i]$.

## Algorithm

```text id="m8q5sj"
hash_index_probe(I, x):
    i = h(x)
    bucket = I[i]

    result = empty list
    for each rid in bucket:
        row = fetch(rid)
        if row.key == x:
            append row to result

    return result
```

The equality check is required because different keys may hash to the same bucket.

## Example

Table:

| id | name  |
| -: | ----- |
|  1 | Ada   |
|  2 | Linus |
|  3 | Grace |
|  4 | Ada   |

Hash index on `name`:

| hash bucket | row ids |
| ----------- | ------- |
| h("Ada")    | [1, 4]  |
| h("Linus")  | [2]     |
| h("Grace")  | [3]     |

Query:

$$
x = \text{"Ada"}
$$

Steps:

* compute $h("Ada")$
* retrieve bucket → [1, 4]
* fetch rows 1 and 4
* both match

Result:

| id | name |
| -: | ---- |
|  1 | Ada  |
|  4 | Ada  |

## Correctness

Every row with key value $k$ is inserted into the bucket corresponding to $h(k)$. Therefore, all rows with $k = x$ appear in bucket $h(x)$.

The probe scans that bucket and filters rows by exact comparison, ensuring that all matching rows are returned and no incorrect rows are included.

## Complexity

Let $b$ be the bucket size.

| operation | time       |
| --------- | ---------- |
| lookup    | $O(1 + b)$ |

Expected constant time when the hash function distributes keys uniformly and buckets remain small.

Worst case:

$$
O(n)
$$

when all keys collide into one bucket.

## Space

$$
O(n)
$$

The index stores references to all rows.

## When to Use

Hash index probe is appropriate when:

* queries use equality conditions
* fast point lookups are required
* data is not sorted
* range queries are not needed

It is widely used in database engines, key value stores, and indexing systems.

## Implementation

```python id="w3m1qp"
from collections import defaultdict

class HashIndex:
    def __init__(self):
        self.index = defaultdict(list)
        self.rows = {}

    def insert(self, rid, row, key):
        self.rows[rid] = row
        self.index[row[key]].append(rid)

    def probe(self, key, value):
        result = []
        for rid in self.index.get(value, []):
            row = self.rows[rid]
            if row[key] == value:
                result.append(row)
        return result
```

```go id="c4n8xv"
type Row map[string]string

type HashIndex struct {
	Index map[string][]int
	Rows  map[int]Row
}

func (h *HashIndex) Insert(rid int, row Row, key string) {
	if h.Index == nil {
		h.Index = make(map[string][]int)
		h.Rows = make(map[int]Row)
	}
	h.Rows[rid] = row
	v := row[key]
	h.Index[v] = append(h.Index[v], rid)
}

func (h *HashIndex) Probe(key string, value string) []Row {
	var result []Row

	for _, rid := range h.Index[value] {
		row := h.Rows[rid]
		if row[key] == value {
			result = append(result, row)
		}
	}

	return result
}
```

