# Quotient Filter Lookup

# Quotient Filter Lookup

A quotient filter is a compact approximate membership structure. Like a Bloom filter, it can return false positives but not false negatives when maintained correctly. Unlike a standard Bloom filter, it stores short hash remainders in a table and supports locality friendly lookup.

You use it when approximate membership is required and the structure must support practical deletion, merging, or cache efficient scans.

## Problem

Given a quotient filter and a query key $k$, determine whether $k$ may belong to the represented set.

The result is one of:

* definitely not present
* possibly present

## Model

Hash the key into a fingerprint:

$$
f = h(k)
$$

Split the fingerprint into two parts:

$$
q = \operatorname{quotient}(f)
$$

$$
r = \operatorname{remainder}(f)
$$

The quotient $q$ selects a canonical bucket. The remainder $r$ is stored somewhere in the run associated with that bucket.

Each slot usually stores:

* a remainder
* an occupied bit
* a continuation bit
* a shifted bit

These metadata bits encode where runs begin and how displaced remainders should be interpreted.

## Algorithm

```text id="h7wq5p"
quotient_filter_lookup(QF, k):
    f = h(k)
    q = quotient(f)
    r = remainder(f)

    if QF[q].occupied == false:
        return DEFINITELY_NOT_PRESENT

    run = find_run_for_bucket(QF, q)

    for each slot in run:
        if slot.remainder == r:
            return POSSIBLY_PRESENT

    return DEFINITELY_NOT_PRESENT
```

## Example

Suppose the hash fingerprint for a key is:

$$
f = 101101_2
$$

Use the upper bits as the quotient and lower bits as the remainder:

$$
q = 101_2 = 5
$$

$$
r = 101_2 = 5
$$

Lookup checks the run associated with bucket 5 and searches for remainder 5.

| value                    | meaning                     |
| ------------------------ | --------------------------- |
| quotient 5               | canonical bucket            |
| remainder 5              | stored fingerprint fragment |
| matching remainder found | possibly present            |
| no matching remainder    | definitely not present      |

A match means only that some stored key has the same fingerprint fragment. A different key can collide, so the result remains probabilistic.

## Correctness

When a key is inserted, its quotient determines the bucket whose run owns its remainder. Even if the remainder is shifted away from its canonical slot, the metadata bits preserve the run structure.

Lookup reconstructs the run for the quotient and scans all remainders in that run. If the key was inserted, its remainder appears in that run. If the remainder is absent, the key was not inserted.

False positives occur when another key produces the same quotient and remainder.

## Complexity

| case           | time   |
| -------------- | ------ |
| average lookup | $O(1)$ |
| worst lookup   | $O(n)$ |

Expected constant time assumes a good hash function and bounded load factor.

## Space

$$
O(m)
$$

A quotient filter stores compact remainders plus a few metadata bits per slot.

## When to Use

Quotient filters are appropriate when:

* approximate membership is acceptable
* compact memory layout matters
* deletion support is useful
* cache locality matters
* filters may need to be merged or resized

They are often considered when Bloom filters are too rigid for deletion or locality sensitive workloads.

## Implementation

```python id="o5fa8v"
class QuotientFilter:
    def __init__(self, q_bits=8, r_bits=8):
        self.q_bits = q_bits
        self.r_bits = r_bits
        self.size = 1 << q_bits
        self.table = [[] for _ in range(self.size)]

    def _fingerprint(self, key):
        mask = (1 << (self.q_bits + self.r_bits)) - 1
        return hash(key) & mask

    def _split(self, f):
        r_mask = (1 << self.r_bits) - 1
        q = f >> self.r_bits
        r = f & r_mask
        return q, r

    def contains(self, key):
        f = self._fingerprint(key)
        q, r = self._split(f)
        return r in self.table[q]
```

```go id="or1r2p"
type QuotientFilter struct {
	Table [][]uint64
	QBits uint
	RBits uint
}

func (qf *QuotientFilter) fingerprint(key string) uint64 {
	var h uint64 = 1469598103934665603
	for i := 0; i < len(key); i++ {
		h ^= uint64(key[i])
		h *= 1099511628211
	}

	mask := uint64(1)<<(qf.QBits+qf.RBits) - 1
	return h & mask
}

func (qf *QuotientFilter) split(f uint64) (uint64, uint64) {
	rMask := uint64(1)<<qf.RBits - 1
	q := f >> qf.RBits
	r := f & rMask
	return q, r
}

func (qf *QuotientFilter) Contains(key string) bool {
	f := qf.fingerprint(key)
	q, r := qf.split(f)

	for _, x := range qf.Table[q] {
		if x == r {
			return true
		}
	}

	return false
}
```

