# Counting Bloom Filter Membership

# Counting Bloom Filter Membership

A counting Bloom filter extends the Bloom filter by replacing each bit with a small counter. This allows both insertion and deletion. Membership queries follow the same logic as a standard Bloom filter, but check that all counters are positive.

You use it when you need approximate membership with support for removal.

## Problem

Given a multiset-like structure supporting insert and delete, and a query key $k$, determine whether $k$ may belong to the set.

## Model

* Counter array $C$ of length $m$
* $r$ hash functions $h_1, \dots, h_r$

Each hash maps a key to a counter index:

$$
h_j(k) \in {0, 1, \dots, m - 1}
$$

Operations:

* insert: increment all $C[h_j(k)]$
* delete: decrement all $C[h_j(k)]$ if positive
* lookup: check all counters

## Algorithm

```text id="y7r4tb"
counting_bloom_lookup(C, k):
    for each hash function h_j:
        i = h_j(k)
        if C[i] == 0:
            return DEFINITELY_NOT_PRESENT
    return POSSIBLY_PRESENT
```

## Example

Let $m = 10$, $r = 3$, and:

$$
h_1(k)=2,\quad h_2(k)=5,\quad h_3(k)=8
$$

Counters:

| index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| ----- | - | - | - | - | - | - | - | - | - | - |
| count | 0 | 1 | 2 | 0 | 0 | 1 | 0 | 1 | 3 | 0 |

All required counters are positive, so the result is possibly present.

If any counter were zero, the key would be definitely absent.

## Correctness

Insertion increments all corresponding counters. Therefore, any inserted key will always have all its counters positive.

If a counter is zero during lookup, then no inserted key could have produced that index, so the key is definitely absent.

False positives remain possible because different keys can increment the same counters.

## Complexity

| operation | time   |
| --------- | ------ |
| lookup    | $O(r)$ |

With constant $r$, lookup runs in constant time.

## Space

$$
O(m)
$$

Each position stores a small counter, often 4 or 8 bits.

## When to Use

Counting Bloom filters are appropriate when:

* deletions are required
* approximate membership is acceptable
* memory is constrained but slightly larger than standard Bloom filters
* false positives are tolerable

They are used in streaming systems, caches, networking, and distributed storage.

## Implementation

```python id="y2h8cr"
class CountingBloomFilter:
    def __init__(self, m=1024, r=3):
        self.m = m
        self.r = r
        self.counts = [0] * m

    def _hashes(self, key):
        for seed in range(self.r):
            yield hash((seed, key)) % self.m

    def add(self, key):
        for i in self._hashes(key):
            self.counts[i] += 1

    def remove(self, key):
        for i in self._hashes(key):
            if self.counts[i] > 0:
                self.counts[i] -= 1

    def contains(self, key):
        for i in self._hashes(key):
            if self.counts[i] == 0:
                return False
        return True
```

```go id="j8q2df"
type CountingBloomFilter struct {
	Counts []uint16
	M      uint64
	R      uint64
}

func (b *CountingBloomFilter) hash(key string, seed uint64) uint64 {
	var h uint64 = 1469598103934665603 + seed
	for i := 0; i < len(key); i++ {
		h ^= uint64(key[i])
		h *= 1099511628211
	}
	return h % b.M
}

func (b *CountingBloomFilter) Contains(key string) bool {
	for seed := uint64(0); seed < b.R; seed++ {
		i := b.hash(key, seed)
		if b.Counts[i] == 0 {
			return false
		}
	}
	return true
}
```

