Skip to content

Counting Bloom Filter Membership

Test membership with a Bloom filter variant that supports deletions using small counters instead of bits.

A counting Bloom filter extends the Bloom filter by replacing each bit with a small counter. This allows both insertion and deletion. Membership queries follow the same logic as a standard Bloom filter, but check that all counters are positive.

You use it when you need approximate membership with support for removal.

Problem

Given a multiset-like structure supporting insert and delete, and a query key kk, determine whether kk may belong to the set.

Model

  • Counter array CC of length mm
  • rr hash functions h1,,hrh_1, \dots, h_r

Each hash maps a key to a counter index:

hj(k)0,1,,m1 h_j(k) \in {0, 1, \dots, m - 1}

Operations:

  • insert: increment all C[hj(k)]C[h_j(k)]
  • delete: decrement all C[hj(k)]C[h_j(k)] if positive
  • lookup: check all counters

Algorithm

counting_bloom_lookup(C, k):
    for each hash function h_j:
        i = h_j(k)
        if C[i] == 0:
            return DEFINITELY_NOT_PRESENT
    return POSSIBLY_PRESENT

Example

Let m=10m = 10, r=3r = 3, and:

h1(k)=2,h2(k)=5,h3(k)=8 h_1(k)=2,\quad h_2(k)=5,\quad h_3(k)=8

Counters:

index0123456789
count0120010130

All required counters are positive, so the result is possibly present.

If any counter were zero, the key would be definitely absent.

Correctness

Insertion increments all corresponding counters. Therefore, any inserted key will always have all its counters positive.

If a counter is zero during lookup, then no inserted key could have produced that index, so the key is definitely absent.

False positives remain possible because different keys can increment the same counters.

Complexity

operationtime
lookupO(r)O(r)

With constant rr, lookup runs in constant time.

Space

O(m) O(m)

Each position stores a small counter, often 4 or 8 bits.

When to Use

Counting Bloom filters are appropriate when:

  • deletions are required
  • approximate membership is acceptable
  • memory is constrained but slightly larger than standard Bloom filters
  • false positives are tolerable

They are used in streaming systems, caches, networking, and distributed storage.

Implementation

class CountingBloomFilter:
    def __init__(self, m=1024, r=3):
        self.m = m
        self.r = r
        self.counts = [0] * m

    def _hashes(self, key):
        for seed in range(self.r):
            yield hash((seed, key)) % self.m

    def add(self, key):
        for i in self._hashes(key):
            self.counts[i] += 1

    def remove(self, key):
        for i in self._hashes(key):
            if self.counts[i] > 0:
                self.counts[i] -= 1

    def contains(self, key):
        for i in self._hashes(key):
            if self.counts[i] == 0:
                return False
        return True
type CountingBloomFilter struct {
	Counts []uint16
	M      uint64
	R      uint64
}

func (b *CountingBloomFilter) hash(key string, seed uint64) uint64 {
	var h uint64 = 1469598103934665603 + seed
	for i := 0; i < len(key); i++ {
		h ^= uint64(key[i])
		h *= 1099511628211
	}
	return h % b.M
}

func (b *CountingBloomFilter) Contains(key string) bool {
	for seed := uint64(0); seed < b.R; seed++ {
		i := b.hash(key, seed)
		if b.Counts[i] == 0 {
			return false
		}
	}
	return true
}