5.20 Testing Hash Logic

Problem

You need confidence that a hash-based structure is correct, stable, and performs well under realistic inputs.

Hash table bugs are often subtle. A map may pass small examples but fail after resizing. A set may work until keys collide. A deterministic hash may appear stable until another language, platform, or encoding rule is introduced.

Solution

Test both logical behavior and structural stress cases.

Basic behavior:

insert key
lookup key
update key
delete key
lookup deleted key

Collision behavior:

insert many keys with same bucket
lookup every inserted key
delete some keys
lookup remaining keys

Resize behavior:

insert enough keys to trigger growth
lookup every key after resizing
delete enough keys to trigger cleanup or shrink
lookup every remaining key

Equality and hashing contract:

if equal(a, b):
    assert hash(a) == hash(b)

Discussion

Testing hash logic requires more than checking ordinary lookup examples. The implementation has several independent responsibilities: it must compute indexes consistently, preserve entries during resizing, handle collisions, compare keys correctly, and maintain size metadata.

A good test suite separates these concerns. Basic operation tests verify the public interface. Collision tests force the implementation through paths that random inputs may rarely exercise. Resize tests check that rehashing preserves all entries. Contract tests check that equality and hashing agree.

Randomized testing is useful, but it should be paired with adversarial cases. Random keys usually distribute well, so they can hide collision bugs. Constructed keys that deliberately share a hash value expose bucket scanning, probing, deletion, and tombstone behavior.

Differential testing is also effective. Compare the custom table against a trusted reference implementation.

for operation in random_operations:
    apply operation to custom_map
    apply operation to reference_map
    assert observable_state(custom_map) == observable_state(reference_map)

The observable state includes lookup results, size, and iteration output when iteration order is specified.

Correctness

Testing should target the main invariants.

Placement invariant:

each key is stored where lookup will search

Uniqueness invariant:

no two live entries have equal keys

Size invariant:

size equals the number of live entries

Probe invariant for open addressing:

lookup continues through occupied slots and tombstones

Resize invariant:

after rehashing, every live entry is reachable

Each test should be designed to break one of these invariants if the implementation is wrong.

Complexity Testing

Performance tests should measure behavior under different load factors and collision patterns.

Useful scenarios:

- random keys
- sequential integer keys
- similar string keys
- long string keys
- composite keys
- deliberate collisions
- insert-heavy workload
- lookup-heavy workload
- delete-heavy workload

Measure at least:

- average lookup time
- tail lookup time
- resize cost
- memory usage
- collision count or probe length

Average time can hide severe outliers. Probe length distribution is often more informative than total runtime.

Example

A collision test can use a custom key type whose hash is constant.

BadHashKey:
    value

hash(key):
    return 0

equal(a, b):
    return a.value == b.value

Test:

map = empty map

for i from 0 to 999:
    put(map, BadHashKey(i), i)

for i from 0 to 999:
    assert get(map, BadHashKey(i)) == i

for i from 0 to 499:
    remove(map, BadHashKey(i))

for i from 0 to 499:
    assert get(map, BadHashKey(i)) is not_found

for i from 500 to 999:
    assert get(map, BadHashKey(i)) == i

This test forces all entries into the same collision region. It is slow by design, but it reveals correctness errors in collision handling.

Deterministic Hash Tests

For persistent hashes, add golden tests.

assert hash("abc") == 0x...
assert hash(("user", 42)) == 0x...
assert hash(empty_bytes) == 0x...

Golden tests catch accidental changes to encoding, byte order, seeds, or algorithm version.

For cross-language systems, generate the same test vectors in every implementation.

Common Mistakes

Testing only random keys.

Testing insertion without deletion.

Testing deletion without reinsertion.

Ignoring resize boundaries.

Failing to test equal keys with distinct object identities.

Failing to test composite keys.

Using benchmark results without checking correctness first.

Assuming deterministic hashes remain stable without golden tests.