Verify correctness, stability, and performance of hash-based structures through randomized and adversarial test strategies.
Problem
You need confidence that a hash-based structure is correct, stable, and performs well under realistic inputs.
Hash table bugs are often subtle. A map may pass small examples but fail after resizing. A set may work until keys collide. A deterministic hash may appear stable until another language, platform, or encoding rule is introduced.
Solution
Test both logical behavior and structural stress cases.
Basic behavior:
insert key
lookup key
update key
delete key
lookup deleted keyCollision behavior:
insert many keys with same bucket
lookup every inserted key
delete some keys
lookup remaining keysResize behavior:
insert enough keys to trigger growth
lookup every key after resizing
delete enough keys to trigger cleanup or shrink
lookup every remaining keyEquality and hashing contract:
if equal(a, b):
assert hash(a) == hash(b)Discussion
Testing hash logic requires more than checking ordinary lookup examples. The implementation has several independent responsibilities: it must compute indexes consistently, preserve entries during resizing, handle collisions, compare keys correctly, and maintain size metadata.
A good test suite separates these concerns. Basic operation tests verify the public interface. Collision tests force the implementation through paths that random inputs may rarely exercise. Resize tests check that rehashing preserves all entries. Contract tests check that equality and hashing agree.
Randomized testing is useful, but it should be paired with adversarial cases. Random keys usually distribute well, so they can hide collision bugs. Constructed keys that deliberately share a hash value expose bucket scanning, probing, deletion, and tombstone behavior.
Differential testing is also effective. Compare the custom table against a trusted reference implementation.
for operation in random_operations:
apply operation to custom_map
apply operation to reference_map
assert observable_state(custom_map) == observable_state(reference_map)The observable state includes lookup results, size, and iteration output when iteration order is specified.
Correctness
Testing should target the main invariants.
Placement invariant:
each key is stored where lookup will searchUniqueness invariant:
no two live entries have equal keysSize invariant:
size equals the number of live entriesProbe invariant for open addressing:
lookup continues through occupied slots and tombstonesResize invariant:
after rehashing, every live entry is reachableEach test should be designed to break one of these invariants if the implementation is wrong.
Complexity Testing
Performance tests should measure behavior under different load factors and collision patterns.
Useful scenarios:
- random keys
- sequential integer keys
- similar string keys
- long string keys
- composite keys
- deliberate collisions
- insert-heavy workload
- lookup-heavy workload
- delete-heavy workloadMeasure at least:
- average lookup time
- tail lookup time
- resize cost
- memory usage
- collision count or probe lengthAverage time can hide severe outliers. Probe length distribution is often more informative than total runtime.
Example
A collision test can use a custom key type whose hash is constant.
BadHashKey:
value
hash(key):
return 0
equal(a, b):
return a.value == b.valueTest:
map = empty map
for i from 0 to 999:
put(map, BadHashKey(i), i)
for i from 0 to 999:
assert get(map, BadHashKey(i)) == i
for i from 0 to 499:
remove(map, BadHashKey(i))
for i from 0 to 499:
assert get(map, BadHashKey(i)) is not_found
for i from 500 to 999:
assert get(map, BadHashKey(i)) == iThis test forces all entries into the same collision region. It is slow by design, but it reveals correctness errors in collision handling.
Deterministic Hash Tests
For persistent hashes, add golden tests.
assert hash("abc") == 0x...
assert hash(("user", 42)) == 0x...
assert hash(empty_bytes) == 0x...Golden tests catch accidental changes to encoding, byte order, seeds, or algorithm version.
For cross-language systems, generate the same test vectors in every implementation.
Common Mistakes
Testing only random keys.
Testing insertion without deletion.
Testing deletion without reinsertion.
Ignoring resize boundaries.
Failing to test equal keys with distinct object identities.
Failing to test composite keys.
Using benchmark results without checking correctness first.
Assuming deterministic hashes remain stable without golden tests.