16.24 Testing String Algorithms
String algorithms are easy to implement incorrectly.
16.24 Testing String Algorithms
Problem
String algorithms are easy to implement incorrectly.
Most failures are small:
Off-by-one error
Wrong boundary condition
Incorrect empty-pattern behavior
Unicode mismatch
Duplicate match omission
Failure to report overlapping matches
These bugs often pass simple tests.
For example, a matcher may work for:
Text: hello world
Pattern: world
but fail for:
Text: aaaaa
Pattern: aaa
because overlapping matches are mishandled.
The challenge is to test string algorithms against both ordinary and adversarial inputs.
Solution
Use layered testing.
A strong test suite should include:
- Simple examples
- Boundary cases
- Overlapping matches
- Repetitive inputs
- Randomized tests
- Cross-checks against a simpler implementation
- Unicode cases
- Performance tests
For most string algorithms, the best testing strategy is to compare an optimized implementation against a small, obviously correct reference implementation.
Start with a Reference Implementation
For exact matching, the reference implementation can be naive.
def naiveFind
(text pattern : String)
: List Nat :=
by
-- Check every possible starting position.
-- Compare characters directly.
sorry
This implementation may be slow, but it is simple.
Optimized algorithms such as KMP, Z Algorithm, Rabin-Karp, and Boyer-Moore should agree with it on small inputs.
Testing rule:
optimized(text, pattern) = naive(text, pattern)
for many generated examples.
Basic Test Cases
Start with cases whose answers are obvious.
| Text | Pattern | Expected |
|---|---|---|
hello |
he |
[0] |
hello |
lo |
[3] |
hello |
x |
[] |
aaaaa |
a |
[0,1,2,3,4] |
aaaaa |
aaa |
[0,1,2] |
abcabcabc |
abc |
[0,3,6] |
These tests catch basic indexing errors.
Boundary Cases
Boundary cases reveal unstated assumptions.
Test:
| Text | Pattern | Expected Behavior |
|---|---|---|
"" |
"" |
Must be specified |
"" |
a |
No match |
a |
"" |
Must be specified |
a |
a |
Match at 0 |
a |
aa |
No match |
aa |
aaa |
No match |
The empty pattern deserves a deliberate policy.
Common choices:
Return every position
Return [0]
Reject as invalid input
Any choice is acceptable if documented and implemented consistently.
Overlapping Matches
Overlapping matches are a frequent source of bugs.
Example:
Text: aaaaa
Pattern: aaa
Correct result:
[0, 1, 2]
Not:
[0]
and not:
[0, 3]
Algorithms that shift too far after a match often miss overlaps.
This is especially important for KMP, Rabin-Karp, Aho-Corasick, and suffix-array searches.
Repetitive Inputs
Repetitive strings are adversarial for many algorithms.
Examples:
aaaaaaaaaaaaaaaa
with patterns:
a
aa
aaa
aaaaab
These cases test:
- Worst-case behavior
- Failure-link correctness
- Hash collision handling
- Boundary arithmetic
- Overlap reporting
A naive algorithm may degrade badly here. KMP and Z should remain linear.
Randomized Cross-Checking
For small alphabets, generate many random strings.
Example alphabet:
a b c
Generate:
text length 0..20
pattern length 0..8
Compare:
naiveFind(text, pattern)
against:
kmpFind(text, pattern)
and:
zFind(text, pattern)
Random testing is powerful because it explores combinations you would not write manually.
Property-Based Testing
A property is a statement that should always hold.
For exact matching:
Every reported position is valid.
Every valid position is reported.
For suffix arrays:
Suffixes are sorted.
Every suffix position appears exactly once.
For LCP arrays:
LCP[i] equals the common prefix length of suffix SA[i-1] and suffix SA[i].
For edit distance:
distance(a, a) = 0
distance(a, b) = distance(b, a)
distance(a, b) ≤ max(length(a), length(b)) when substitution cost is 1
Properties test the algorithm's contract rather than specific examples.
Testing Suffix Arrays
A suffix array must satisfy two invariants.
First, it must be a permutation of all suffix starts:
[0, 1, 2, ..., n-1]
Second, suffixes must be sorted lexicographically.
A reference checker:
def isPermutation
(sa : Array Nat)
(n : Nat)
: Bool :=
by
-- Check each index 0..n-1 appears exactly once.
sorry
def suffixesSorted
(text : String)
(sa : Array Nat)
: Bool :=
by
-- Check suffix at sa[i] <= suffix at sa[i+1].
sorry
Together:
isPermutation(sa, n) && suffixesSorted(text, sa)
is the core suffix-array test.
Testing LCP Arrays
Given a suffix array and LCP array:
lcp[0] = 0
For every i > 0:
lcp[i] =
commonPrefixLength(suffix(sa[i-1]), suffix(sa[i]))
A naive common-prefix checker is enough for tests.
This catches:
- Wrong rank arrays
- Incorrect Kasai updates
- Off-by-one errors
- Misaligned suffix-array ranks
Testing Tries
For a trie built from a dictionary:
inserted(word) → lookup(word) = true
For words not inserted:
lookup(word) = false
unless they are explicitly inserted.
Important cases:
he
her
hero
If only hero is inserted, then he and her should not automatically count as full-word matches unless marked terminal.
This catches prefix-versus-word errors.
Testing Aho-Corasick
Compare Aho-Corasick against naive multi-pattern search.
For each pattern:
naiveFind(text, pattern)
Collect all matches.
Sort them.
Compare with automaton output.
Important cases:
Patterns: he, she, hers, his
Text: ushers
Expected matches include:
she
he
hers
This tests output links and overlapping matches.
Testing Palindrome Algorithms
For Manacher:
radius at center i
should match a naive expand-around-center implementation.
For Eertree:
- Every stored palindrome should truly be a palindrome.
- Every distinct palindrome from a naive enumeration should appear.
- No duplicate palindrome nodes should exist.
Naive enumeration is acceptable for short random strings.
Testing Edit Distance
Use known examples:
| A | B | Distance |
|---|---|---|
kitten |
sitting |
3 |
cat |
cut |
1 |
abc |
abc |
0 |
abc |
"" |
3 |
"" |
abc |
3 |
Then add properties:
distance(a, a) = 0
distance(a, b) = distance(b, a)
distance(a, b) ≤ length(a) + length(b)
If weighted edits are used, symmetry may no longer hold. Adjust the property accordingly.
Testing Hash-Based Algorithms
Hashing needs special care.
A hash match does not prove string equality unless verified.
Test both layers:
- Hash extraction is consistent.
- Algorithms verify candidates when correctness requires certainty.
Useful property:
if substringA = substringB
then hash(substringA) = hash(substringB)
The converse is not guaranteed.
Do not write tests that assume different strings always have different hashes.
Unicode Tests
Include equivalent Unicode representations.
Example:
é
and:
e + combining acute accent
If the algorithm is normalization-aware, they should compare equal after normalization.
Also test:
emoji
accented text
non-Latin scripts
mixed normalization forms
The goal is not to test Unicode itself. The goal is to verify that your algorithm applies one consistent text representation.
Performance Tests
Correctness tests do not reveal scaling failures.
Include large cases:
Text: a repeated 1,000,000 times
Pattern: a repeated 999 times + b
This exposes naive worst-case behavior.
Benchmark:
- Random text
- Highly repetitive text
- Real documents
- Long patterns
- Many short patterns
- Unicode-heavy input
Measure both runtime and memory.
Regression Tests
Every discovered bug should become a test.
If a failure occurs for:
Text: abababa
Pattern: aba
add it permanently.
Regression tests prevent bug fixes from being undone during refactoring.
This is especially important for suffix structures and automata, where small changes can break subtle invariants.
Common Pitfalls
Do not test only happy paths.
Do not omit overlapping matches.
Do not forget empty inputs.
Do not assume hash uniqueness.
Do not skip Unicode cases if the algorithm handles user-visible text.
Do not benchmark only random strings. Repetitive strings are often more revealing.
Do not trust an optimized algorithm without cross-checking it against a simple reference.
Takeaway
Testing string algorithms requires more than a few example inputs. Strong tests combine simple expected cases, adversarial repetitive inputs, randomized cross-checking, property-based invariants, Unicode cases, and performance measurements. The most reliable pattern is to keep a slow, clear reference implementation and use it to validate optimized algorithms over many small inputs. This approach catches the subtle boundary errors that string algorithms are especially prone to.