16.24 Testing String Algorithms

String algorithms are easy to implement incorrectly.

16.24 Testing String Algorithms

Problem

String algorithms are easy to implement incorrectly.

Most failures are small:

Off-by-one error
Wrong boundary condition
Incorrect empty-pattern behavior
Unicode mismatch
Duplicate match omission
Failure to report overlapping matches

These bugs often pass simple tests.

For example, a matcher may work for:

Text:    hello world
Pattern: world

but fail for:

Text:    aaaaa
Pattern: aaa

because overlapping matches are mishandled.

The challenge is to test string algorithms against both ordinary and adversarial inputs.

Solution

Use layered testing.

A strong test suite should include:

  1. Simple examples
  2. Boundary cases
  3. Overlapping matches
  4. Repetitive inputs
  5. Randomized tests
  6. Cross-checks against a simpler implementation
  7. Unicode cases
  8. Performance tests

For most string algorithms, the best testing strategy is to compare an optimized implementation against a small, obviously correct reference implementation.

Start with a Reference Implementation

For exact matching, the reference implementation can be naive.

def naiveFind
  (text pattern : String)
  : List Nat :=
by
  -- Check every possible starting position.
  -- Compare characters directly.
  sorry

This implementation may be slow, but it is simple.

Optimized algorithms such as KMP, Z Algorithm, Rabin-Karp, and Boyer-Moore should agree with it on small inputs.

Testing rule:

optimized(text, pattern) = naive(text, pattern)

for many generated examples.

Basic Test Cases

Start with cases whose answers are obvious.

Text Pattern Expected
hello he [0]
hello lo [3]
hello x []
aaaaa a [0,1,2,3,4]
aaaaa aaa [0,1,2]
abcabcabc abc [0,3,6]

These tests catch basic indexing errors.

Boundary Cases

Boundary cases reveal unstated assumptions.

Test:

Text Pattern Expected Behavior
"" "" Must be specified
"" a No match
a "" Must be specified
a a Match at 0
a aa No match
aa aaa No match

The empty pattern deserves a deliberate policy.

Common choices:

Return every position
Return [0]
Reject as invalid input

Any choice is acceptable if documented and implemented consistently.

Overlapping Matches

Overlapping matches are a frequent source of bugs.

Example:

Text:    aaaaa
Pattern: aaa

Correct result:

[0, 1, 2]

Not:

[0]

and not:

[0, 3]

Algorithms that shift too far after a match often miss overlaps.

This is especially important for KMP, Rabin-Karp, Aho-Corasick, and suffix-array searches.

Repetitive Inputs

Repetitive strings are adversarial for many algorithms.

Examples:

aaaaaaaaaaaaaaaa

with patterns:

a
aa
aaa
aaaaab

These cases test:

  • Worst-case behavior
  • Failure-link correctness
  • Hash collision handling
  • Boundary arithmetic
  • Overlap reporting

A naive algorithm may degrade badly here. KMP and Z should remain linear.

Randomized Cross-Checking

For small alphabets, generate many random strings.

Example alphabet:

a b c

Generate:

text length    0..20
pattern length 0..8

Compare:

naiveFind(text, pattern)

against:

kmpFind(text, pattern)

and:

zFind(text, pattern)

Random testing is powerful because it explores combinations you would not write manually.

Property-Based Testing

A property is a statement that should always hold.

For exact matching:

Every reported position is valid.
Every valid position is reported.

For suffix arrays:

Suffixes are sorted.
Every suffix position appears exactly once.

For LCP arrays:

LCP[i] equals the common prefix length of suffix SA[i-1] and suffix SA[i].

For edit distance:

distance(a, a) = 0
distance(a, b) = distance(b, a)
distance(a, b) ≤ max(length(a), length(b)) when substitution cost is 1

Properties test the algorithm's contract rather than specific examples.

Testing Suffix Arrays

A suffix array must satisfy two invariants.

First, it must be a permutation of all suffix starts:

[0, 1, 2, ..., n-1]

Second, suffixes must be sorted lexicographically.

A reference checker:

def isPermutation
  (sa : Array Nat)
  (n : Nat)
  : Bool :=
by
  -- Check each index 0..n-1 appears exactly once.
  sorry

def suffixesSorted
  (text : String)
  (sa : Array Nat)
  : Bool :=
by
  -- Check suffix at sa[i] <= suffix at sa[i+1].
  sorry

Together:

isPermutation(sa, n) && suffixesSorted(text, sa)

is the core suffix-array test.

Testing LCP Arrays

Given a suffix array and LCP array:

lcp[0] = 0

For every i > 0:

lcp[i] =
commonPrefixLength(suffix(sa[i-1]), suffix(sa[i]))

A naive common-prefix checker is enough for tests.

This catches:

  • Wrong rank arrays
  • Incorrect Kasai updates
  • Off-by-one errors
  • Misaligned suffix-array ranks

Testing Tries

For a trie built from a dictionary:

inserted(word) → lookup(word) = true

For words not inserted:

lookup(word) = false

unless they are explicitly inserted.

Important cases:

he
her
hero

If only hero is inserted, then he and her should not automatically count as full-word matches unless marked terminal.

This catches prefix-versus-word errors.

Testing Aho-Corasick

Compare Aho-Corasick against naive multi-pattern search.

For each pattern:

naiveFind(text, pattern)

Collect all matches.

Sort them.

Compare with automaton output.

Important cases:

Patterns: he, she, hers, his
Text:     ushers

Expected matches include:

she
he
hers

This tests output links and overlapping matches.

Testing Palindrome Algorithms

For Manacher:

radius at center i

should match a naive expand-around-center implementation.

For Eertree:

  • Every stored palindrome should truly be a palindrome.
  • Every distinct palindrome from a naive enumeration should appear.
  • No duplicate palindrome nodes should exist.

Naive enumeration is acceptable for short random strings.

Testing Edit Distance

Use known examples:

A B Distance
kitten sitting 3
cat cut 1
abc abc 0
abc "" 3
"" abc 3

Then add properties:

distance(a, a) = 0
distance(a, b) = distance(b, a)
distance(a, b) ≤ length(a) + length(b)

If weighted edits are used, symmetry may no longer hold. Adjust the property accordingly.

Testing Hash-Based Algorithms

Hashing needs special care.

A hash match does not prove string equality unless verified.

Test both layers:

  1. Hash extraction is consistent.
  2. Algorithms verify candidates when correctness requires certainty.

Useful property:

if substringA = substringB
then hash(substringA) = hash(substringB)

The converse is not guaranteed.

Do not write tests that assume different strings always have different hashes.

Unicode Tests

Include equivalent Unicode representations.

Example:

é

and:

e + combining acute accent

If the algorithm is normalization-aware, they should compare equal after normalization.

Also test:

emoji
accented text
non-Latin scripts
mixed normalization forms

The goal is not to test Unicode itself. The goal is to verify that your algorithm applies one consistent text representation.

Performance Tests

Correctness tests do not reveal scaling failures.

Include large cases:

Text:    a repeated 1,000,000 times
Pattern: a repeated 999 times + b

This exposes naive worst-case behavior.

Benchmark:

  • Random text
  • Highly repetitive text
  • Real documents
  • Long patterns
  • Many short patterns
  • Unicode-heavy input

Measure both runtime and memory.

Regression Tests

Every discovered bug should become a test.

If a failure occurs for:

Text:    abababa
Pattern: aba

add it permanently.

Regression tests prevent bug fixes from being undone during refactoring.

This is especially important for suffix structures and automata, where small changes can break subtle invariants.

Common Pitfalls

Do not test only happy paths.

Do not omit overlapping matches.

Do not forget empty inputs.

Do not assume hash uniqueness.

Do not skip Unicode cases if the algorithm handles user-visible text.

Do not benchmark only random strings. Repetitive strings are often more revealing.

Do not trust an optimized algorithm without cross-checking it against a simple reference.

Takeaway

Testing string algorithms requires more than a few example inputs. Strong tests combine simple expected cases, adversarial repetitive inputs, randomized cross-checking, property-based invariants, Unicode cases, and performance measurements. The most reliable pattern is to keep a slow, clear reference implementation and use it to validate optimized algorithms over many small inputs. This approach catches the subtle boundary errors that string algorithms are especially prone to.