15.19 Correctness Proofs
A divide-and-conquer algorithm may look correct because its structure is simple: ```text split solve recursively
15.19 Correctness Proofs
Problem
A divide-and-conquer algorithm may look correct because its structure is simple:
split
solve recursively
combine
But many bugs hide in the boundaries:
left half
right half
base case
merge step
pivot position
strip condition
A fast algorithm is not useful unless it computes the right result.
How do we prove that a divide-and-conquer algorithm is correct?
Solution
Use induction on input size.
Most divide-and-conquer correctness proofs follow the same pattern:
- Prove the base case.
- Assume recursive calls are correct for smaller inputs.
- Prove the combine step produces a correct answer for the current input.
This is structural induction over the recursive decomposition.
The algorithm does not need to be complicated. The proof must match the exact shape of the recursion.
General Proof Template
For an algorithm:
Solve(input)
if input is small
return direct answer
split input into subproblems
leftAnswer = Solve(left)
rightAnswer = Solve(right)
return Combine(leftAnswer, rightAnswer)
the correctness proof usually has this form:
Claim:
Solve returns the correct answer for every input of size n.
Proof:
By induction on n.
Base case:
For small inputs, Solve computes the answer directly.
Therefore it is correct.
Inductive hypothesis:
Assume Solve is correct for all inputs smaller than n.
Inductive step:
Consider an input of size n.
The algorithm splits it into smaller subproblems.
By the inductive hypothesis, recursive answers are correct.
It remains to show that Combine produces the correct full answer.
The important work is usually in the last sentence.
Example: Merge Sort
Claim:
MergeSort returns a sorted permutation of the input.
This claim has two parts:
- The output is sorted.
- The output contains exactly the same elements as the input.
Both must be proved.
Merge Sort Base Case
If the input has length 0 or 1:
[]
[x]
it is already sorted.
It is also trivially a permutation of itself.
So the base case is correct.
Merge Sort Inductive Step
Assume merge sort correctly sorts all arrays smaller than length (n).
For an array of length (n), split it into:
left
right
Both are smaller than the original array.
By the inductive hypothesis:
MergeSort(left)
returns a sorted permutation of left.
MergeSort(right)
returns a sorted permutation of right.
Now the merge procedure receives two sorted arrays.
If Merge correctly combines two sorted arrays into one sorted array containing exactly all their elements, then the full output is a sorted permutation of the original input.
Therefore merge sort is correct.
The Merge Lemma
The proof depends on a smaller lemma:
If A and B are sorted,
then Merge(A, B) returns a sorted permutation
of A concatenated with B.
This is typical.
Divide-and-conquer proofs often reduce global correctness to one local combine lemma.
For merge sort, the recursive structure is easy. The merge lemma carries the proof.
Proving the Merge Lemma
During merge, maintain this invariant:
result contains the smallest elements already consumed
from A and B, in sorted order.
At every step:
- Compare the first unconsumed element of
AandB. - Append the smaller one.
- Preserve sorted order.
- Preserve element counts.
When one side is exhausted, append the remaining sorted suffix.
The result is sorted and contains all elements exactly once.
That proves the merge lemma.
Example: Quick Sort
Claim:
QuickSort sorts the array in place.
The proof depends on the partition operation.
Partition must guarantee:
all elements left of pivot <= pivot
pivot is in final position
all elements right of pivot >= pivot
Once that is true, the recursive proof is straightforward.
Quick Sort Proof
Base case:
array length 0 or 1
is already sorted.
Inductive hypothesis:
QuickSort correctly sorts all arrays smaller than n.
Inductive step:
Partition places the pivot in its final sorted position.
The left partition contains only values less than or equal to the pivot.
The right partition contains only values greater than or equal to the pivot.
Both partitions are smaller than the original array.
By the inductive hypothesis, recursive calls sort both partitions.
Since every left element is no greater than the pivot, and every right element is no smaller than the pivot, the whole array is sorted.
Partition Invariant
For Lomuto partition:
func partition(nums []int, low, high int) int {
pivot := nums[high]
i := low
for j := low; j < high; j++ {
if nums[j] < pivot {
nums[i], nums[j] = nums[j], nums[i]
i++
}
}
nums[i], nums[high] = nums[high], nums[i]
return i
}
The loop invariant is:
nums[low:i] contains values < pivot
nums[i:j] contains values >= pivot
nums[j:high] has not been inspected
nums[high] is the pivot
When the loop ends:
nums[low:i] contains values < pivot
nums[i:high] contains values >= pivot
Swapping nums[i] with the pivot places the pivot between the two regions.
That proves partition correctness.
Example: Binary Search
Binary search is a divide-and-conquer algorithm over a sorted search space.
Claim:
BinarySearch returns the index of target if present,
otherwise it reports absence.
The key invariant:
If target exists, it lies inside the current search interval.
At each step:
- Choose
mid. - If
nums[mid] == target, return. - If
nums[mid] < target, discard the left half. - If
nums[mid] > target, discard the right half.
Because the array is sorted, the discarded half cannot contain the target.
When the interval becomes empty, the target is absent.
Example: Closest Pair
Closest-pair correctness has three cases.
The closest pair is either:
- entirely in the left half
- entirely in the right half
- crossing the dividing line
The recursive calls correctly handle the first two cases.
The strip scan handles the third case.
The correctness proof must show:
Every crossing pair closer than d is included in the strip.
where:
d = min(leftDistance, rightDistance)
If a point is farther than d from the dividing line, any crossing pair involving it has distance at least d.
So it cannot improve the answer.
The strip scan then checks enough nearby y-ordered candidates to find any closer crossing pair.
Therefore the algorithm finds the global closest pair.
Common Structure
Most divide-and-conquer correctness proofs have two components.
| Component | Question |
|---|---|
| Recursive correctness | Are smaller subproblems solved correctly? |
| Combine correctness | Does the merge, partition, or crossing step preserve the full answer? |
The recursive part is usually handled by induction.
The combine part is usually handled by an invariant, lemma, or case analysis.
Strong Induction
Divide-and-conquer proofs often use strong induction.
Instead of assuming correctness only for size:
$$ n-1 $$
we assume correctness for all smaller sizes:
$$ 1,2,\ldots,n-1 $$
This fits algorithms that split into arbitrary smaller subproblems.
Example:
QuickSort may split into sizes 0 and n-1,
or 3 and n-4,
or roughly n/2 and n/2.
Strong induction covers all of these cases cleanly.
Termination
Correctness also requires termination.
For divide-and-conquer algorithms, prove that every recursive call receives a strictly smaller input.
Examples:
Merge sort:
n -> n/2 and n/2
Quick sort:
n -> left size and right size,
excluding the pivot
Binary search:
interval length decreases
If a recursive call can receive the same input size, the algorithm may not terminate.
This is a common source of bugs in quick sort and binary search.
Boundary Conditions
Many divide-and-conquer bugs are boundary bugs.
Check:
empty input
one element
two elements
odd length
even length
duplicate values
all equal values
already sorted input
reverse sorted input
For proof work, these cases are not optional. They often reveal that the base case or partition boundary is wrong.
Proof by Invariant
When the combine step is iterative, use a loop invariant.
Examples:
- merge procedure
- partition procedure
- binary search loop
- strip scan
- heap merge
A good invariant states what is already known after each loop iteration.
It should be strong enough to imply correctness when the loop finishes.
Proof by Case Analysis
When the algorithm partitions the possible answer, use case analysis.
Examples:
closest pair:
left, right, crossing
quickselect:
target rank left of pivot,
equal to pivot,
right of pivot
binary search:
target less than mid,
equal to mid,
greater than mid
The proof must show that every possible answer belongs to one of the cases, and that the algorithm handles each case correctly.
Proof by Reduction to a Lemma
When a subroutine is central, isolate it.
Examples:
MergeSort correctness depends on Merge correctness.
QuickSort correctness depends on Partition correctness.
FFT correctness depends on even-odd decomposition.
Karatsuba correctness depends on the middle-term identity.
Prove the helper once.
Then use it in the main proof.
This keeps large correctness arguments readable.
Common Mistakes
Proving Only Sortedness
For sorting algorithms, sortedness is not enough.
An algorithm that returns:
[]
is sorted, but wrong.
Also prove that the output is a permutation of the input.
Ignoring the Combine Step
The recursive calls may be correct, but the merge step can still lose elements, double-count values, or miss crossing cases.
Weak Invariants
An invariant such as:
some elements are sorted
is rarely useful.
State exactly which elements, where they came from, and what property they satisfy.
Missing Termination
A proof that assumes recursion finishes is incomplete.
Show that every recursive call makes progress.
Assuming Balanced Splits
Correctness usually does not require balanced splits.
Quick sort remains correct even with terrible pivots.
Balance affects performance, not correctness.
Testing as Proof Support
Testing does not replace proof, but it supports proof development.
For divide-and-conquer algorithms, test against a brute force implementation.
Example:
func TestCountInversions(t *testing.T) {
for trial := 0; trial < 10000; trial++ {
nums := randomSmallArray()
got := CountInversions(nums)
want := CountInversionsBrute(nums)
if got != want {
t.Fatalf("got %d, want %d, nums=%v",
got, want, nums)
}
}
}
Small randomized tests are effective because brute force is simple and trusted.
They often expose boundary errors before formal reasoning begins.
Discussion
Correctness proofs for divide-and-conquer algorithms are usually not mysterious. They are disciplined induction arguments. The base case handles small inputs. The inductive hypothesis handles recursive calls. The real work lies in proving that the combine step neither loses information nor introduces invalid information.
Once you recognize this structure, proofs become reusable. Merge sort, quick sort, binary search, closest pair, inversion counting, and selection all follow the same basic pattern. The details differ, but the proof skeleton remains stable: establish progress, assume correctness on smaller inputs, prove the local step, and conclude correctness for the whole input.