1.3 Correctness Arguments

A correctness argument explains why an algorithm returns an acceptable output for every valid input. It should be written before, or at least alongside, the implementation. Code tells you what the algorithm does. A correctness argument tells you why those steps solve the stated problem.

A good correctness argument is local. It does not rely on intuition about the whole program at once. Instead, it identifies a property that remains true as the algorithm runs. This property is called an invariant. The algorithm may change variables, move pointers, update tables, or recurse into smaller inputs, but the invariant gives those changes a stable meaning.

Problem

You have an algorithm that seems to work on examples, but you need a disciplined way to prove that it always works.

Solution

Use a three-part correctness argument:

1. State the invariant.
2. Prove that the algorithm preserves the invariant.
3. Prove that when the algorithm stops, the invariant implies the required output.

For loops, also show that the loop eventually terminates. For recursion, show that each recursive call is made on a smaller problem.

Example: Maximum Element

Consider the standard algorithm for finding the maximum value in a nonempty array.

max_value(A):
  best = A[0]

  for i = 1 to n - 1:
    if A[i] > best:
      best = A[i]

  return best

The key invariant is:

At the start of each iteration with index i,
best is the maximum value in A[0..i-1].

At initialization, before the first iteration, best = A[0]. The prefix A[0..0] contains only one element, so the invariant holds.

During an iteration, the algorithm compares A[i] with best. If A[i] is larger, it becomes the new best value. If not, the old best value remains the maximum. In both cases, after the iteration, best is the maximum value in A[0..i].

When the loop ends, every element has been processed. The invariant then says that best is the maximum value in A[0..n-1], which is exactly the required output.

The loop terminates because i increases by one each iteration and is bounded above by n - 1.

Example: Linear Search

Suppose you want to find whether a value x appears in an array.

contains(A, x):
  for i = 0 to n - 1:
    if A[i] == x:
      return true

  return false

A useful invariant is:

At the start of iteration i,
x does not appear in A[0..i-1].

At the start, i = 0, and the prefix A[0..-1] is empty. The statement holds vacuously.

If the loop reaches index i, then the algorithm has already checked all earlier positions and found no match. If A[i] == x, returning true is correct because the algorithm has found a witness. If A[i] != x, then x does not appear in A[0..i], so the invariant is preserved for the next iteration.

If the loop finishes without returning, the invariant says that x does not appear in the whole array. Returning false is correct.

Recipe: Writing the Argument

Use this format for most basic algorithms:

Claim:
  The algorithm returns a correct output for every valid input.

Invariant:
  State the property that remains true during the computation.

Initialization:
  Show the invariant holds before the first step.

Maintenance:
  Show one step of the algorithm preserves the invariant.

Termination:
  Show the algorithm stops.

Conclusion:
  Show that the invariant at termination implies the specification.

This form is intentionally repetitive. The repetition is useful because most mistakes appear in one of the five checks. Either the invariant is too weak, the update does not preserve it, or the final state does not imply the desired output.

Choosing the Right Invariant

An invariant should connect program state to the problem statement. Variables alone are not enough. Saying “best has been updated so far” is vague. Saying “best is the maximum of the processed prefix” is useful because it identifies exactly what best means.

For array scans, invariants often refer to a processed prefix:

After processing positions 0 through i,
the maintained value equals the answer for that prefix.

For two-pointer algorithms, invariants often describe the eliminated region:

No valid solution exists outside the remaining search interval.

For graph searches, invariants often describe discovered and undiscovered vertices:

Every marked vertex is reachable from the source.

For dynamic programming, invariants often describe completed table entries:

Each filled entry stores the optimal value for its corresponding subproblem.

Common Pitfalls

An invariant that restates the goal without explaining progress is usually too strong or unhelpful. For example, “the algorithm will return the maximum” does not explain why each update is safe.

An invariant that only describes syntax is too weak. For example, “i increases” helps prove termination, but it does not prove correctness.

A proof that checks only examples is testing, not correctness. Examples can reveal bugs, but they cannot justify all inputs.

A proof that ignores termination is incomplete. An algorithm that never returns does not produce a correct output, even if its intermediate state has useful properties.

Takeaway

A correctness argument turns an algorithm into a justified construction. State what each piece of program state means, prove that every step preserves that meaning, and connect the final state back to the required output.