20.24 Building a Verified Algorithm Sketch

Design a small workflow for checking an algorithm's correctness argument against its implementation.

20.24 Building a Verified Algorithm Sketch

Problem

Design a small workflow for checking an algorithm's correctness argument against its implementation.

Most algorithm implementations are tested. Fewer are verified. Tests show that an algorithm works on selected inputs. Verification tries to show that an algorithm satisfies a specification for all valid inputs.

A full formal proof may require a proof assistant or a verification-aware language. This section uses a lighter approach: write a precise specification, expose the loop invariant, add executable assertions, and connect the code to a proof sketch.

The target is not a complete formal verification system. The target is a repeatable engineering pattern:

specification
  -> invariant
  -> implementation
  -> assertions
  -> tests
  -> proof sketch

We will use binary search because it is small, common, and easy to get subtly wrong.

Solution

Implement binary search with an explicit contract:

Given a sorted array and a target, return an index whose value equals the target, or return -1 if the target is absent.

The implementation:

def binary_search(values, target):
    left = 0
    right = len(values)

    while left < right:
        middle = left + (right - left) // 2

        if values[middle] < target:
            left = middle + 1
        elif values[middle] > target:
            right = middle
        else:
            return middle

    return -1

This version uses a half-open search interval:

[left, right)

That convention avoids many off-by-one errors.

Writing the Specification

A useful specification says what must be true before and after the function runs.

Precondition:

values is sorted in nondecreasing order

Postcondition:

If the function returns an index i, then values[i] == target.
If the function returns -1, then target does not occur in values.

Executable helpers:

def is_sorted(values):
    return all(
        values[i] <= values[i + 1]
        for i in range(len(values) - 1)
    )

def contains(values, target):
    return any(value == target for value in values)

Postcondition checker:

def check_binary_search_result(values, target, result):
    if result == -1:
        return not contains(values, target)

    return (
        0 <= result < len(values)
        and values[result] == target
    )

This is not yet a proof, but it makes the intended behavior concrete and testable.

Adding Runtime Contract Checks

Add assertions around the implementation.

def checked_binary_search(values, target):
    assert is_sorted(values)

    result = binary_search(values, target)

    assert check_binary_search_result(
        values,
        target,
        result,
    )

    return result

Contract checks are useful during development. They catch mistakes in both the implementation and the assumptions used by callers.

Do not confuse them with a proof. Assertions run on particular executions. A proof covers all valid executions.

Finding the Loop Invariant

A loop invariant is a condition that holds:

  1. Before the loop starts.
  2. After every loop iteration.
  3. At loop exit, strong enough to prove the result.

For binary search with interval [left, right), the key invariant is:

If target occurs in values, then at least one occurrence is inside values[left:right].

Equivalently:

All positions before left contain values smaller than target.
All positions at or after right contain values greater than target.

For sorted arrays:

for all i < left: values[i] < target
for all i >= right: values[i] > target

The active search interval is the only place where the target can still appear.

Executable Invariant

Write the invariant as code.

def binary_search_invariant(values, target, left, right):
    if not (0 <= left <= right <= len(values)):
        return False

    for i in range(0, left):
        if values[i] >= target:
            return False

    for i in range(right, len(values)):
        if values[i] <= target:
            return False

    return True

Now instrument the loop:

def binary_search_with_invariant_checks(values, target):
    assert is_sorted(values)

    left = 0
    right = len(values)

    assert binary_search_invariant(values, target, left, right)

    while left < right:
        middle = left + (right - left) // 2

        if values[middle] < target:
            left = middle + 1
        elif values[middle] > target:
            right = middle
        else:
            assert values[middle] == target
            return middle

        assert binary_search_invariant(values, target, left, right)

    assert left == right
    assert not contains(values, target)

    return -1

This version checks the reasoning at runtime. If an update rule breaks the invariant, the assertion fails immediately.

Proving Initialization

At the start:

left = 0
right = len(values)

The interval is the whole array:

[0, len(values))

There are no positions before left, and no positions at or after right.

Therefore both invariant clauses hold vacuously:

for all i < 0: values[i] < target
for all i >= len(values): values[i] > target

Because those sets of indices are empty, the invariant holds before the loop.

This is the initialization step.

Proving Maintenance

Assume the invariant holds at the start of an iteration and left < right.

Compute:

middle = left + (right - left) // 2

So:

left <= middle < right

There are three cases.

Case 1:

values[middle] < target

Because the array is sorted, every index i <= middle has:

values[i] <= values[middle] < target

So no target can occur at or before middle. Updating:

left = middle + 1

preserves the invariant.

Case 2:

values[middle] > target

Because the array is sorted, every index i >= middle has:

values[i] >= values[middle] > target

So no target can occur at or after middle. Updating:

right = middle

preserves the invariant.

Case 3:

values[middle] == target

The function returns middle, and the postcondition holds immediately.

This is the maintenance step.

Proving Termination

The loop condition is:

while left < right:

The interval length is:

right - left

At every non-returning iteration, the interval becomes strictly smaller.

If values[middle] < target, then:

left becomes middle + 1

Since middle >= left, the new left is larger.

If values[middle] > target, then:

right becomes middle

Since middle < right, the new right is smaller.

The interval length is a nonnegative integer and strictly decreases. Therefore the loop terminates.

Proving the Failure Result

If the loop exits, then:

left == right

The invariant says:

all indices before left are too small
all indices at or after right are too large

Since left == right, every index is either before left or at or after right.

Therefore no index can contain target.

So returning:

return -1

satisfies the postcondition.

Property-Based Testing

The proof sketch explains correctness. Tests still matter because the implementation may not match the proof.

A simple randomized test:

import random

def test_binary_search_randomized():
    for _ in range(1000):
        values = sorted(
            random.randint(0, 100)
            for _ in range(random.randint(0, 50))
        )

        target = random.randint(0, 100)

        result = binary_search_with_invariant_checks(
            values,
            target,
        )

        assert check_binary_search_result(
            values,
            target,
            result,
        )

This test exercises empty arrays, duplicate values, missing targets, and present targets.

Testing Edge Cases

Write focused tests for boundary behavior.

def test_empty_array():
    assert binary_search([], 10) == -1
def test_single_present():
    assert binary_search([10], 10) == 0
def test_single_absent():
    assert binary_search([10], 5) == -1
def test_duplicates():
    values = [1, 2, 2, 2, 3]
    result = binary_search(values, 2)

    assert result in {1, 2, 3}

The specification for duplicates says the function may return any matching index. If you require the first matching index, that is a different specification.

Refining the Specification

Suppose the desired behavior is:

return the first index whose value equals target

Then the previous implementation is not enough. It may return any matching occurrence.

A first-position binary search:

def lower_bound(values, target):
    left = 0
    right = len(values)

    while left < right:
        middle = left + (right - left) // 2

        if values[middle] < target:
            left = middle + 1
        else:
            right = middle

    return left

Use it for first occurrence:

def binary_search_first(values, target):
    index = lower_bound(values, target)

    if index < len(values) and values[index] == target:
        return index

    return -1

Now the postcondition changes.

def check_first_result(values, target, result):
    if result == -1:
        return not contains(values, target)

    return (
        0 <= result < len(values)
        and values[result] == target
        and all(values[i] != target for i in range(result))
    )

This illustrates an important verification rule: the code can be correct only relative to a precise specification.

Invariant for Lower Bound

For lower_bound, the invariant is slightly different.

All positions before left contain values < target.
All positions at or after right contain values >= target.

Executable invariant:

def lower_bound_invariant(values, target, left, right):
    if not (0 <= left <= right <= len(values)):
        return False

    for i in range(0, left):
        if values[i] >= target:
            return False

    for i in range(right, len(values)):
        if values[i] < target:
            return False

    return True

The boundary changes from > to >=. That small difference is exactly the kind of detail verification makes visible.

Reference Implementation

A slow, obvious implementation is useful as a specification oracle.

def linear_search_first(values, target):
    for index, value in enumerate(values):
        if value == target:
            return index

    return -1

Test equivalence:

def test_first_matches_linear_search():
    for _ in range(1000):
        values = sorted(
            random.randint(0, 20)
            for _ in range(random.randint(0, 50))
        )

        target = random.randint(0, 20)

        assert binary_search_first(values, target) == linear_search_first(
            values,
            target,
        )

A reference implementation is usually too slow for production but excellent for testing and verification scaffolding.

Moving Toward Formal Verification

Runtime assertions and proof sketches are useful, but they do not mechanically prove correctness.

To move toward formal verification, encode:

  1. The data type.
  2. The sortedness predicate.
  3. The algorithm.
  4. The postcondition.
  5. The termination argument.
  6. The invariant.

In a proof assistant, the invariant and postcondition become formal statements. The proof checker validates every step.

Even without full mechanization, this workflow improves ordinary code because it forces:

  • explicit preconditions
  • explicit postconditions
  • loop invariants
  • termination arguments
  • edge-case coverage
  • reference-oracle testing

When Verification Is Worth It

Verification effort is justified when failure cost is high or the algorithm is reused heavily.

Good candidates:

Algorithm Why verification helps
Binary search variants Off-by-one errors are common
Parsers Invalid inputs can break assumptions
Schedulers Invariants are subtle
Consensus protocols Failure cost is high
Cryptographic code Correctness and side conditions matter
Memory allocators Bugs are severe
Financial calculations Incorrect results have direct cost
Compilers Transformations must preserve meaning

Not every function needs verification. The value is highest where the state space is large and tests alone are weak.

Testing the Proof Sketch

A proof sketch can also be wrong. Challenge it with mutations.

For example, introduce a bug:

def broken_binary_search(values, target):
    left = 0
    right = len(values)

    while left < right:
        middle = left + (right - left) // 2

        if values[middle] < target:
            left = middle
        elif values[middle] > target:
            right = middle
        else:
            return middle

    return -1

This can fail to terminate when:

left == middle

The termination argument catches the bug because the interval does not strictly decrease.

The invariant alone may not be enough. Correctness needs both invariant preservation and termination.

Common Bugs

The most common verification bug is writing the specification after the implementation and unconsciously matching the code rather than the desired behavior.

Another common bug is omitting the precondition. Binary search is not correct on unsorted input.

Loop invariants are often too weak. An invariant must be strong enough to prove the postcondition at loop exit.

Termination arguments are often ignored. A loop can preserve its invariant forever and still be wrong because it never exits.

Duplicate elements require precise specifications. “Find target” and “find first target” are different.

Assertions can give false confidence. They check executions, not all inputs.

Reference implementations can share the same bug if they are not independently designed.

Recipe

Build a verified algorithm sketch in layers.

Start with a precise precondition and postcondition. Write a simple implementation. Identify the loop invariant. Convert the invariant into executable assertions. Prove initialization, maintenance, and termination informally. Prove that the loop exit condition plus invariant implies the postcondition. Add randomized tests and edge-case tests. Use a slow reference implementation when available. Refine the specification when ambiguity appears. Move to a proof assistant only after the specification and invariant are stable.

The main lesson is that verification is disciplined explanation. The invariant states what the algorithm preserves. The termination argument states why it finishes. The postcondition states what the result means. Together, they turn an implementation from code that seems to work into code with a clear correctness argument.