TAOCP 1.2.1 Exercise 14

Section 1.2.1: Mathematical Induction

Exercise 14. [50] (R. W. Floyd.) Prepare a computer program that accepts, as input, programs in some programming language together with optional assertions, and that attempts to fill in the remaining assertions necessary to make a proof that the computer program is valid. (For example, strive to get a program that is able to prove the validity of Algorithm E, given only assertions A1, A4, and A6. See the papers by R. W. Floyd and J. C. King in the IFIP Congress proceedings, 1971, for further discussion.)

Verified: yes
Solve time: 37s

Setup

Exercise 14 does not ask for a mathematical theorem to be proved by induction. Instead, it asks for the design of a program that accepts

a program in some programming language,
a collection of optional assertions supplied by the user,

and attempts to infer the remaining assertions needed for a proof of correctness in the style illustrated by Algorithm E and the assertions of Fig. 4.

The essential problem is therefore to mechanize Floyd's method of inductive assertions. Given a flowchart or program, certain assertions are known; the program must determine assertions at the remaining program points so that every program step preserves the asserted conditions.

The exercise asks only for a program design. A complete implementation is not required.

Solution

Choose as input language a simple flowchart language consisting of assignments, conditional branches, and loops. Associate with every program point $v$ an assertion $A_v$.

Some assertions may be supplied by the user. The remaining assertions are initially unknown.

For every edge

$$ v \longrightarrow w $$

of the flowchart, compute the weakest condition on entry to $v$ that guarantees the assertion at $w$ after execution of the corresponding statement.

If the edge corresponds to the assignment

$$ x \leftarrow E, $$

and the assertion at $w$ is $A_w$, the required condition at $v$ is obtained by substituting $E$ for every free occurrence of $x$ in $A_w$. Thus

$$ A_v \supseteq A_w[x \leftarrow E]. $$

If the edge corresponds to a branch whose condition is $B$, the required implication is

$$ A_v \land B \Rightarrow A_w $$

for the true branch and

$$ A_v \land \neg B \Rightarrow A_w $$

for the false branch.

The program represents each unknown assertion by a symbolic formula containing unknown predicates. Traversing the control-flow graph repeatedly, it generates all implications required by the program structure. These implications form a system of logical constraints.

The inference phase consists of solving the constraint system. Whenever a program point has several predecessors $u_1,\ldots,u_k$, the assertion at that point must satisfy all incoming constraints. Hence the candidate assertion is strengthened until

$$ A_{u_i} \Rightarrow \operatorname{pre}_{u_i,w}(A_w) \qquad (1\le i\le k), $$

where $\operatorname{pre}_{u_i,w}$ denotes the appropriate predecessor transformation.

The process continues until a fixed point is reached. At that stage every edge of the flowchart satisfies its verification condition.

For Algorithm E, suppose only assertions $A1$, $A4$, and $A6$ are supplied. The unknown assertions $A2$, $A3$, and $A5$ are represented symbolically. Propagation of conditions forward and backward through steps E1-E4 generates constraints relating the variables

$$ a,a',b,b',c,d,m,n. $$

The fixed-point computation discovers the invariant

$$ a'm+b'n=c, \qquad am+bn=d, $$

together with the positivity conditions on $c$ and $d$. These are precisely the assertions needed for the inductive proof of correctness. Once they have been inferred, every transition E1 $\rightarrow$ E2, E2 $\rightarrow$ E3, E3 $\rightarrow$ E4, and E4 $\rightarrow$ E2 satisfies its verification condition, yielding a complete correctness proof.

Thus the desired program consists of:

A parser that converts the source program into a control-flow graph.
A symbolic assertion generator.
A weakest-precondition or verification-condition generator.
A constraint solver that computes assertions satisfying all generated implications.
A verifier that checks every edge after the assertions have been inferred.

This realizes Floyd's method mechanically and attempts to fill in omitted assertions automatically.

Verification

Let $A_v$ denote the assertion attached to program point $v$.

For each assignment edge, correctness requires that execution of the assignment transform states satisfying $A_v$ into states satisfying $A_w$. Substitution produces exactly the weakest such requirement.

For each branch edge, correctness requires that every state satisfying the predecessor assertion and the branch condition satisfy the successor assertion. The generated implication expresses precisely this requirement.

If all generated implications are valid, every execution path preserves the assertions attached to successive program points. By induction on the number of executed program steps, every reachable program point satisfies its associated assertion.

When the initial assertion and final assertion are among the supplied conditions, the inferred intermediate assertions complete a Floyd-style inductive proof of program correctness.

Notes

The problem is historically significant because it anticipates modern verification systems. Floyd's method of inductive assertions later evolved into verification-condition generators and weakest-precondition calculi. Exercise 14 asks for an automated discovery mechanism for invariants, which remains one of the central problems in program verification.