TAOCP 1.4.3.2 Exercise 7

Section 1.4.3.2: Trace Routines

Exercise 7. [25] Discuss how to write an efficient jump trace routine, which emits much less output than a normal trace. Instead of displaying the register contents, a jump trace simply records the jumps that occur. It outputs a sequence of pairs $(x_1, y_1)$, $(x_2, y_2)$, ... , meaning that the program jumped from location $x_1$ to $y_1$, then (after performing the instructions in locations $y_1, y_1 + 1, ... , x_2$) it jumped from $x_2$ to $y_2$, etc. [From this information it is possible for a subsequent routine to reconstruct the flow of the program and to deduce how frequently each instruction was performed.]

Verified: no
Solve time: -

Solution

A jump trace routine should retain only the information necessary to reconstruct the control flow of the program. The ordinary trace routine records register contents after every instruction, but the discussion preceding this exercise shows that the essential complication in tracing lies in handling transfers of control. Since every non-jump instruction advances sequentially, it suffices to record only the jumps that actually occur.

The desired output is a sequence of pairs

$$ (x_1,y_1),\ (x_2,y_2),\ \ldots, $$

where the execution jumped from location $x_i$ to location $y_i$. Between two recorded jumps, execution proceeds consecutively through

$$ y_i,\ y_i+1,\ \ldots,\ x_{i+1}, $$

except when conditional jumps fail, since a failed conditional jump is indistinguishable from an ordinary instruction and need not be recorded.

The trace routine in the text already isolates jump instructions. Lines 27-30 replace every jump instruction by a transfer to JUMP, and lines 40-47 determine the target address and continue execution. Hence a jump trace routine can be obtained by simplifying the ordinary trace routine so that control is intercepted only at JUMP.

The main invariant is this: immediately before line 46, the simulated program counter equals the source location $x$ of the jump, and the A-register contains the destination address $y$. Therefore the pair $(x,y)$ can be output at that moment.

The routine may be organized as follows.

The traced program is executed normally, except that every jump instruction is modified exactly as in the given trace routine: its address field is replaced by JUMP. No register dump is performed after ordinary instructions, and no attempt is made to preserve all registers, since the only required information is the transfer of control.

At JUMP, the routine first determines whether the jump actually occurred. Conditional jumps that fail merely continue at the next instruction and should produce no output. The mechanism in lines 40-47 already distinguishes these cases. If the jump is not taken, execution resumes sequentially.

If the jump is taken, the routine writes the pair

$$ (x,y) $$

to an output buffer, where $x$ is obtained from PREG(0:2) and $y$ is the destination address in the A-register after execution of the modified jump. After recording the pair, execution resumes at location $y$.

The resulting control structure is:

Execute non-jump instructions directly.
Replace each jump instruction by a transfer to JUMP.
At JUMP, determine whether the jump was taken.
If not taken, continue with the next sequential instruction.
If taken, output

$$ (PREG(0:2), rA), $$

then continue execution at the address in $rA$.

This method is efficient in both time and space. The trace output is proportional to the number of successful jumps rather than to the total number of instructions executed. Since ordinary programs contain long stretches of sequential execution punctuated by comparatively few transfers of control, the reduction in output may be substantial.

A subsequent routine can reconstruct the dynamic execution path from the sequence of pairs. Suppose the recorded sequence is

$$ (x_1,y_1),\ (x_2,y_2),\ \ldots,\ (x_n,y_n). $$

Then execution began at the initial location, proceeded sequentially until $x_1$, jumped to $y_1$, proceeded through

$$ y_1,\ y_1+1,\ \ldots,\ x_2, $$

jumped to $y_2$, and so on. Consequently the frequency with which each instruction was executed can be computed by counting how often each interval

$$ [y_i,x_{i+1}] $$

contains the instruction in question.

The principal economy comes from eliminating all information except the essential discontinuities in program flow. Since the ordinary trace routine spends most of its time writing voluminous register information, and since output dominates the running time, a jump trace routine can be substantially faster while requiring much less storage for the trace.