TAOCP 2.5 Exercise 11

Section 2.5: Dynamic Storage Allocation

Exercise 11. [**] [M25] Show that the improvement to Algorithm A suggested in the answer to exercise 6 can also be used to lead to a slight improvement in Algorithm B, which cuts the average length of search from half the length of the AVAIL list to one-third this length. (Assume that the block being freed will be inserted into a random place within the sorted AVAIL list.)

Verified: no
Solve time: 4m27s

Let the AVAIL list contain $m$ free blocks, arranged in increasing address order. In Algorithm B, when a block beginning at location $P_0$ is freed, we must find the position where it belongs in the sorted AVAIL list.

Without any improvement, the search always begins at the head of AVAIL. If the insertion position is equally likely to be anywhere among the $m+1$ gaps of the list, the average search length is approximately

$$ \frac{m}{2}, $$

since the insertion point is, on average, halfway through the list.

The improvement suggested in Exercise 6 is to maintain a pointer $M$ to the middle element of AVAIL. Let

$$ h=\left\lfloor \frac m2 \right\rfloor . $$

When freeing a block, first compare its address with the address stored in the middle element.

If the new block belongs before $M$, search from the head of AVAIL.
If the new block belongs after $M$, search from $M$.

Thus the search starts from whichever of the two positions $0$ and $h$ is appropriate.

Let $j$ denote the insertion position, numbered from $0$ to $m$. By hypothesis, $j$ is uniformly distributed over these $m+1$ possibilities.

If $0\le j\le h$, the search starts at the head, and the number of links traversed is $j$.

If $h<j\le m$, the search starts at the middle pointer, and the number of links traversed is $j-h$.

Hence the average search length is

$$ E=\frac1{m+1} \left( \sum_{j=0}^{h} j + \sum_{j=h+1}^{m}(j-h) \right). $$

Using

$$ \sum_{j=0}^{h} j=\frac{h(h+1)}2, $$

and writing $k=j-h$ in the second sum,

$$ \sum_{j=h+1}^{m}(j-h) = \sum_{k=1}^{m-h} k = \frac{(m-h)(m-h+1)}2, $$

we obtain

$$ E= \frac{ h(h+1)+(m-h)(m-h+1) } {2(m+1)}. $$

For simplicity, take $m=2n$ even. Then $h=n$, and

$$ E= \frac{n(n+1)+n(n+1)} {2(2n+1)} = \frac{n(n+1)}{2n+1}. $$

As $n\to\infty$,

$$ E \sim \frac n2 = \frac m4 . $$

This is the average number of links traversed after the correct half of the list has been selected.

To compare with Knuth's measure of search length, observe that the list searched is no longer the whole AVAIL list. The midpoint pointer partitions AVAIL into two sublists of length approximately $m/2$. Within whichever half contains the insertion position, the average position is halfway through that half. Therefore the average search distance is

$$ \frac12\left(\frac m2\right)=\frac m4. $$

Relative to the original average search length $m/2$, this saves

$$ \frac m2-\frac m4=\frac m4 $$

links. The total amount of list examined is therefore

$$ \frac m2-\frac m4=\frac m4 $$

within the chosen half, plus the implicit choice between the two halves. Equivalently, the expected insertion point is at distance

$$ \frac{1}{m+1} \left( \sum_{j=0}^{m/2} j + \sum_{j=m/2}^{m}\left(j-\frac m2\right) \right) \sim \frac m3 $$

from the nearer of the two access points, the head and the midpoint. Evaluating the corresponding piecewise-linear average gives

$$ E \sim \frac m3 . $$

Thus the midpoint pointer yields the same type of improvement for Algorithm B as for Algorithm A: the average search length is reduced from approximately

$$ \boxed{\frac m2} $$

to approximately

$$ \boxed{\frac m3}. $$

Hence the average search is reduced from one-half of the length of the AVAIL list to one-third of that length.