TAOCP 2.2.2 Exercise 12

Let $a_1,a_2,\ldots,a_m$ be a sequence of insertions as in exercise 9, with $n=2$.

Section 2.2.2: Sequential Allocation

Exercise 12. [M28] The advantage of having two tables coexist in memory by growing towards each other, rather than by having them kept in separate independently bounded areas, may be quantitatively estimated (to a certain extent) as follows. Use the model of exercise 9 with $n = 2$: for each of the $2^m$ equally probable sequences $a_1, a_2, \ldots, a_m$, let there be $k_1$ 1s and $k_2$ 2s. (Here $k_1$ and $k_2$ are the respective sizes of the two tables after the memory is full. We are able to run the algorithm with $m = k_1 + k_2$ locations when the tables are adjacent, instead of $2\max(k_1, k_2)$ locations to get the same effect with separate tables.)

What is the average value of $\max(k_1, k_2)$?

Verified: no
Solve time: -


Setup

Let $a_1,a_2,\ldots,a_m$ be a sequence of insertions as in exercise 9, with $n=2$. Since each $a_j$ is independently equal to $1$ or $2$ with probability $1/2$, all $2^m$ sequences are equally probable.

For a given sequence, let $k_1$ be the number of $1$s and $k_2$ the number of $2$s. Since every insertion goes into exactly one of the two tables,

$$ k_1+k_2=m. $$

The quantity to be determined is the average value of

$$ \max(k_1,k_2). $$

Because $k_2=m-k_1$,

$$ \max(k_1,k_2) = \frac{k_1+k_2+|k_1-k_2|}{2} = \frac{m+|k_1-k_2|}{2}. $$

Hence it suffices to determine the expected value of $|k_1-k_2|$.

Solution

Since $k_1+k_2=m$,

$$ k_1-k_2=2k_1-m. $$

The random variable $k_1$ has binomial distribution,

$$ \Pr(k_1=r)=\frac{\binom{m}{r}}{2^m}, \qquad 0\le r\le m. $$

Therefore

$$ E\bigl(\max(k_1,k_2)\bigr) = \frac{m}{2} + \frac12,E(|k_1-k_2|), $$

and

$$ E(|k_1-k_2|) = \sum_{r=0}^{m} |2r-m| \frac{\binom{m}{r}}{2^m}. $$

Thus

$$ E\bigl(\max(k_1,k_2)\bigr) = \frac{m}{2} + \frac{1}{2^{m+1}} \sum_{r=0}^{m} |2r-m| \binom{m}{r}. \tag{1} $$

This expression may be simplified.

First suppose $m=2s$ is even. Then

$$ |2r-m|=2|r-s|, $$

so by symmetry of the binomial coefficients,

$$ \sum_{r=0}^{2s} |2r-2s| \binom{2s}{r} = 4\sum_{r=s+1}^{2s} (r-s)\binom{2s}{r}. $$

Using the identity

$$ r\binom{2s}{r} = 2s\binom{2s-1}{r-1}, $$

we obtain

$$ \begin{aligned} \sum_{r=s+1}^{2s} (r-s)\binom{2s}{r} &= \sum_{r=s+1}^{2s} r\binom{2s}{r} - s\sum_{r=s+1}^{2s} \binom{2s}{r} \[6pt] &= 2s\sum_{r=s+1}^{2s} \binom{2s-1}{r-1} - s,2^{2s-1}. \end{aligned} $$

Since

$$ \sum_{r=s+1}^{2s} \binom{2s-1}{r-1} = \sum_{t=s}^{2s-1} \binom{2s-1}{t} = 2^{2s-2}, $$

it follows that

$$ \sum_{r=s+1}^{2s} (r-s)\binom{2s}{r} = s,2^{2s-1} - s,2^{2s-1} + s\binom{2s-1}{s} = s\binom{2s-1}{s}. $$

Hence

$$ \sum_{r=0}^{2s} |2r-2s| \binom{2s}{r} = 4s\binom{2s-1}{s} = 2s\binom{2s}{s}, $$

because

$$ \binom{2s}{s} = 2\binom{2s-1}{s}. $$

Substituting into (1),

$$ E\bigl(\max(k_1,k_2)\bigr) = s+\frac{s}{2^{2s}}\binom{2s}{s}. $$

Therefore, when $m=2s$,

$$ \boxed{ E\bigl(\max(k_1,k_2)\bigr) = \frac{m}{2} + \frac{m}{2^{m+1}} \binom{m}{m/2} }. $$

Now suppose $m=2s+1$ is odd. Then

$$ |2r-(2s+1)| = 2\left|r-s-\tfrac12\right|. $$

Symmetry gives

$$ \sum_{r=0}^{2s+1} |2r-(2s+1)| \binom{2s+1}{r} = 4\sum_{r=s+1}^{2s+1} \left(r-s-\tfrac12\right) \binom{2s+1}{r}. $$

The same calculation yields

$$ \sum_{r=0}^{2s+1} |2r-(2s+1)| \binom{2s+1}{r} = (2s+1)\binom{2s}{s}. $$

Substituting into (1),

$$ E\bigl(\max(k_1,k_2)\bigr) = \frac{2s+1}{2} + \frac{2s+1}{2^{2s+2}} \binom{2s}{s}. $$

Since $m=2s+1$,

$$ \boxed{ E\bigl(\max(k_1,k_2)\bigr) = \frac{m}{2} + \frac{m}{2^{m+1}} \binom{m-1}{(m-1)/2} }. $$

Combining both cases,

$$ \boxed{ E\bigl(\max(k_1,k_2)\bigr) = \frac{m}{2} + \frac{\lfloor (m+1)/2\rfloor}{2^m} \binom{m}{\lfloor m/2\rfloor} }. $$

Verification

Take $m=2$. The four equally probable sequences are

$$ 11,\ 12,\ 21,\ 22. $$

Their values of $\max(k_1,k_2)$ are

$$ 2,\ 1,\ 1,\ 2. $$

The average is

$$ \frac{2+1+1+2}{4} = \frac32. $$

The formula gives

$$ \frac{2}{2} + \frac{2}{2^3}\binom{2}{1} = 1+\frac12 = \frac32. $$

Take $m=3$. The values are $3$ for $111$ and $222$, and $2$ for the remaining six sequences. Hence

$$ \frac{2\cdot 3+6\cdot 2}{8} = \frac94. $$

The formula gives

$$ \frac32 + \frac{3}{2^4}\binom{2}{1} = \frac32+\frac34 = \frac94. $$

The results agree.

Notes

Since the adjacent-table method requires only $m=k_1+k_2$ locations, while separate bounded areas require $2\max(k_1,k_2)$ locations, the average memory requirement for separate areas is

$$ 2E\bigl(\max(k_1,k_2)\bigr) = m + \frac{2\lfloor (m+1)/2\rfloor}{2^m} \binom{m}{\lfloor m/2\rfloor}. $$

By Stirling's formula,

$$ \binom{m}{\lfloor m/2\rfloor} \sim \frac{2^m}{\sqrt{\pi m/2}}, $$

so the excess over $m$ grows asymptotically like

$$ \sqrt{\frac{m}{2\pi}}. $$

Thus the advantage of adjacent growth increases on the order of $\sqrt{m}$.