TAOCP 3.1 Exercise 6
Since each $X_n$ lies in the set ${0,1,\ldots,m-1}$, there are only $m$ possible values.
Exercise 6. ▶ [M21] Suppose that we want to generate a sequence of integers $X_0, X_1, X_2, \ldots$ in the range $0 \le X_n < m$, using a function such that $0 \le x < m$ implies $0 \le f(x) < m$. Consider a sequence formed by the rule $X_{n+1} = f(X_n)$. (Examples are the middle-square method and Algorithm K.)
a) Show that the sequence is ultimately periodic, in the sense there exist numbers $\lambda$ and $\mu$ for which the values
$$X_0, \ X_1, \ \ldots, \ X_\mu, \ \ldots, \ X_{\lambda+\mu-1}$$
are distinct, but $X_{\lambda+\mu} = X_\mu$ when $n \ge \mu$. Find the maximum and minimum possible values of $\mu$ and $\lambda$.
b) (R. W. Floyd.) Show that there exists an $n > 0$ such that $X_n = X_{2n}$; and the smallest such value of $n$ lies in the range $\mu \le n \le \lambda + \mu$. Furthermore the value of $X_n$ is unique in the sense that if $X_n = X_{2n'}$ and $X_{n'} = X_{2n'}$, then $X_{n'} = X_n$.
c) Use the ideas of (b) to design an algorithm that calculates $\mu$ and $\lambda$ for any given function $f$ and any given $X_0$, using only $O(\mu + \lambda)$ steps and only a bounded number of memory locations.
Verified: yes
Solve time: 1m41s
Solution
(a)
Since each $X_n$ lies in the set ${0,1,\ldots,m-1}$, there are only $m$ possible values. Therefore among the $m+1$ terms
$$ X_0,X_1,\ldots,X_m $$
some value must occur at least twice. Let
$$ X_i=X_j,\qquad 0\le i<j\le m, $$
with $j$ chosen as small as possible.
Because the recurrence is deterministic,
$$ X_{i+t}=X_{j+t}\qquad (t\ge0). $$
Hence the sequence repeats with period
$$ \lambda=j-i. $$
Let
$$ \mu=i. $$
Then
$$ X_{\mu+\lambda}=X_\mu, $$
and consequently
$$ X_{\mu+r+\lambda}=X_{\mu+r}\qquad (r\ge0). $$
The choice of $j$ as the first repeated occurrence implies that
$$ X_0,X_1,\ldots,X_{\mu+\lambda-1} $$
are all distinct. Thus
$$ X_0,\ldots,X_{\mu-1} $$
form a nonrepeating tail of length $\mu$, and
$$ X_\mu,\ldots,X_{\mu+\lambda-1} $$
form a cycle of length $\lambda$. Therefore the sequence is ultimately periodic.
Since the $\mu+\lambda$ values
$$ X_0,\ldots,X_{\mu+\lambda-1} $$
are distinct and belong to a set of $m$ elements,
$$ \mu+\lambda\le m. $$
Also
$$ \lambda\ge1,\qquad \mu\ge0. $$
To maximize $\mu$, minimize $\lambda$. Since $\lambda\ge1$,
$$ \mu\le m-1. $$
This bound is attained by
$$ X_0,X_1,\ldots,X_{m-1} $$
all distinct and
$$ X_m=X_{m-1}, $$
which gives
$$ \mu=m-1,\qquad \lambda=1. $$
To minimize $\mu$, take the starting point already on a cycle. Then
$$ \mu=0. $$
To maximize $\lambda$, take $\mu=0$ and let all $m$ states lie on one cycle. Then
$$ \lambda=m. $$
To minimize $\lambda$, use a fixed point; then
$$ \lambda=1. $$
Hence
$$ \boxed{0\le\mu\le m-1}, \qquad \boxed{1\le\lambda\le m}. $$
The extreme values are attainable.
(b)
Let
$$ Y_t=X_t,\qquad Z_t=X_{2t}. $$
For $t\ge\mu$, both indices lie in the periodic part of the sequence. Let
$$ k\equiv t-\mu \pmod{\lambda}. $$
Then
$$ Y_t=X_{\mu+k}, $$
while
$$ Z_t=X_{2t} =X_{\mu+\bigl((2t-\mu)\bmod\lambda\bigr)}. $$
Thus $Y_t=Z_t$ exactly when
$$ t-\mu\equiv 2t-\mu \pmod{\lambda}, $$
that is,
$$ t\equiv0\pmod{\lambda}. $$
Among multiples of $\lambda$, the first one not smaller than $\mu$ is
$$ n=\lambda\Bigl\lceil\frac{\mu}{\lambda}\Bigr\rceil. $$
Hence
$$ \mu\le n\le\mu+\lambda-1<\mu+\lambda, $$
so
$$ \mu\le n\le\mu+\lambda. $$
For this $n$,
$$ X_n=X_{2n}. $$
Therefore such an $n>0$ always exists.
Now suppose
$$ X_n=X_{2n},\qquad X_{n'}=X_{2n'}. $$
The equality $X_t=X_{2t}$ implies $t\ge\mu$, because all values before index $\mu$ are distinct. Once $t\ge\mu$, the preceding congruence argument shows that
$$ t\equiv0\pmod{\lambda}. $$
Hence both $n$ and $n'$ are multiples of $\lambda$ lying in the cycle. Every multiple of $\lambda$ in the periodic region corresponds to the same cycle position:
$$ X_n=X_\mu=X_{n'}. $$
Therefore the value $X_n$ is unique.
(c)
The preceding part suggests Floyd's cycle-finding method.
First determine a point on the cycle.
Set
$$ T\leftarrow f(X_0),\qquad H\leftarrow f(f(X_0)). $$
Repeatedly perform
$$ T\leftarrow f(T),\qquad H\leftarrow f(f(H)) $$
until
$$ T=H. $$
The variable $T$ advances one step at a time, while $H$ advances two. Part (b) guarantees that a meeting occurs after $O(\mu+\lambda)$ evaluations of $f$.
Having found a meeting point, determine $\mu$.
Set
$$ T\leftarrow X_0. $$
Keep $H$ at the meeting point. Repeatedly perform
$$ T\leftarrow f(T),\qquad H\leftarrow f(H), $$
counting the number of iterations. Let that count be $\mu$.
To justify this, write the meeting point as $X_{\mu+r\lambda}$. After $s$ simultaneous steps the variables contain
$$ T=X_s, \qquad H=X_{\mu+r\lambda+s}. $$
For $s<\mu$, the values $X_s$ are distinct from all later values. When $s=\mu$,
$$ H=X_{\mu+r\lambda+\mu}=X_{2\mu+r\lambda}=X_\mu, $$
because adding a multiple of $\lambda$ does not change a position on the cycle. Thus the first meeting occurs precisely at $s=\mu$.
Finally determine $\lambda$.
Set
$$ \lambda\leftarrow1, \qquad H\leftarrow f(T), $$
where $T=X_\mu$.
Repeatedly perform
$$ H\leftarrow f(H),\qquad \lambda\leftarrow\lambda+1 $$
until
$$ H=T. $$
Since $T$ lies on the cycle, this counts exactly one traversal of the cycle, hence the resulting value is $\lambda$.
The algorithm uses only a bounded number of variables: $T$, $H$, $\mu$, and $\lambda$. The first phase requires $O(\mu+\lambda)$ evaluations of $f$; the second requires $\mu$ evaluations; the third requires $\lambda$ evaluations. Therefore the total running time is
$$ O(\mu+\lambda), $$
and the memory usage is bounded independently of $\mu$ and $\lambda$.
Thus Floyd's method computes both $\mu$ and $\lambda$ using only $O(\mu+\lambda)$ steps and a bounded number of memory locations. ∎