TAOCP 2.3.4.3 Exercise 8

Define the embedding $T \subseteq T'$ for finite ordered trees as follows.

Section 2.3.4.3:

Exercise 8. [**] [M39] (J. B. Kruskal.) If $T$ and $T'$ are (finite, ordered) trees, let the notation $T \subseteq T'$ signify that $T$ can be embedded in $T'$, as in exercise 2.3.2-22. Prove that if

$$ T_1,T_2,T_3,\ldots $$

is any infinite sequence of trees, there exist integers $j<k$ such that $T_j\subseteq T_k$. (In other words, it is impossible to construct an infinite sequence of trees in which no tree contains any of the earlier trees of the sequence. This fact can be used to prove that certain algorithms must terminate.)

Some of the most instructive applications of the mathematical theory of trees to the analysis of algorithms are connected with formulas for counting how many different trees there are of various kinds. For example, if we want to know how many different oriented trees can be constructed having four indistinguishable vertices, we find that there are just 4 possibilities:

Four oriented trees

$$(1)$$

For our first enumeration problem, let us determine the number $a_n$ of structurally different oriented trees with $n$ vertices. Obviously, $a_1=1$. If $n>1$, the tree has a root and various subtrees; suppose there are $j_1$ subtrees with 1 vertex, $j_2$ with 2 vertices, etc. Then we may choose $j_k$ of the $a_k$ possible $k$-vertex trees in

$$ \binom{a_k+j_k-1}{j_k} $$

ways, since repetitions are allowed (exercise 1.2.6-60), and so we see that

$$ a_n = \sum_{j_1+2j_2+\cdots=n-1} \binom{a_1+j_1-1}{j_1} \binom{a_2+j_2-1}{j_2}\cdots \binom{a_{n-1}+j_{n-1}-1}{j_{n-1}}, \qquad n>1. \tag{2} $$

If we consider the generating function

$$ A(z)=\sum_n a_nz^n, \qquad a_0=0, $$

we find that the identity

$$ \frac{1}{(1-z^r)^a}=\sum_{j\ge0}\binom{a+j-1}{j}z^{rj} $$

together with (2) implies

$$ A(z)=\frac{z}{(1-z)^{a_1}(1-z^2)^{a_2}(1-z^3)^{a_3}\cdots}. \tag{3} $$

This is not an especially nice form for $A(z)$, since it involves an infinite product and the coefficients $a_1,a_2,\ldots$ appear on the right-hand side. A somewhat more aesthetic way to represent $A(z)$ is given in exercise 1; it leads to a reasonably

Verified: no
Solve time: 11m56s


Define the embedding $T \subseteq T'$ for finite ordered trees as follows. Each tree is either a single vertex or a root with an ordered finite sequence of subtrees

$$ T = (T^{(1)},\dots,T^{(m)}). $$

We write $T \subseteq T'$ if there exists an increasing map $f : {1,\dots,m} \to {1,\dots,m'}$ such that for every $i$,

$$ T^{(i)} \subseteq {T'}^{(f(i))}, $$

recursively, and the root is mapped to the root. This is the standard ordered-tree embedding.

We prove: in every infinite sequence $T_1,T_2,\dots$ there exist $j<k$ with $T_j \subseteq T_k$.

Step 1. Induction on number of vertices

Let $P(n)$ be the statement: there is no infinite bad sequence of trees all of which have at most $n$ vertices.

We prove $P(n)$ for all $n$ by induction.

For $n=1$, there is only one tree (a single vertex), so any infinite sequence contains repetitions, hence an embedding holds. So $P(1)$ is true.

Assume $P(m)$ holds for all $m<n$. We prove $P(n)$.

Step 2. Reduce to trees of maximal size $n$

Assume for contradiction that there is an infinite bad sequence $(T_i)$ with all $T_i$ having at most $n$ vertices.

Let $n$ be minimal such that infinitely many terms of the sequence have size $n$. Pass to the subsequence $S_1,S_2,\dots$ consisting of all trees of size exactly $n$. This subsequence is still infinite and bad.

So it suffices to consider an infinite bad sequence where every tree has exactly $n$ vertices.

Step 3. Represent trees as sequences over smaller trees

Each tree $S_i$ has root with ordered children

$$ S_i = (A_{i,1},A_{i,2},\dots,A_{i,m_i}), $$

where each $A_{i,j}$ is a subtree of strictly smaller size than $n$. Hence every $A_{i,j}$ has at most $n-1$ vertices.

Let $\mathcal{T}_{<n}$ be the set of all trees with fewer than $n$ vertices.

By the induction hypothesis $P(n-1)$, the set $\mathcal{T}_{<n}$ is a well-quasi-order under $\subseteq$.

Thus each $S_i$ can be viewed as a finite sequence (a word)

$$ S_i \equiv (A_{i,1},\dots,A_{i,m_i}) $$

over the alphabet $\mathcal{T}_{<n}$, equipped with the wqo $\subseteq$.

Step 4. Apply Higman’s lemma

We use Higman’s lemma:

If a set $X$ is well-quasi-ordered, then the set of finite sequences over $X$ is also well-quasi-ordered under the subsequence embedding order (respecting the order in the sequence).

Apply this to $X = \mathcal{T}_{<n}$. Then the set of finite sequences of trees of size $<n$ is a wqo.

Therefore the sequence $S_1,S_2,\dots$, viewed as words over $X$, cannot be bad. Hence there exist $j<k$ such that $S_j$ embeds into $S_k$ as sequences: there exists a strictly increasing map

$$ f : {1,\dots,m_j} \to {1,\dots,m_k} $$

such that for all $t$,

$$ A_{j,t} \subseteq A_{k,f(t)}. $$

Step 5. Lift sequence embedding to tree embedding

We now verify that this implies tree embedding.

Since each matched pair of subtrees satisfies

$$ A_{j,t} \subseteq A_{k,f(t)}, $$

we can recursively combine these embeddings under the root. The increasing map $f$ preserves the left-to-right order of children, so the root of $S_j$ maps to the root of $S_k$, and each subtree embeds into a corresponding subtree in order.

Thus $S_j \subseteq S_k$ as ordered trees.

This contradicts that the sequence was bad.

Step 6. Completion of induction

We have shown that no bad sequence exists for trees of size $n$, assuming the same for smaller sizes. Therefore $P(n)$ holds for all $n$.

Step 7. Conclusion

For any infinite sequence of finite ordered trees $T_1,T_2,\dots$, choose $n$ large enough so that infinitely many terms have size at most $n$. Applying $P(n)$ to that subsequence yields $j<k$ with

$$ T_j \subseteq T_k. $$

Hence every infinite sequence of finite ordered trees contains an embedded pair. ∎