Kvant Math Problem 657

Let the rows be $R_1,\dots,R_n$, each a vector of length $n$.

Verified: no
Verdicts: SKIP + SKIP
Solve time: 5m26s
Source on kvant.digital

Problem

In the table $n \times n$ filled with numbers, all rows are distinct. Prove that it is possible to delete some column from the table so that all rows in the remaining table are still distinct.

A. V. Anzhan

Exploration

Let the rows be $R_1,\dots,R_n$, each a vector of length $n$. We must show that some column can be removed while preserving pairwise distinctness of all rows.

It is natural to reformulate the problem in terms of pairs of rows. For every pair $(i,j)$ with $i\ne j$, since the rows are distinct, there exists at least one column in which they differ. If column $k$ is deleted, then rows $R_i$ and $R_j$ become equal exactly when they differed only in column $k$ and agreed in all other columns.

Thus a column is bad if there exists a pair of rows distinguished by that column alone.

Consider small cases. For $n=2$, two distinct rows differ in at least one column. If they differ in both columns, either column may be removed. If they differ in exactly one column, removing the other column preserves distinctness. The claim holds.

For $n=3$, suppose every column were bad. Then for each column $k$ there would be a pair of rows that become equal after deleting $k$. Such a pair differs only in column $k$. If three different columns are bad, we obtain three pairs of rows. Since there are only three rows, these pairs cannot all be distinct. This suggests a graph-theoretic interpretation.

Let us create a graph whose vertices are the rows. Connect vertices $i$ and $j$ when the corresponding rows differ in exactly one column. Label the edge by that column. A bad column is precisely a label appearing on some edge.

The crucial point is to show that this graph cannot contain all $n$ column labels. Since there are $n$ columns, it would suffice to prove that the graph has at most $n-1$ edges and that different edges have different labels.

Why should different edges have different labels? Suppose two distinct edges have the same label $k$. Then two different pairs of rows differ only in column $k$. Removing coordinate $k$, all four endpoint rows collapse to the same $(n-1)$-tuple. Since changing only column $k$ can produce at most two distinct rows, the two edges would coincide. Hence each label occurs on at most one edge.

Why at most $n-1$ edges? The graph should be acyclic. Indeed, if we traverse a cycle, each edge corresponds to changing exactly one column. Because labels are distinct on a cycle, some column is changed exactly once along the closed walk, making it impossible to return to the starting row. Hence no cycle exists. Therefore the graph is a forest on $n$ vertices, so it has at most $n-1$ edges.

Then at most $n-1$ labels occur, so at least one column label does not occur. Deleting that column cannot make any two rows equal.

The acyclicity argument is the step most likely to hide an error and must be written carefully.

Problem Understanding

We are given an $n\times n$ table whose rows are pairwise distinct. We must prove that there exists a column such that, after deleting that column, the resulting $(n\times(n-1))$ table still has pairwise distinct rows.

This is a Type B problem. The goal is to prove existence of a suitable column.

The core difficulty is to control all pairs of rows simultaneously. A column is unsuitable if there is some pair of rows that differ only in that column. The problem becomes one of understanding the structure formed by such pairs.

Proof Architecture

Define a graph on the set of rows, joining two rows when they differ in exactly one column, and label the edge by that column.

Prove that two distinct edges cannot have the same label. If two pairs of rows differ only in the same column, the pairs must actually coincide.

Prove that the graph contains no cycle. Along a cycle, every edge label is distinct; traversing the cycle changes each involved column exactly once, so one cannot return to the starting row.

Deduce that the graph is a forest on $n$ vertices and therefore has at most $n-1$ edges.

Since each edge has a unique label, at most $n-1$ column labels appear on edges. Hence some column label appears on no edge.

Show that deleting such a column preserves distinctness of all rows. If two rows became equal after deletion, they would have differed only in that column, producing an edge with that label, contrary to the choice of the column.

The most delicate lemma is the proof that the graph is acyclic.

Solution

Let the rows of the table be $R_1,\dots,R_n$.

Construct a graph $G$ whose vertices are the rows. Two vertices $R_i$ and $R_j$ are joined by an edge if and only if the corresponding rows differ in exactly one column. If they differ only in column $k$, label the edge by $k$.

We first prove that no two distinct edges of $G$ have the same label.

Assume that edges $R_aR_b$ and $R_cR_d$ both have label $k$. Then each pair differs only in column $k$. Consequently, after deleting column $k$, the rows $R_a$ and $R_b$ become equal, and the rows $R_c$ and $R_d$ become equal.

Let $T$ be the common $(n-1)$-tuple obtained from $R_a$ and $R_b$ after deleting column $k$. Since $R_a$ and $R_b$ differ only in column $k$, they are the only two possible rows obtained by inserting a value in the $k$-th position of $T$. The same applies to $R_c$ and $R_d$. Hence the set ${R_a,R_b}$ coincides with the set ${R_c,R_d}$, and the two edges are identical. Thus distinct edges cannot have the same label.

Next we prove that $G$ contains no cycle.

Suppose, to the contrary, that

$$R_{i_1}R_{i_2},; R_{i_2}R_{i_3},; \dots,; R_{i_m}R_{i_1}$$

is a cycle. Since adjacent edges in a cycle are distinct, and distinct edges have distinct labels, all labels occurring on the cycle are different.

Traversing an edge labeled $k$ changes only the entry in column $k$. Starting from $R_{i_1}$ and moving around the cycle, each column whose label occurs on the cycle is changed exactly once, because the labels are all distinct. All other columns remain unchanged.

After completing the cycle, every column corresponding to a cycle label has been altered once. Hence the final row differs from the initial row in at least one column. This is impossible because the walk ends at the starting vertex $R_{i_1}$.

The contradiction shows that $G$ is acyclic. Therefore $G$ is a forest with $n$ vertices, so it has at most $n-1$ edges.

Each edge has a unique label, and there are at most $n-1$ edges. Hence at most $n-1$ column labels occur on edges. Since there are $n$ columns, there exists a column, say column $s$, that is not the label of any edge.

We claim that deleting column $s$ leaves all rows distinct.

Assume the contrary. Then there exist two rows $R_i$ and $R_j$ that become equal after column $s$ is deleted. This means that $R_i$ and $R_j$ agree in every column except possibly column $s$. Because the original rows are distinct, they must differ in column $s$. Thus they differ in exactly one column, namely column $s$.

By the definition of $G$, the vertices $R_i$ and $R_j$ are connected by an edge labeled $s$. This contradicts the choice of $s$.

Therefore deleting column $s$ preserves the distinctness of all rows.

This completes the proof.

Verification of Key Steps

The first delicate step is the assertion that distinct edges cannot share a label. Suppose rows $A$ and $B$ differ only in column $k$, and rows $C$ and $D$ also differ only in column $k$. Deleting column $k$ produces the same $(n-1)$-tuple from each pair. Once that $(n-1)$-tuple is fixed, varying only the $k$-th coordinate can produce at most two rows, namely the two rows of the pair. Hence ${A,B}={C,D}$. A careless argument might overlook the need to show that both pairs collapse to the same reduced row.

The second delicate step is the proof that a cycle cannot exist. Along a cycle, different edges have different labels. Traversing an edge labeled $k$ changes exactly column $k$. Since each label appears once on the cycle, each involved column is changed exactly once. Returning to the starting row would require every column to be changed an even number of times. The contradiction arises because at least one column is changed once. An incorrect proof might merely say that “something changes” without tracking parity of changes.

The final step is the choice of a label not appearing on any edge. If deleting column $s$ caused two rows to coincide, then those rows would differ only in column $s$. This is precisely the definition of an edge labeled $s$. The contradiction uses the graph construction directly; no additional property of the forest is needed.

Alternative Approaches

One can avoid graph terminology and work directly with pairs of rows. Call a column critical if there exists a pair of rows that differ only in that column. It suffices to prove that not all columns are critical.

For each critical column $k$, choose one pair of rows that differs only in column $k$. Distinct critical columns yield distinct chosen pairs, because a pair of rows cannot differ only in two different columns. One then proves that these pairs form an acyclic graph on the set of rows by exactly the same parity argument used above. Since an acyclic graph on $n$ vertices has at most $n-1$ edges, there are at most $n-1$ critical columns. Hence some column is noncritical, and deleting it preserves distinctness.

The graph formulation is preferable because it packages the combinatorial structure into a standard object, making the counting argument immediate once acyclicity is established.