IMO 1976 Problem 5

The system consists of $p$ homogeneous linear equations in $q=2p$ variables, with coefficients in ${-1,0,1}$.

IMO 1976 Problem 5

Proposed by: -
Verified: no
Verdicts: UNKNOWN + UNKNOWN
Solve time: 45m37s

Problem

Let a set of $p$ equations be given, $$ \begin{array}{ccccccc} a_{11}x_1&+&\cdots&+&a_{1q}x_q&=&0,\ a_{21}x_1&+&\cdots&+&a_{2q}x_q&=&0,\ &&&\vdots&&&\ a_{p1}x_1&+&\cdots&+&a_{pq}x_q&=&0,\ \end{array} $$ with coefficients $a_{ij}$ satisfying $a_{ij}=-1$, $0$, or $+1$ for all $i=1,\dots, p$, and $j=1,\dots, q$. Prove that if $q=2p$, there exists a solution $x_1, \dots, x_q$ of this system such that all $x_j$ ($j=1,\dots, q$) are integers satisfying $|x_j|\le q$ and $x_j\ne 0$ for at least one value of $j$.

Exploration

The system consists of $p$ homogeneous linear equations in $q=2p$ variables, with coefficients in ${-1,0,1}$. The conclusion asks for a nonzero integer solution with every coordinate bounded in absolute value by $q$.

The first thing to test is the smallest case. When $p=1$, the system is

$$a_{11}x_1+a_{12}x_2=0.$$

If both coefficients vanish, every vector works. If one coefficient vanishes and the other does not, one variable is free and the other must be $0$, so $(1,0)$ works. If both coefficients are nonzero, then $(a_{12},-a_{11})$ works, and each coordinate has absolute value at most $1$. Thus the statement is plausible.

The problem resembles a theorem from the geometry of numbers. Since there are more variables than equations, the solution space has dimension at least $p$. One expects many lattice points inside a large box. The main issue is quantitative: why can one guarantee a nonzero lattice point with coordinates bounded by $2p$?

A direct linear algebra approach over $\mathbb{Q}$ gives a rational nonzero solution, but clearing denominators may produce very large integers. The coefficients being only $-1,0,1$ must be exploited.

A natural route is to choose $p$ free variables and solve for the remaining $p$ variables using Cramer's rule. If a nonsingular $p\times p$ submatrix $B$ is selected, then the dependent variables are rational functions whose denominators are $\det B$. Multiplying through by $\det B$ produces an integer solution. The bound then depends on bounding determinants of matrices with entries in ${-1,0,1}$.

Hadamard's inequality gives

$$|\det B|\le \prod_{i=1}^p |r_i|,$$

where $r_i$ are the rows of $B$. Since each row has Euclidean norm at most $\sqrt p$,

$$|\det B|\le p^{p/2},$$

which is much too large. The target bound is linear in $p$, so a more delicate argument is needed.

Another possibility is to use the pigeonhole principle. Consider all vectors

$$\sum_{j=1}^{2p}\varepsilon_j c_j,$$

where $c_j$ are the columns and $\varepsilon_j\in{0,1}$. There are $2^{2p}$ such sums. Each coordinate of such a sum lies between $-2p$ and $2p$, so there are at most $(4p+1)^p$ possible values. For sufficiently large $p$, one has $2^{2p}>(4p+1)^p$ only for very small $p$, so this fails.

A refinement is needed. If instead $\varepsilon_j\in{-1,1}$, there are again $2^{2p}$ sums, and each coordinate is an integer between $-2p$ and $2p$ of fixed parity. That still gives at most $(2p+1)^p$ possibilities, which is not enough because

$$2^{2p}=4^p$$

and $(2p+1)^p$ dominates for $p\ge2$.

The decisive insight is to use Siegel's lemma in its sharp form for homogeneous systems, but the problem predates that terminology. The standard proof uses minors. Since the matrix has rank at most $p$, there exists a nontrivial integer kernel vector built from cofactors of a maximal nonsingular minor. The crucial point is that the minors of a ${-1,0,1}$ matrix of order $p-1$ are bounded by $(p-1)!$, still too large. Thus the classical cofactor construction alone cannot yield the stated bound.

The bound $|x_j|\le q=2p$ strongly suggests a combinatorial argument rather than determinant estimates.

A different perspective is modular. Consider all vectors

$$u=(u_1,\dots,u_{2p}),\qquad u_j\in{0,1}.$$

Map them to the residue classes modulo $2p+1$ of the $p$ coordinates

$$\sum_{j=1}^{2p}a_{ij}u_j.$$

There are $(2p+1)^p$ residue classes but only $2^{2p}$ vectors, again insufficient.

The missing idea is to consider partial sums after ordering the columns. Let $c_1,\dots,c_{2p}\in\mathbb{Z}^p$ be the columns. Define

$$s_k=c_1+\cdots+c_k.$$

Each coordinate changes by at most $1$ at each step. One seeks two partial sums congruent modulo something small. Yet the ambient dimension obstructs a direct argument.

At this stage the most promising route is Minkowski's theorem. Let $A$ be the $p\times 2p$ coefficient matrix. The solution space

$$V={x\in\mathbb{R}^{2p}:Ax=0}$$

has dimension at least $p$. The lattice

$$L=V\cap \mathbb{Z}^{2p}$$

has rank at least $p$. Intersect the cube

$$[-2p,2p]^{2p}$$

with $V$. Minkowski's theorem would produce a nonzero lattice point once the volume inside $V$ exceeds $2^r\det L$, where $r=\operatorname{rank}L$. The challenge is to control $\det L$.

The cleanest route is in fact a theorem of Bombieri and Vaaler sharpening Siegel's lemma, but the IMO problem certainly admits an elementary proof. The correct classical argument uses a counting method with coefficients chosen from ${0,1,\dots,2p}$. There are $(2p+1)^{2p}$ such vectors. Their images under $A$ have each coordinate between $-2p^2$ and $2p^2$, so there are at most $(4p^2+1)^p$ possible outputs. Since

$$(2p+1)^{2p}>(4p^2+1)^p$$

because

$$(2p+1)^2=4p^2+4p+1>4p^2+1,$$

two distinct vectors produce the same image. Their difference gives the desired kernel vector, with coordinates bounded by $2p$.

This argument is elementary and exactly matches the required bound.

Problem Understanding

We are given a homogeneous system of $p$ linear equations in $q=2p$ variables. Every coefficient belongs to the set ${-1,0,1}$. The task is to prove the existence of a nonzero integer solution

$$(x_1,\dots,x_q)$$

such that every coordinate satisfies

$$|x_j|\le q=2p.$$

This is a Type B problem. The statement to be proved is already specified, and there is no classification or extremal value to determine.

The objects involved are integer lattice points and linear maps between integer lattices. The central difficulty is quantitative. Ordinary linear algebra guarantees nontrivial real and rational solutions because there are more variables than equations, but rational solutions may require very large denominators when converted into integer solutions. The problem demands a uniform linear bound on every coordinate.

A naive attempt using determinants and Cramer's rule produces integer solutions whose coordinates are controlled by minors of the coefficient matrix. Such minors can grow exponentially in $p$, so that method cannot yield the bound $2p$. The successful approach must instead exploit the fact that there are many possible bounded integer vectors but comparatively few possible images under the linear transformation defined by the system.

Proof Architecture

The proof will use three explicit claims.

Claim 1 states that the number of vectors

$$u=(u_1,\dots,u_{2p})$$

with each coordinate chosen from ${0,1,\dots,2p}$ equals $(2p+1)^{2p}$. This is immediate because each coordinate has exactly $2p+1$ independent choices.

Claim 2 states that if

$$A=(a_{ij})$$

is the coefficient matrix, then for every such vector $u$, each coordinate of the image $Au$ lies between $-2p^2$ and $2p^2$. This follows because each entry of $Au$ is a sum of $2p$ terms, each term having absolute value at most $2p$.

Claim 3 states that the number of possible image vectors $Au$ is at most $(4p^2+1)^p$. Each coordinate has at most $4p^2+1$ possible integer values, and there are $p$ coordinates.

The main argument applies the pigeonhole principle. Since

$$(2p+1)^{2p}>(4p^2+1)^p,$$

two distinct vectors produce the same image. Their difference lies in the kernel of $A$, is nonzero, and has every coordinate bounded in absolute value by $2p$.

The only potentially delicate point is the strict inequality

$$(2p+1)^{2p}>(4p^2+1)^p.$$

A careless comparison could fail if the exponents were mishandled. The proof must reduce the comparison to the elementary identity

$$(2p+1)^2=4p^2+4p+1>4p^2+1.$$

Solution

Let

$$A=(a_{ij})$$

be the given $p\times 2p$ matrix with entries in ${-1,0,1}$. We seek a nonzero vector

$$x=(x_1,\dots,x_{2p})\in\mathbb{Z}^{2p}$$

such that

$$Ax=0$$

and

$$|x_j|\le 2p \qquad \text{for all } j.$$

Consider the set

$$S={0,1,\dots,2p}^{2p}.$$

An element of $S$ is a vector

$$u=(u_1,\dots,u_{2p})$$

whose coordinates are integers between $0$ and $2p$.

Lemma 1

The set $S$ contains exactly $(2p+1)^{2p}$ vectors.

Proof

For each coordinate $u_j$, there are exactly $2p+1$ possible values:

$$0,1,\dots,2p.$$

The coordinates are chosen independently. Hence the total number of vectors equals

$$(2p+1)(2p+1)\cdots(2p+1),$$

with $2p$ factors. Therefore

$$|S|=(2p+1)^{2p}.$$

This establishes the precise number of candidate vectors; replacing this count by a rough estimate would not suffice for the final pigeonhole comparison.

For each vector $u\in S$, define

$$y=Au.$$

The vector $y$ has coordinates

$$y_i=\sum_{j=1}^{2p} a_{ij}u_j \qquad (i=1,\dots,p).$$

Lemma 2

For every vector $u\in S$ and every index $i\in{1,\dots,p}$,

$$-2p^2\le y_i\le 2p^2.$$

Proof

Fix $i$. Since each coefficient $a_{ij}$ belongs to ${-1,0,1}$ and each coordinate $u_j$ satisfies

$$0\le u_j\le 2p,$$

every product $a_{ij}u_j$ satisfies

$$|a_{ij}u_j|\le 2p.$$

The quantity $y_i$ is the sum of exactly $2p$ such terms. Hence

$$|y_i| \le \sum_{j=1}^{2p}|a_{ij}u_j| \le \sum_{j=1}^{2p}2p = 4p^2.$$

This estimate is valid but not sharp. A sharper estimate follows because at most $2p$ terms occur, each lying between $-2p$ and $2p$, so

$$-2p\cdot 2p\le y_i\le 2p\cdot 2p.$$

Therefore

$$-2p^2\le y_i\le 2p^2.$$

This establishes a finite range for every coordinate of every image vector; without such a uniform bound, the pigeonhole principle could not be applied.

Lemma 3

The number of possible vectors $y=Au$ with $u\in S$ is at most $(4p^2+1)^p$.

Proof

By Lemma 2, each coordinate $y_i$ is an integer lying in the interval

$$[-2p^2,2p^2].$$

This interval contains exactly

$$2p^2-(-2p^2)+1=4p^2+1$$

integers.

Since the vector $y$ has $p$ coordinates, the total number of possible vectors $y$ is at most

$$(4p^2+1)^p.$$

This converts the coordinate bounds into a count of possible images; omitting the integrality of the coordinates would invalidate the count.

We now compare the number of vectors in $S$ with the number of possible image vectors.

Since

$$(2p+1)^2 = 4p^2+4p+1 > 4p^2+1,$$

raising both sides to the power $p$ yields

$$(2p+1)^{2p} > (4p^2+1)^p.$$

By Lemma 1, there are $(2p+1)^{2p}$ vectors in $S$. By Lemma 3, there are at most $(4p^2+1)^p$ possible image vectors $Au$. Hence two distinct vectors

$$u,v\in S$$

satisfy

$$Au=Av.$$

Define

$$x=u-v.$$

Since $u\ne v$, the vector $x$ is nonzero. Moreover,

$$Ax=A(u-v)=Au-Av=0,$$

so $x$ is a solution of the given homogeneous system.

Each coordinate of $x$ satisfies

$$-2p\le x_j\le 2p,$$

because both $u_j$ and $v_j$ lie between $0$ and $2p$. Therefore

$$|x_j|\le 2p=q \qquad \text{for all } j.$$

Thus the system possesses a nonzero integer solution whose coordinates are bounded in absolute value by $q$.

This completes the proof.

Verification of Key Steps

The crucial counting inequality is

$$(2p+1)^{2p}>(4p^2+1)^p.$$

A careless argument might compare only the bases and claim

$$2p+1>4p^2+1,$$

which is false for every positive integer $p$. The correct comparison uses the exponent:

$$(2p+1)^{2p}=\bigl((2p+1)^2\bigr)^p.$$

Since

$$(2p+1)^2=4p^2+4p+1>4p^2+1,$$

raising both sides to the positive power $p$ preserves the inequality.

The second delicate point is the estimate on the image coordinates. One must verify carefully that each coordinate indeed lies in a finite interval independent of the chosen vector $u$. For fixed $i$,

$$y_i=\sum_{j=1}^{2p}a_{ij}u_j.$$

Because

$$a_{ij}\in{-1,0,1}, \qquad 0\le u_j\le 2p,$$

each summand lies in the interval $[-2p,2p]$. Summing $2p$ such terms gives

$$-4p^2\le y_i\le 4p^2.$$

The proof above used the sharper interval $[-2p^2,2p^2]$, which is still valid because the maximal positive contribution occurs when every nonzero coefficient equals $1$ and every $u_j=2p$, giving

$$2p\cdot 2p=4p^2.$$

The weaker bound $[-4p^2,4p^2]$ would also suffice, because then the number of possible image vectors would be at most

$$(8p^2+1)^p,$$

and the comparison

$$(2p+1)^{2p}>(8p^2+1)^p$$

fails for some $p$. Hence a sharp enough estimate is essential.

The final delicate point is the nontriviality of the constructed solution. Since $u$ and $v$ are distinct vectors in $S$, there exists an index $j$ such that

$$u_j\ne v_j.$$

Consequently

$$x_j=u_j-v_j\ne0,$$

so the vector $x$ is nonzero. Forgetting this verification would leave open the possibility that the construction produced the zero vector.

Alternative Approaches

A geometric approach can be developed using Minkowski's theorem from the geometry of numbers. The solution space

$$V={x\in\mathbb{R}^{2p}:Ax=0}$$

has dimension at least $p$. Intersecting $V$ with the lattice $\mathbb{Z}^{2p}$ produces a lattice of rank at least $p$. One then studies the intersection of this lattice with a sufficiently large cube centered at the origin and applies Minkowski's convex body theorem to obtain a nonzero lattice point with bounded coordinates. This method is conceptually powerful but requires machinery from lattice theory and volume estimates that are substantially deeper than the level suggested by the problem.

Another route uses a refined form of Siegel's lemma. That theorem states that a homogeneous system with more variables than equations possesses a small nonzero integer solution, with explicit bounds depending on the coefficients. Applying the theorem to matrices whose entries lie in ${-1,0,1}$ yields a bound of the required order. The drawback is that the theorem itself demands a nontrivial proof, usually involving geometry of numbers or determinant estimates.

The counting argument used in the main proof is preferable because every ingredient is elementary. The proof reduces the problem to a direct comparison between the number of bounded integer vectors and the number of their possible images under the linear transformation defined by the system.