8.2 Equivalent Formulations

The axiom of choice has many equivalent forms. Some of these forms speak about selecting elements from sets, some speak about ordering arbitrary sets, and some speak about finding maximal objects in partially ordered sets. Although these statements look different, they express the same underlying principle over the usual axioms of ZF set theory.

The purpose of this section is to explain the most important equivalent formulations in detail. The main examples are the product form of choice, the statement that every surjection has a right inverse, the well ordering theorem, Zorn’s lemma, and the Hausdorff maximal principle.

The Product Form of Choice

The axiom of choice says that every family of nonempty sets has a choice function. This can be restated in terms of products.

Let $(A_i)_{i \in I}$ be an indexed family of sets. The product of this family is:

\prod_{i \in I} A_i = \{f : I \to \bigcup_{i \in I} A_i : f(i) \in A_i \text{ for every } i \in I\}

An element of this product is exactly a choice function. It chooses one element $f(i)$ from each set $A_i$ .

Proposition 8.21 (Product Form)

The axiom of choice is equivalent to the statement that whenever $(A_i)_{i \in I}$ is a family of nonempty sets, the product:

\prod_{i \in I} A_i

is nonempty.

Proof

Assume the axiom of choice. Let $(A_i)_{i \in I}$ be a family of nonempty sets. By the axiom of choice, there exists a function $f$ such that:

f(i) \in A_i

for every $i \in I$ . By the definition of the product, this function $f$ is an element of:

\prod_{i \in I} A_i

Therefore the product is nonempty.

Conversely, assume that every product of nonempty sets is nonempty. Let $(A_i)_{i \in I}$ be any family of nonempty sets. By assumption:

\prod_{i \in I} A_i \ne \varnothing

so there exists:

f \in \prod_{i \in I} A_i

By the definition of product, this means:

f(i) \in A_i

for every $i \in I$ . Thus $f$ is a choice function for the family, and the axiom of choice follows.

This formulation is often the most compact way to state choice. It says that an arbitrary product of nonempty sets is nonempty.

Right Inverses of Surjections

Another useful form of choice is stated in terms of surjective functions. A surjective function sends at least one element of its domain to every element of its codomain. A right inverse chooses one preimage for each point in the codomain.

Definition 8.22 (Right Inverse)

Let:

f : X \to Y

be a function. A right inverse of $f$ is a function:

g : Y \to X

such that:

f(g(y)) = y

for every $y \in Y$ .

Equivalently:

f \circ g = \mathrm{id}_Y

The function $g$ chooses, for each $y \in Y$ , one element $x \in X$ such that $f(x)=y$ .

Proposition 8.23

The axiom of choice is equivalent to the statement that every surjective function has a right inverse.

Proof

Assume the axiom of choice. Let:

f : X \to Y

be a surjective function.

For each $y \in Y$ , define the fiber over $y$ :

X_y = \{x \in X : f(x)=y\}

Since $f$ is surjective, each fiber is nonempty:

X_y \ne \varnothing

for every $y \in Y$ .

The family:

(X_y)_{y \in Y}

is therefore a family of nonempty sets. By the axiom of choice, there exists a choice function:

g : Y \to \bigcup_{y \in Y} X_y

such that:

g(y) \in X_y

for every $y \in Y$ .

Since $g(y) \in X_y$ , the definition of $X_y$ gives:

f(g(y))=y

for every $y \in Y$ . Hence:

f \circ g = \mathrm{id}_Y

and $g$ is a right inverse of $f$ .

Conversely, assume every surjective function has a right inverse. Let:

(A_i)_{i \in I}

be a family of nonempty sets. We must construct a choice function.

Define:

X = \{(i,a) : i \in I \text{ and } a \in A_i\}

Define:

\pi : X \to I

by:

\pi(i,a)=i

We show that $\pi$ is surjective. Let $i \in I$ . Since $A_i$ is nonempty, choose an element:

a \in A_i

Then:

(i,a) \in X

and:

\pi(i,a)=i

Therefore every $i \in I$ lies in the image of $\pi$ , so $\pi$ is surjective.

By assumption, $\pi$ has a right inverse:

s : I \to X

such that:

\pi(s(i))=i

for every $i \in I$ .

Since $s(i) \in X$ , there exist $j \in I$ and $a \in A_j$ such that:

s(i)=(j,a)

But:

\pi(s(i))=i

and:

\pi(j,a)=j

Thus:

j=i

Therefore, for each $i \in I$ , the element $s(i)$ has the form:

s(i)=(i,a_i)

for some:

a_i \in A_i

Define:

c(i)=a_i

Then:

c(i) \in A_i

for every $i \in I$ , so $c$ is a choice function. This proves the axiom of choice.

Well Ordered Sets

The well ordering theorem is another equivalent form of choice. It says that every set can be ordered so strongly that every nonempty subset has a first element.

Definition 8.24 (Total Order)

A total order on a set $X$ is a relation $\le$ such that, for all $x,y,z \in X$ :

$x \le x$ .
If $x \le y$ and $y \le x$ , then $x=y$ .
If $x \le y$ and $y \le z$ , then $x \le z$ .
Either $x \le y$ or $y \le x$ .

The first three conditions say that $\le$ is a partial order. The fourth condition says that any two elements are comparable.

Definition 8.25 (Well Order)

A well order on a set $X$ is a total order $\le$ such that every nonempty subset of $X$ has a least element.

That means that if:

Y \subseteq X

and:

Y \ne \varnothing

then there exists $y_0 \in Y$ such that:

y_0 \le y

for every $y \in Y$ .

The usual order on $\mathbb{N}$ is a well order. Every nonempty subset of $\mathbb{N}$ has a least element.

The usual order on $\mathbb{Z}$ is not a well order, because $\mathbb{Z}$ itself has no least element.

The usual order on $\mathbb{R}$ is not a well order, because the interval:

(0,1)

has no least element.

Theorem 8.26 (Well Ordering Theorem)

Every set can be well ordered.

This means that for every set $X$ , there exists some relation $\le$ on $X$ such that $(X,\le)$ is a well ordered set.

The theorem does not say that the well order is natural. It only says that some well order exists.

Proposition 8.27

The axiom of choice implies the well ordering theorem.

Proof

Assume the axiom of choice. Let $X$ be a set. If:

X=\varnothing

then the empty relation is a well order on $X$ , so suppose that $X$ is nonempty.

By the axiom of choice, there exists a choice function on the family of all nonempty subsets of $X$ . Thus there is a function:

c : \mathcal{P}(X) \setminus \{\varnothing\} \to X

such that:

c(A) \in A

for every nonempty subset $A \subseteq X$ .

We now build an enumeration of the elements of $X$ by transfinite recursion. At each stage, if there are elements not yet chosen, we choose one of them.

Suppose elements $x_\beta$ have already been chosen for all $\beta < \alpha$ . Let:

R_\alpha = X \setminus \{x_\beta : \beta < \alpha\}

If:

R_\alpha \ne \varnothing

define:

x_\alpha = c(R_\alpha)

Thus $x_\alpha$ is chosen from the remaining elements of $X$ .

This construction cannot continue through all ordinals while producing distinct elements of $X$ , because $X$ is a set and the class of all ordinals is not a set. Hence there is some ordinal $\gamma$ such that:

R_\gamma = \varnothing

Then:

X = \{x_\alpha : \alpha < \gamma\}

Define an order on $X$ by:

x_\alpha < x_\beta

if and only if:

\alpha < \beta

This order is total because the ordinals below $\gamma$ are totally ordered.

It is a well order because every nonempty subset $Y \subseteq X$ corresponds to a nonempty set of ordinals:

S_Y = \{\alpha < \gamma : x_\alpha \in Y\}

Every nonempty set of ordinals has a least element. Let:

\alpha_0 = \min S_Y

Then $x_{\alpha_0}$ is the least element of $Y$ under the order just defined.

Therefore $X$ can be well ordered.

Proposition 8.28

The well ordering theorem implies the axiom of choice.

Proof

Assume every set can be well ordered. Let:

(A_i)_{i \in I}

be a family of nonempty sets. We must construct a choice function.

Let:

U = \bigcup_{i \in I} A_i

By the well ordering theorem, there exists a well order $\le$ on $U$ .

For each $i \in I$ , the set $A_i$ is a nonempty subset of $U$ . Since $\le$ is a well order, $A_i$ has a least element.

Define:

f(i)=\text{the least element of } A_i \text{ with respect to } \le

Then:

f(i) \in A_i

for every $i \in I$ .

Thus $f$ is a choice function for the family, so the axiom of choice holds.

Corollary 8.29

The axiom of choice is equivalent to the well ordering theorem.

Proof

Proposition 8.27 proves that the axiom of choice implies the well ordering theorem. Proposition 8.28 proves that the well ordering theorem implies the axiom of choice. Therefore the two statements are equivalent.

Partially Ordered Sets

Zorn’s lemma is stated using partially ordered sets. Before stating it, we recall the relevant definitions.

Definition 8.30 (Partial Order)

A partial order on a set $P$ is a relation $\le$ satisfying reflexivity, antisymmetry, and transitivity.

Thus, for all $x,y,z \in P$ :

$x \le x$ .
If $x \le y$ and $y \le x$ , then $x=y$ .
If $x \le y$ and $y \le z$ , then $x \le z$ .

A set equipped with a partial order is called a partially ordered set, or poset.

Unlike a total order, a partial order does not require every two elements to be comparable.

Definition 8.31 (Chain)

Let $(P,\le)$ be a partially ordered set. A chain in $P$ is a subset $C \subseteq P$ such that for all $x,y \in C$ , either:

x \le y

or:

y \le x

Thus a chain is a subset on which the partial order becomes a total order.

Definition 8.32 (Upper Bound)

Let $(P,\le)$ be a partially ordered set, and let $C \subseteq P$ . An upper bound for $C$ is an element $u \in P$ such that:

x \le u

for every $x \in C$ .

The upper bound does not need to belong to $C$ . It only needs to lie above every element of $C$ .

Definition 8.33 (Maximal Element)

Let $(P,\le)$ be a partially ordered set. An element $m \in P$ is maximal if:

m \le x

implies:

x=m

for every $x \in P$ .

Equivalently, there is no element $x \in P$ such that:

m < x

A maximal element is not necessarily above every element of $P$ . It simply cannot be extended upward inside the order.

Lemma 8.34 (Greatest Elements and Maximal Elements)

Every greatest element is maximal, but a maximal element need not be greatest.

Proof

Suppose $g$ is a greatest element of $P$ . Then:

x \le g

for every $x \in P$ .

If:

g \le y

then since $g$ is greatest, we also have:

y \le g

By antisymmetry:

y=g

Therefore $g$ is maximal.

For the converse, consider the set:

P=\{a,b\}

with the partial order given only by:

a \le a

and:

b \le b

Then both $a$ and $b$ are maximal, because neither has a strictly larger element above it. However, there is no greatest element, since $a$ and $b$ are incomparable.

Zorn’s Lemma

Zorn’s lemma is one of the most useful equivalent forms of the axiom of choice. It is used to prove the existence of maximal objects.

Theorem 8.35 (Zorn’s Lemma)

Let $(P,\le)$ be a nonempty partially ordered set. Suppose that every chain in $P$ has an upper bound in $P$ . Then $P$ has a maximal element.

The hypothesis says that every totally ordered subcollection can be extended upward to some element of the poset. The conclusion says that there exists an object that cannot be enlarged further.

Proposition 8.36

The well ordering theorem implies Zorn’s lemma.

Proof

Assume the well ordering theorem. Let $(P,\le)$ be a nonempty partially ordered set in which every chain has an upper bound.

By the well ordering theorem, choose a well order $\preceq$ of the underlying set $P$ . This well order is not required to agree with the partial order $\le$ . It is used only to make canonical choices.

We construct a chain $C \subseteq P$ by adding elements whenever possible.

Given a chain $C$ , say that an element $x \in P$ is an extension of $C$ if:

C \cup \{x\}

is still a chain and $x$ is not already in $C$ .

If extensions exist, choose the $\preceq$ least extension and add it. If no extensions exist, stop.

More formally, one performs this construction by transfinite recursion, using the well order $\preceq$ to choose the least possible next element whenever one exists.

The construction must eventually stop, because it cannot add distinct elements of the set $P$ through all ordinals. Let $C$ be the chain obtained when no further extension is possible.

We claim that $C$ has an upper bound $u \in P$ . This follows from the hypothesis, since every chain in $P$ has an upper bound.

We now show that $u$ is maximal. Suppose:

u < v

for some $v \in P$ .

Since $u$ is an upper bound for $C$ , every $x \in C$ satisfies:

x \le u

Together with:

u < v

this gives:

x \le v

for every $x \in C$ .

Thus:

C \cup \{v\}

is a chain, and $v$ is an extension of $C$ unless $v \in C$ .

But if $v \in C$ , then because $u$ is an upper bound for $C$ , we have:

v \le u

which contradicts:

u < v

Therefore $v$ is a genuine extension of $C$ , contradicting the fact that the construction stopped.

Hence no such $v$ exists, and $u$ is maximal.

Proposition 8.37

Zorn’s lemma implies the axiom of choice.

Proof

Assume Zorn’s lemma. Let:

(A_i)_{i \in I}

be a family of nonempty sets. We must construct a choice function.

Let $P$ be the set of all partial choice functions. An element of $P$ is a function $f$ such that:

$\operatorname{dom}(f) \subseteq I$ .
$f(i) \in A_i$ for every $i \in \operatorname{dom}(f)$ .

Order $P$ by extension:

f \le g

if and only if:

\operatorname{dom}(f) \subseteq \operatorname{dom}(g)

and:

g(i)=f(i)

for every $i \in \operatorname{dom}(f)$ .

This means that $g$ extends $f$ .

The set $P$ is nonempty, since the empty function belongs to $P$ .

Let $C$ be a chain in $P$ . Define:

h = \bigcup C

We show that $h$ is a partial choice function. Since $C$ is linearly ordered by extension, any two functions in $C$ agree on their common domain. Therefore the union $h$ is a function.

If $i \in \operatorname{dom}(h)$ , then $i \in \operatorname{dom}(f)$ for some $f \in C$ . Since $f$ is a partial choice function:

h(i)=f(i) \in A_i

Thus $h \in P$ .

Also, every $f \in C$ is extended by $h$ , so $h$ is an upper bound for $C$ in $P$ .

By Zorn’s lemma, $P$ has a maximal element $m$ .

We claim that:

\operatorname{dom}(m)=I

Suppose not. Then there exists:

i_0 \in I \setminus \operatorname{dom}(m)

Since $A_{i_0}$ is nonempty, choose:

a_0 \in A_{i_0}

Define:

m' = m \cup \{(i_0,a_0)\}

Then $m'$ is a partial choice function extending $m$ , and it is strictly larger than $m$ . This contradicts the maximality of $m$ .

Therefore:

\operatorname{dom}(m)=I

Hence $m$ is a choice function for the whole family. This proves the axiom of choice.

Corollary 8.38

The axiom of choice, the well ordering theorem, and Zorn’s lemma are equivalent.

Proof

The axiom of choice implies the well ordering theorem by Proposition 8.27. The well ordering theorem implies Zorn’s lemma by Proposition 8.36. Zorn’s lemma implies the axiom of choice by Proposition 8.37. Therefore all three statements are equivalent.

Hausdorff Maximal Principle

The Hausdorff maximal principle is another maximal principle equivalent to the axiom of choice. It says that every partially ordered set contains a maximal chain.

Theorem 8.39 (Hausdorff Maximal Principle)

Every partially ordered set has a maximal chain.

A maximal chain is a chain that cannot be enlarged by adding more elements while remaining a chain.

Proposition 8.40

Zorn’s lemma implies the Hausdorff maximal principle.

Proof

Let $(P,\le)$ be a partially ordered set. Let $\mathcal{C}$ be the set of all chains in $P$ , ordered by inclusion.

Thus:

C \le D

if and only if:

C \subseteq D

We apply Zorn’s lemma to $\mathcal{C}$ .

Let $\mathcal{D}$ be a chain in $\mathcal{C}$ . This means that $\mathcal{D}$ is a collection of chains in $P$ , and any two members of $\mathcal{D}$ are comparable by inclusion.

Define:

U = \bigcup_{C \in \mathcal{D}} C

We show that $U$ is a chain in $P$ .

Let $x,y \in U$ . Then there exist chains $C,D \in \mathcal{D}$ such that:

x \in C

and:

y \in D

Since $\mathcal{D}$ is a chain under inclusion, either:

C \subseteq D

or:

D \subseteq C

If $C \subseteq D$ , then both $x$ and $y$ are in $D$ , and since $D$ is a chain, $x$ and $y$ are comparable.

If $D \subseteq C$ , the same argument applies using $C$ .

Thus $U$ is a chain in $P$ .

Also:

C \subseteq U

for every $C \in \mathcal{D}$ , so $U$ is an upper bound for $\mathcal{D}$ in $\mathcal{C}$ .

By Zorn’s lemma, $\mathcal{C}$ has a maximal element. Such an element is exactly a maximal chain in $P$ .

Proposition 8.41

The Hausdorff maximal principle implies Zorn’s lemma.

Proof

Assume the Hausdorff maximal principle. Let $(P,\le)$ be a nonempty partially ordered set in which every chain has an upper bound.

By the Hausdorff maximal principle, there exists a maximal chain:

C \subseteq P

By hypothesis, this chain has an upper bound:

u \in P

Thus:

x \le u

for every $x \in C$ .

We claim that $u$ is maximal in $P$ .

Suppose not. Then there exists $v \in P$ such that:

u < v

For every $x \in C$ , we have:

x \le u < v

Hence:

x \le v

for every $x \in C$ .

Therefore:

C \cup \{v\}

is a chain.

Moreover, $v \notin C$ . If $v \in C$ , then since $u$ is an upper bound of $C$ , we would have:

v \le u

which contradicts:

u < v

Thus $C \cup \{v\}$ is a strictly larger chain than $C$ , contradicting the maximality of $C$ .

Therefore no such $v$ exists, and $u$ is maximal. Hence Zorn’s lemma holds.

Corollary 8.42

Zorn’s lemma and the Hausdorff maximal principle are equivalent.

Proof

Proposition 8.40 proves that Zorn’s lemma implies the Hausdorff maximal principle. Proposition 8.41 proves that the Hausdorff maximal principle implies Zorn’s lemma. Therefore the two principles are equivalent.

Maximal Objects in Algebra

Zorn’s lemma is widely used because many mathematical objects are naturally ordered by inclusion.

For example, let $V$ be a vector space, and consider the set of all linearly independent subsets of $V$ , ordered by inclusion. A maximal element in this poset is a linearly independent set that cannot be enlarged while preserving linear independence. Such a set is a basis.

The use of Zorn’s lemma follows a common pattern:

Define a poset of partial objects.
Show that every chain has an upper bound.
Apply Zorn’s lemma to obtain a maximal object.
Prove that the maximal object has the desired stronger property.

This pattern appears throughout algebra, topology, and analysis.

Proposition 8.43 (Every Vector Space Has a Basis)

Assume Zorn’s lemma. Every vector space has a basis.

Proof

Let $V$ be a vector space over a field $F$ . Let $P$ be the set of all linearly independent subsets of $V$ , ordered by inclusion.

The set $P$ is nonempty because:

\varnothing \in P

Let $C$ be a chain in $P$ . Define:

U = \bigcup_{S \in C} S

We show that $U$ is linearly independent.

Let:

u_1,\dots,u_n \in U

and suppose:

a_1u_1+\cdots+a_nu_n=0

For each $k$ , since $u_k \in U$ , there exists $S_k \in C$ such that:

u_k \in S_k

Because $C$ is a chain under inclusion, among the finitely many sets:

S_1,\dots,S_n

there is one that contains all the others. Call it $S_m$ .

Then:

u_1,\dots,u_n \in S_m

Since $S_m$ is linearly independent, we have:

a_1=\cdots=a_n=0

Therefore $U$ is linearly independent.

Thus $U \in P$ , and $U$ is an upper bound for the chain $C$ .

By Zorn’s lemma, $P$ has a maximal element $B$ . Then $B$ is a maximal linearly independent subset of $V$ .

We show that $B$ spans $V$ . Suppose not. Then there exists:

v \in V

such that:

v \notin \operatorname{span}(B)

Then:

B \cup \{v\}

is linearly independent. Indeed, if:

a v + b_1b_1' + \cdots + b_nb_n' = 0

is a finite linear relation with $b_1',\dots,b_n' \in B$ , then if $a \ne 0$ , we could solve for $v$ as a linear combination of elements of $B$ , contradicting $v \notin \operatorname{span}(B)$ . Hence $a=0$ , and then all remaining coefficients are zero because $B$ is linearly independent.

Thus $B \cup \{v\}$ is a larger linearly independent set, contradicting the maximality of $B$ .

Therefore:

\operatorname{span}(B)=V

So $B$ is both linearly independent and spanning, hence $B$ is a basis of $V$ .

Summary of Equivalent Forms

Over ZF, the following principles are equivalent:

The axiom of choice.
Every product of nonempty sets is nonempty.
Every surjective function has a right inverse.
Every set can be well ordered.
Zorn’s lemma.
The Hausdorff maximal principle.

Each form is useful in a different setting. The product form is closest to the original statement of choice. The right inverse form is useful in category theoretic and function theoretic arguments. The well ordering theorem is useful for transfinite recursion and comparison of sizes. Zorn’s lemma and the Hausdorff maximal principle are useful for proving existence of maximal objects.