# 14.1 Arithmetization of Syntax

14.1 Arithmetization of Syntax

Gödel’s key idea is that syntax can be represented inside arithmetic. Symbols, formulas, and proofs are encoded as natural numbers. Once this is done, statements about formulas become statements about numbers, and a formal system can reason about its own expressions.

Coding Symbols

Start with a finite alphabet of symbols:

$$
{0, S, +, \times, =, (, ), \forall, \exists, \neg, \land, \lor, \to, \dots}
$$

Assign each symbol a unique natural number. For example:

Symbol	Code
$0$	$1$
$S$	$2$
$+$	$3$
$\times$	$4$
$=$	$5$
$(,$	$6$
$,)$	$7$
$\forall$	$8$
$\neg$	$9$

The exact assignment does not matter. What matters is that it is fixed and unambiguous.

Coding Sequences

A formula is a finite sequence of symbols. To encode a sequence, we map it to a single number.

A standard method uses prime factorization. Let:

$$
p_1, p_2, p_3, \dots
$$

be the sequence of prime numbers.

If a sequence of symbols has codes:

$$
a_1, a_2, a_3, \dots, a_n
$$

then its Gödel number is:

$$
\mathrm{GN}(a_1,\dots,a_n)

p_1^{a_1} \cdot p_2^{a_2} \cdot p_3^{a_3} \cdots p_n^{a_n}
$$

Each position in the sequence is tied to a distinct prime. The exponent records the symbol at that position.

This encoding is injective. Different sequences produce different numbers because of the uniqueness of prime factorization.

Decoding

Given a number, we can recover the sequence by factoring it into primes:

$$
N = p_1^{a_1} p_2^{a_2} \cdots p_n^{a_n}
$$

The exponents $a_i$ give the original symbol codes.

So the encoding is reversible. Syntax can move back and forth between symbolic form and numeric form.

Encoding Formulas and Proofs

Using this method:

* every term has a code
* every formula has a code
* every finite sequence of formulas has a code

A proof is a finite sequence of formulas. Therefore, every proof also has a Gödel number.

We write:

$$
\ulcorner A \urcorner
$$

for the Gödel number of a formula $A$.

Representing Syntax in Arithmetic

Once formulas and proofs are encoded as numbers, we define relations on numbers that correspond to syntactic properties.

Examples:

* $\mathrm{Formula}(x)$: “$x$ is the code of a well-formed formula”
* $\mathrm{Proof}(x,y)$: “$x$ is the code of a proof of the formula with code $y$”
* $\mathrm{Provable}(y)$: “there exists $x$ such that $\mathrm{Proof}(x,y)$”

These are arithmetic relations. They talk only about natural numbers.

A central fact is that such relations are definable in arithmetic. That means there are formulas in the language of arithmetic that express them.

So the system can internally state things like:

$$
\exists x, \mathrm{Proof}(x, \ulcorner A \urcorner)
$$

which means “$A$ is provable”.

Why This Matters

Arithmetization turns syntax into arithmetic data. This has two crucial consequences.

First, meta-mathematical statements become internal statements. Instead of talking externally about proofs, the system can express “this formula is provable” using arithmetic.

Second, self-reference becomes possible. Since formulas have numeric codes, a formula can refer to its own code.

This prepares the ground for Gödel’s construction. In later sections, we will build a sentence $G$ such that:

$$
G \leftrightarrow \neg \mathrm{Provable}(\ulcorner G \urcorner)
$$

This sentence asserts its own unprovability.

The entire mechanism depends on the fact that syntax can be encoded faithfully inside arithmetic. Without arithmetization, Gödel’s theorems cannot be formulated precisely.