1.1 Syntax: Variables, Connectives

We define the formal language of propositional logic by specifying an alphabet of symbols together with precise rules that determine which expressions are well formed formulas, and this definition is inductive in the sense that every formula is constructed from simpler formulas in a finite number of steps, which ensures that the structure of formulas can always be analyzed in a systematic way.

Alphabet

The alphabet consists of propositional variables, logical connectives, and parentheses, and each of these categories of symbols has a clearly defined role in the construction of formulas.

A countable collection of propositional variables is fixed:

p, q, r, p_1, p_2, \dots

Each variable is treated as an atomic symbol that represents a basic statement, but at the level of syntax no meaning is attached to it, so that the language is purely formal and independent of interpretation.

The logical connectives are:

Symbol	Name	Arity
$¬A$	Negation	unary
$A \land B$	Conjunction	binary
$A \lor B$	Disjunction	binary
$A \to B$	Implication	binary
$A \leftrightarrow B$	Biconditional	binary

Each connective has a fixed arity, and this determines how many formulas it combines, so that unary connectives apply to a single formula while binary connectives combine two formulas into a new one, thereby giving rise to increasingly complex expressions.

Parentheses:

( \quad )

are included in the alphabet in order to make the grouping of expressions explicit, and they ensure that every formula has a well defined syntactic structure without ambiguity.

Definition 1.1 (Formulas)

The set of formulas is defined inductively as follows.

Every propositional variable is a formula, and this provides the base case for the construction.

If $A$ is a formula, then:

¬A

is also a formula, so that negation can be applied to any previously constructed formula to produce a new one.

If $A$ and $B$ are formulas, then:

(A \land B), \quad (A \lor B), \quad (A \to B), \quad (A \leftrightarrow B)

are formulas, which allows binary connectives to combine two formulas into a single larger formula.

No other expressions are formulas, so the set of formulas is exactly the smallest set that contains all variables and is closed under these formation rules, and every formula must arise from finitely many applications of these rules.

Example 1.2

The following expressions are formulas:

p

¬p

(p \land q)

(p \lor (¬q))

((p \land q) \to r)

(¬(p \lor q) \leftrightarrow r)

Each of these expressions can be obtained by starting from variables and repeatedly applying the formation rules, and at every step the resulting expression remains a well formed formula.

The following expressions are not formulas:

\land p q

(p \land)

p q \to

((p \lor q)

Each of these fails to satisfy the inductive construction rules, either because a connective is used without the required number of arguments or because the parentheses do not properly match, so that the expression cannot be built using the allowed operations.

Lemma 1.3 (Unique Readability)

Every formula admits a unique decomposition according to the formation rules, which means that there is exactly one way to parse the formula into its constituent parts, and this guarantees that the syntactic structure of a formula is well defined.

More precisely, every formula is exactly one of the following: a variable, or of the form $¬A$ for a unique formula $A$ , or of the form $(A \circ B)$ for unique formulas $A$ and $B$ and a unique connective $\circ$ .

Proof

The statement is proved by induction on the construction of formulas, and the inductive nature of the definition ensures that every formula arises in a controlled and finite manner from simpler formulas.

If the formula is a variable, then it has no internal structure, and therefore its decomposition is immediate and unique.

If the formula is obtained by applying negation, then it must have the form $¬A$ for some formula $A$ , and since the construction step that introduces $¬$ is uniquely determined, the formula $A$ is uniquely determined as well, so that the decomposition is unique.

If the formula is obtained by applying a binary connective, then it must have the form $(A \circ B)$ for some formulas $A$ and $B$ , and since the construction step specifies both the connective and its two arguments, the pair $(A, B)$ is uniquely determined, and hence the decomposition is unique in this case also.

Convention 1.4 (Precedence and Associativity)

Parentheses may be omitted according to a fixed precedence order:

¬ \;>\; \land \;>\; \lor \;>\; \to \;>\; \leftrightarrow

This convention means that negation binds most strongly, followed by conjunction, then disjunction, then implication, and finally biconditional, and this ordering allows formulas to be written more compactly without introducing ambiguity.

For example:

¬p \land q \to r

is interpreted as:

((¬p \land q) \to r)

Binary connectives are taken to associate to the right unless parentheses specify otherwise, so that:

p \to q \to r

is interpreted as:

p \to (q \to r)

This convention ensures that every formula still has a unique parsing even when some parentheses are omitted.

Definition 1.5 (Subformula)

The notion of subformula is defined recursively in a way that reflects the inductive construction of formulas, and it captures all parts that occur within a given formula.

If $A$ is a variable, then its only subformula is $A$ itself.

If $A = ¬B$ , then the subformulas of $A$ consist of the formula $A$ together with all subformulas of $B$ , so that every component of $B$ is also regarded as a component of $A$ .

If $A = (B \circ C)$ , then the subformulas of $A$ consist of the formula $A$ together with all subformulas of $B$ and all subformulas of $C$ , so that the entire structure of $A$ is captured by its recursive decomposition.

Definition 1.6 (Length)

The length of a formula is defined as the number of symbols that occur in it, and this provides a simple numerical measure of the size of the formula that can be used in inductive arguments.

For example:

(p \land q)

has length $5$ when each symbol is counted individually, including parentheses and connectives.

Definition 1.7 (Depth)

The depth of a formula measures the level of nesting of connectives, and it is defined inductively so that more complex formulas have greater depth.

If $A$ is a variable, then:

\mathrm{depth}(A) = 0

If $A = ¬B$ , then:

\mathrm{depth}(A) = \mathrm{depth}(B) + 1

If $A = (B \circ C)$ , then:

\mathrm{depth}(A) = \max(\mathrm{depth}(B), \mathrm{depth}(C)) + 1

This definition reflects the hierarchical structure of formulas and allows one to reason about their complexity.

Lemma 1.8 (Tree Representation)

Every formula can be represented as a finite rooted tree whose leaves correspond to variables and whose internal nodes are labeled by connectives, and this representation makes the recursive structure of formulas explicit.

Proof

The representation is constructed by induction on formulas, and at each step the structure of the formula determines the corresponding tree uniquely.

A variable corresponds to a single node with no children, which serves as a leaf of the tree.

If $A = ¬B$ , then the tree corresponding to $A$ is obtained by introducing a new root node labeled $¬$ whose single child is the tree corresponding to $B$ , so that the structure of negation is reflected directly in the tree.

If $A = (B \circ C)$ , then the tree corresponding to $A$ is obtained by introducing a new root node labeled $\circ$ whose two children are the trees corresponding to $B$ and $C$ , and this construction preserves the hierarchical structure of the formula.

Lemma 1.9 (Structural Induction)

Let $P(A)$ be a property of formulas, and suppose that $P$ satisfies the following conditions: it holds for every variable, it is preserved under negation, and it is preserved under each binary connective, then $P(A)$ holds for all formulas $A$ .

Proof

The proof follows from the inductive definition of formulas, since every formula is constructed from variables by finitely many applications of the formation rules, and at each step the property $P$ is preserved by assumption, so that by induction the property holds for all formulas.