Chapter 123. Coding Theory

Coding theory studies how to represent information so that errors can be detected or corrected.

A message may be transmitted through a noisy channel, stored on an unreliable medium, or processed by a system where bits may be lost or changed. Coding theory adds controlled redundancy to the message so that the original information can still be recovered.

Linear algebra enters coding theory through vector spaces over finite fields. A linear code is a subspace. A generator matrix encodes messages. A parity-check matrix detects errors. Syndrome decoding turns error correction into a problem about cosets and linear equations.

123.1 Messages and Codewords

A message is the original information to be protected.

A codeword is the encoded form of that message.

For example, a binary message may be

m = \begin{bmatrix} 1 & 0 & 1 \end{bmatrix}.

A code may encode it as a longer vector

c = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 1 \end{bmatrix}.

The extra coordinates are redundant. They are added so that errors can be detected or corrected.

The general pattern is:

\text{message} \longmapsto \text{codeword}.

If the codeword is changed during transmission, the receiver tries to infer the original codeword.

123.2 Alphabets and Finite Fields

Coding theory usually works over a finite alphabet.

The most common case is the binary alphabet

\{0,1\}.

Binary arithmetic is performed in the finite field

\mathbb{F}_2.

In \mathbb{F}_2,

1 + 1 = 0.

This means addition is the same as exclusive-or.

More generally, codes may be defined over a finite field

\mathbb{F}_q,

where q is a prime power.

A word of length n over \mathbb{F}_q is a vector in

\mathbb{F}_q^n.

Thus coding theory studies selected subsets or subspaces of finite-dimensional vector spaces over finite fields.

123.3 Block Codes

A block code of length n over \mathbb{F}_q is a subset

C \subseteq \mathbb{F}_q^n.

The elements of C are codewords.

If the code contains M codewords, then it can represent M possible messages.

A code is often described by parameters such as

(n, M, d).

Here:

| Symbol | Meaning |
| --- | --- |
| n | Codeword length |
| M | Number of codewords |
| d | Minimum distance |

When the code is linear, the parameters are usually written as

[n, k, d]_q.

Here k is the dimension of the code as a vector space over \mathbb{F}_q.

123.4 Hamming Distance

The Hamming distance between two words is the number of positions in which they differ.

For

x, y \in \mathbb{F}_q^n,

the Hamming distance is

d(x, y) = |\{\, i : x_i \neq y_i \,\}|.

For example, over \mathbb{F}_2,

x = 101101, \qquad y = 100111.

They differ in positions 3 and 5, so

d(x, y) = 2.

Hamming distance measures how many symbol errors are needed to change one word into another.

123.5 Hamming Weight

The Hamming weight of a word is the number of nonzero coordinates.

It is written as

\operatorname{wt}(x) = |\{\, i : x_i \neq 0 \,\}|.

For binary vectors, this is the number of ones.

For example,

x = 101101

has weight

\operatorname{wt}(x) = 4.

Over a field,

d(x, y) = \operatorname{wt}(x - y).

Thus distance is the weight of a difference vector.
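These definitions are easy to check numerically. The following is a minimal Python sketch (representing words as lists of bits) that computes both quantities for the example above:

```python
def hamming_distance(x, y):
    """Number of positions in which x and y differ."""
    assert len(x) == len(y)
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

def hamming_weight(x):
    """Number of nonzero coordinates."""
    return sum(1 for xi in x if xi != 0)

x = [1, 0, 1, 1, 0, 1]
y = [1, 0, 0, 1, 1, 1]

print(hamming_distance(x, y))   # 2
print(hamming_weight(x))        # 4

# Over F_2, subtraction is bitwise, so d(x, y) = wt(x - y):
diff = [(xi - yi) % 2 for xi, yi in zip(x, y)]
print(hamming_weight(diff) == hamming_distance(x, y))  # True
```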

123.6 Minimum Distance

The minimum distance of a code C is

d_{\min} = \min\{\, d(c, c') : c, c' \in C,\; c \neq c' \,\}.

It is the smallest distance between distinct codewords.

Minimum distance controls error detection and correction.

If codewords are far apart, then a received word with a small number of errors is still closer to the correct codeword than to any other codeword.

A code with minimum distance d can detect up to

d - 1

errors.

It can correct up to

\left\lfloor \frac{d-1}{2} \right\rfloor

errors by nearest-codeword decoding.

123.7 Linear Codes

A code C \subseteq \mathbb{F}_q^n is linear if it is a subspace of \mathbb{F}_q^n.

This means:

c_1, c_2 \in C \quad\Rightarrow\quad c_1 + c_2 \in C,

and

a \in \mathbb{F}_q,\; c \in C \quad\Rightarrow\quad ac \in C.

If C has dimension k, then

|C| = q^k.

A linear code of length n, dimension k, and minimum distance d is written as

[n, k, d]_q.

Linear codes are useful because their structure can be described by matrices. This makes encoding, checking, and decoding computationally efficient.

123.8 Minimum Distance of a Linear Code

For a linear code, the minimum distance is equal to the minimum nonzero weight:

d_{\min} = \min\{\, \operatorname{wt}(c) : c \in C,\; c \neq 0 \,\}.

The proof is short.

Since C is linear, if c, c' \in C, then

c - c' \in C.

Therefore

d(c, c') = \operatorname{wt}(c - c')

is the weight of a nonzero codeword.

Thus the closest pair of distinct codewords is determined by the lightest nonzero codeword.

This is a major simplification.
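The simplification can be verified by brute force on a small example. The sketch below uses a four-word binary linear code chosen purely for illustration (the span of 1011 and 0110 over \mathbb{F}_2):

```python
from itertools import combinations

# A small binary linear code, chosen for illustration:
# the span of 1011 and 0110 over F_2.
C = [
    (0, 0, 0, 0),
    (1, 0, 1, 1),
    (0, 1, 1, 0),
    (1, 1, 0, 1),
]

def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

def wt(x):
    return sum(a != 0 for a in x)

# Minimum distance over all pairs of distinct codewords...
d_min = min(dist(c1, c2) for c1, c2 in combinations(C, 2))

# ...equals the minimum weight of a nonzero codeword.
w_min = min(wt(c) for c in C if any(c))

print(d_min, w_min)  # 2 2
```

Checking all pairs takes |C|^2 comparisons; checking weights takes only |C|, which is the practical content of the theorem.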

123.9 Generator Matrices

A generator matrix for a linear [n, k]_q code is a k \times n matrix G whose rows form a basis of the code.

A message vector

m \in \mathbb{F}_q^k

is encoded by

c = mG.

The resulting vector c \in \mathbb{F}_q^n is a codeword.

The set of all codewords is

C = \{\, mG : m \in \mathbb{F}_q^k \,\}.

Thus the code is the row space of G.

A generator matrix is not unique. Any row-equivalent matrix with the same row space generates the same code.

123.10 Example of Encoding

Let

G = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}

over \mathbb{F}_2.

Let the message be

m = \begin{bmatrix} 1 & 1 \end{bmatrix}.

Then

c = mG = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}.

Compute over \mathbb{F}_2:

c = \begin{bmatrix} 1 & 1 & 0 & 1 \end{bmatrix}.

The third coordinate is 1 + 1 = 0 in \mathbb{F}_2.
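The computation above can be reproduced in a few lines of Python, with all arithmetic taken mod 2 (a minimal sketch, not a production encoder):

```python
# Generator matrix of the example code over F_2.
G = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
]

def encode(m, G):
    """Encode a message row vector: c = mG over F_2."""
    n = len(G[0])
    return [sum(m[i] * G[i][j] for i in range(len(G))) % 2
            for j in range(n)]

print(encode([1, 1], G))  # [1, 1, 0, 1]
```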

123.11 Systematic Form

A generator matrix is in systematic form if it has the shape

G = \begin{bmatrix} I_k & P \end{bmatrix}.

Here I_k is the k \times k identity matrix.

If

m = \begin{bmatrix} m_1 & \cdots & m_k \end{bmatrix},

then

mG = \begin{bmatrix} m & mP \end{bmatrix}.

The original message appears directly in the first k coordinates. The remaining n - k coordinates are parity symbols.

Systematic form is convenient because encoding is transparent.

The codeword contains the message plus redundancy.

123.12 Parity-Check Matrices

A parity-check matrix for a linear code C is a matrix H such that

C = \{\, x \in \mathbb{F}_q^n : Hx^T = 0 \,\}.

Equivalently, a word is a codeword exactly when it satisfies all parity-check equations.

If C has dimension k, then H has n - k independent rows.

The rows of H define linear constraints on codewords; they form a basis for the parity-check equations the code satisfies.

123.13 Generator and Parity-Check Matrices

If a generator matrix is in systematic form

G = \begin{bmatrix} I_k & P \end{bmatrix},

then a parity-check matrix is

H = \begin{bmatrix} -P^T & I_{n-k} \end{bmatrix}.

Over \mathbb{F}_2, the minus sign disappears because

-1 = 1.

The relation between G and H is

GH^T = 0.

This means every row of G is orthogonal to every row of H.

Equivalently, every generated codeword satisfies all parity checks.

123.14 Example of a Parity-Check Matrix

Let

G = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix} = \begin{bmatrix} I_2 & P \end{bmatrix},

where

P = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}.

Over \mathbb{F}_2,

H = \begin{bmatrix} P^T & I_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}.

Check that

GH^T = 0.

Thus H defines the same code by parity constraints.

A vector x \in \mathbb{F}_2^4 is a codeword if and only if

Hx^T = 0.
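A short Python check confirms both claims for this example: every row of G satisfies every parity check (so GH^T = 0), and membership in the code is decided by computing Hx^T:

```python
G = [[1, 0, 1, 1],
     [0, 1, 1, 0]]
H = [[1, 1, 1, 0],
     [1, 0, 0, 1]]

def mat_vec_mod2(H, x):
    """Compute Hx^T over F_2."""
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

# GH^T = 0: each row of G passes all parity checks.
for row in G:
    assert mat_vec_mod2(H, row) == [0, 0]

# Membership test: x is a codeword iff Hx^T = 0.
print(mat_vec_mod2(H, [1, 1, 0, 1]))  # [0, 0] -> codeword
print(mat_vec_mod2(H, [1, 1, 1, 1]))  # nonzero -> not a codeword
```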

123.15 Error Detection

Suppose a codeword c is transmitted.

During transmission, an error vector e is added:

r = c + e.

The receiver observes r.

If r is not a codeword, then an error is detected.

Using a parity-check matrix,

Hr^T = H(c + e)^T.

Since c \in C,

Hc^T = 0.

Therefore

Hr^T = He^T.

If

Hr^T \neq 0,

then an error has definitely occurred.

If

Hr^T = 0,

then r is a codeword. Either no error occurred, or the error changed one codeword into another.

123.16 Syndrome

The syndrome of a received word r is

s = Hr^T.

The syndrome is zero exactly when r satisfies all parity checks.

Since

r = c + e,

we have

s = H(c + e)^T = Hc^T + He^T = He^T.

Thus the syndrome depends only on the error vector, not on the transmitted codeword.

This is the key fact behind syndrome decoding: instead of searching a large coset, the decoder computes the syndrome and matches it to a stored error representative.

123.17 Syndrome Decoding

Syndrome decoding uses a table that maps syndromes to likely error vectors.

The procedure is:

| Step | Operation |
| --- | --- |
| 1 | Receive r |
| 2 | Compute the syndrome s = Hr^T |
| 3 | Look up a likely error e with He^T = s |
| 4 | Decode as c = r - e |

Over \mathbb{F}_2, subtraction equals addition, so

c = r + e.

Usually the table stores the lowest-weight error vector for each syndrome.

This gives nearest-codeword decoding when the number of errors is within the correction capability of the code.
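The whole procedure fits in a few lines. The sketch below builds a syndrome table for a tiny illustrative code, the length-3 binary repetition code with the parity-check matrix used in Section 123.20; storing the lowest-weight error per syndrome gives nearest-codeword decoding:

```python
from itertools import product

# Parity-check matrix of the length-3 repetition code C = {000, 111}.
H = [[1, 1, 0],
     [1, 0, 1]]

def syndrome(H, x):
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

# Syndrome table: for each syndrome, keep the lowest-weight error.
table = {}
for e in sorted(product([0, 1], repeat=3), key=sum):
    table.setdefault(syndrome(H, e), e)

def decode(r):
    e = table[syndrome(H, r)]
    return [(ri + ei) % 2 for ri, ei in zip(r, e)]  # c = r + e over F_2

print(decode([1, 0, 1]))  # [1, 1, 1]: the flipped middle bit is corrected
```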

123.18 Cosets

Let C be a linear code in \mathbb{F}_q^n.

For any vector a \in \mathbb{F}_q^n, the set

a + C = \{\, a + c : c \in C \,\}

is a coset of C.

All vectors in the same coset have the same syndrome.

Indeed, if

r_1 - r_2 \in C,

then

H(r_1 - r_2)^T = 0,

so

Hr_1^T = Hr_2^T.

Thus syndromes label cosets.

Syndrome decoding chooses a likely error representative, often called a coset leader, for each coset.

123.19 Coset Leaders

A coset leader is a vector of minimum weight in a coset.

In decoding, the coset of the received word represents all possible error patterns with the same syndrome.

The coset leader is chosen as the most likely error when errors are assumed to be rare and independent.

If the received word is

r = c + e,

and the true error e is the coset leader of its syndrome class, then syndrome decoding recovers c.

This is the algebraic form of nearest-neighbor decoding.

123.20 The Repetition Code

The binary repetition code of length n encodes one bit by repeating it n times.

For n = 3,

0 \mapsto 000, \qquad 1 \mapsto 111.

The code is

C = \{000, 111\}.

It is a linear [3, 1, 3]_2 code.

A generator matrix is

G = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}.

A parity-check matrix is

H = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}.

The minimum distance is 3, so the code can correct one error.

For example, if 101 is received, majority vote decodes it as 111.

123.21 Single Parity-Check Code

The binary single parity-check code of length n consists of all vectors with even weight:

C = \{\, x \in \mathbb{F}_2^n : x_1 + x_2 + \cdots + x_n = 0 \,\}.

A parity-check matrix is

H = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}.

This is an [n, n-1, 2]_2 code.

It can detect one error, because a single bit flip changes parity.

It cannot correct one error, because the syndrome only says that parity is wrong. It does not identify which bit changed.
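A one-line parity check illustrates detection without correction (a minimal sketch; the particular codeword is arbitrary):

```python
# Single parity-check code: a word is valid iff its bit-sum is even.
def parity_ok(x):
    return sum(x) % 2 == 0

c = [1, 0, 1, 1, 1]   # even weight: a codeword
r = c[:]
r[2] ^= 1             # flip one bit

print(parity_ok(c))   # True
print(parity_ok(r))   # False: an error is detected, but its position is unknown
```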

123.22 Hamming Codes

Hamming codes are classical linear error-correcting codes.

A binary Hamming code with r parity checks has length

n = 2^r - 1

and dimension

k = 2^r - r - 1.

Its parity-check matrix H contains all nonzero binary column vectors of length r.

For r = 3, the code is a [7, 4, 3]_2 code.

Its minimum distance is 3, so it corrects one error.

The syndrome of a received word identifies the column of H corresponding to the error position.

This makes decoding especially simple.

123.23 The [7,4,3] Hamming Code

One parity-check matrix for the [7, 4, 3]_2 Hamming code is

H = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}.

The columns are the nonzero binary triples.

If a single error occurs in position j, then

e_j

is the unit vector with a 1 in position j, and

He_j^T

is column j of H.

Thus the syndrome directly gives the error location.

After locating the error, flip that bit.
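Single-error decoding can be sketched directly from this observation. The code below assumes 1-indexed bit positions and the matrix H shown above, where column j is the binary expansion of j with the first row as the least significant bit:

```python
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

def syndrome(H, x):
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

def correct_single_error(r):
    """Flip the bit whose position the syndrome spells out (if any)."""
    s = syndrome(H, r)
    # Column j of this H is the binary expansion of j (row 1 = LSB),
    # so the syndrome reads off the 1-indexed error position.
    j = s[0] + 2 * s[1] + 4 * s[2]
    if j:
        r = r[:]
        r[j - 1] ^= 1
    return r

c = [0, 0, 0, 0, 0, 0, 0]   # the zero codeword, for simplicity
r = c[:]
r[4] ^= 1                   # single error in position 5

print(correct_single_error(r))  # the error is located and flipped back
```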

123.24 Dual Codes

The dual code of a linear code C \subseteq \mathbb{F}_q^n is

C^\perp = \{\, y \in \mathbb{F}_q^n : x \cdot y = 0 \text{ for all } x \in C \,\}.

The dual code is also a linear code.

If C has dimension k, then

C^\perp

has dimension

n - k.

A parity-check matrix for C is a generator matrix for C^\perp. This is the standard dual-code interpretation of parity-check matrices.

123.25 Standard Array

A standard array organizes all vectors in \mathbb{F}_q^n into cosets of C.

The first row is the code C.

Each later row is a coset

a + C,

where a is chosen as a coset leader.

Decoding finds the received word in the array and subtracts the row leader.

The standard array is useful conceptually but often too large for practical decoding.

Syndrome decoding stores only the syndrome and coset leader information, which is more compact.

123.26 Bounds

Coding theory studies limits on possible code parameters.

Important bounds include:

| Bound | Main idea |
| --- | --- |
| Singleton bound | Limits d by redundancy |
| Hamming bound | Sphere-packing argument |
| Gilbert-Varshamov bound | Existence by a covering argument |
| Plotkin bound | Strong restriction for large distance |
| Griesmer bound | Bound for linear codes |

The Singleton bound says that an [n, k, d]_q code satisfies

d \leq n - k + 1.

Codes meeting this bound are called maximum distance separable, or MDS codes.

Reed-Solomon codes are important examples of MDS codes.

123.27 Reed-Solomon Codes

Reed-Solomon codes are nonbinary linear codes defined by evaluating polynomials over finite fields.

Choose distinct field elements

a_1, \ldots, a_n \in \mathbb{F}_q.

A message determines a polynomial f(x) of degree less than k.

The codeword is

(f(a_1), f(a_2), \ldots, f(a_n)).

Since two distinct polynomials of degree less than k can agree in at most k - 1 points, Reed-Solomon codes have minimum distance

d = n - k + 1.

Thus they meet the Singleton bound.
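The construction is short enough to test by brute force. The sketch below is an illustrative choice, not a standard construction: it uses the prime field \mathbb{F}_7 with n = 7 evaluation points and k = 3, and confirms that the minimum distance equals n - k + 1 = 5:

```python
from itertools import product

q, k = 7, 3
points = list(range(q))   # n = 7 distinct elements of F_7

def encode(msg):
    """Evaluate f(x) = msg[0] + msg[1] x + msg[2] x^2 at every point, mod q."""
    return tuple(sum(m * pow(a, i, q) for i, m in enumerate(msg)) % q
                 for a in points)

# Brute-force minimum distance over all q^k = 343 codewords.
codewords = [encode(m) for m in product(range(q), repeat=k)]
d = min(sum(a != b for a, b in zip(c1, c2))
        for i, c1 in enumerate(codewords)
        for c2 in codewords[i + 1:])

print(d)  # 5 = n - k + 1, meeting the Singleton bound
```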

They are used in storage systems, communication, QR codes, optical media, and distributed systems.

123.28 Erasures

An erasure is an error whose position is known.

For example, a receiver may know that symbol ii was lost but not know its value.

Erasures are easier than unknown errors.

A code with minimum distance d can correct up to

d - 1

erasures.

This is larger than the number of arbitrary errors it can correct:

\left\lfloor \frac{d-1}{2} \right\rfloor.

The reason is that erasures do not require discovering error positions.

Linear algebraic decoding often solves a system of equations for the missing symbols.
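For a concrete case, the single parity-check code (d = 2) corrects exactly one erasure, d - 1 = 1, by solving its single parity equation for the missing bit. A minimal sketch, with None marking the erased position:

```python
# Erasure recovery in the binary single parity-check code:
# the erased bit is the unique solution of x1 + ... + xn = 0 over F_2.
def recover_erasure(r, erased_index):
    """r has None at the erased position; solve the parity equation for it."""
    known = sum(b for i, b in enumerate(r) if i != erased_index) % 2
    r = r[:]
    r[erased_index] = known   # known + missing = 0 mod 2
    return r

codeword = [1, 0, 1, 1, 1]        # even weight: a valid codeword
received = [1, 0, None, 1, 1]     # position 3 erased

print(recover_erasure(received, 2))  # [1, 0, 1, 1, 1]
```

Note that the same channel output with an unknown error position could not be corrected: the code only detects errors, but it corrects this erasure because the position is given.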

123.29 Applications

Coding theory is used wherever information must survive noise or loss.

| Area | Use |
| --- | --- |
| Communication | Correct channel errors |
| Data storage | Recover corrupted bits |
| Distributed storage | Reconstruct missing fragments |
| QR codes | Correct damaged symbols |
| Deep-space communication | Handle weak, noisy signals |
| Memory systems | ECC RAM |
| Network coding | Combine packets algebraically |
| Cryptography | Code-based cryptosystems |
| DNA storage | Correct synthesis and sequencing errors |
| Quantum computing | Quantum error correction |

Modern coding theory includes classical, algebraic, probabilistic, and quantum codes, and its applications now range from reliable communication and storage to distributed computing, cryptography, DNA storage, and quantum computing.

123.30 Coding Theory and Linear Algebra

The dictionary is direct.

| Coding theory | Linear algebra |
| --- | --- |
| Word | Vector over \mathbb{F}_q |
| Linear code | Subspace |
| Code dimension | Vector space dimension |
| Generator matrix | Basis matrix for the code |
| Encoding | Matrix multiplication |
| Parity-check matrix | Linear constraint matrix |
| Syndrome | Image under H |
| Error pattern | Vector |
| Coset | Affine subspace |
| Dual code | Orthogonal complement |
| Minimum distance | Minimum nonzero weight |
| Decoding | Nearest codeword or coset leader problem |

Coding theory is finite-field linear algebra with a metric.

The metric is Hamming distance rather than Euclidean distance.

123.31 Summary

Coding theory protects information by adding redundancy.

A block code is a set of allowed words. A linear code is a vector subspace of \mathbb{F}_q^n. Its generator matrix encodes messages by

c = mG.

Its parity-check matrix characterizes codewords by

Hx^T = 0.

If a received word is

r = c + e,

then its syndrome is

s = Hr^T = He^T.

The syndrome depends only on the error. Syndrome decoding uses this fact to identify a likely error pattern and recover the transmitted codeword.

The central principle is that redundancy becomes algebraic structure. By choosing a subspace with large Hamming distance, one can detect and correct errors using matrix operations over finite fields.