A Markov chain is a model for random movement among a set of states.
At each step, the process occupies one state. It then moves to another state according to fixed transition probabilities. The defining condition is that the next state depends only on the current state, not on the full history.
This is called the Markov property.
If

$$X_0, X_1, X_2, \dots$$

is a sequence of random states, then the Markov property says

$$P(X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \dots, X_0 = i_0) = P(X_{t+1} = j \mid X_t = i).$$
For finite state spaces, a Markov chain is naturally described by a matrix. This makes Markov chains one of the clearest applications of linear algebra to probability. The transition matrix stores one-step probabilities, its powers describe multi-step transitions, and its eigenvectors describe long-term behavior.
112.1 States
The possible values of a Markov chain are called states.
For a finite Markov chain, the state space is a finite set

$$S = \{1, 2, \dots, n\}.$$

At each time $t$, the random variable $X_t$ takes one value in $S$.
For example, a simple weather model may have three states:
| State | Meaning |
|---|---|
| 1 | Sunny |
| 2 | Cloudy |
| 3 | Rainy |
If $X_t = 2$, then the model says that the weather on day $t$ is cloudy.
The purpose of the Markov chain is to describe how the state changes over time.
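A model like this can be simulated directly. The sketch below uses made-up transition probabilities for the three weather states; the numbers are illustrative, not from the text.

```python
import random

# Hypothetical transition probabilities for the weather model (made-up numbers).
# Keys are current states; values are (next state, probability) pairs.
# States: 1 = Sunny, 2 = Cloudy, 3 = Rainy.
P = {
    1: [(1, 0.6), (2, 0.3), (3, 0.1)],
    2: [(1, 0.3), (2, 0.4), (3, 0.3)],
    3: [(1, 0.2), (2, 0.4), (3, 0.4)],
}

def simulate(start, steps, rng):
    """Generate a trajectory: at each step, sample the next state
    from the transition probabilities of the current state."""
    path = [start]
    for _ in range(steps):
        states, probs = zip(*P[path[-1]])
        path.append(rng.choices(states, weights=probs)[0])
    return path

rng = random.Random(0)       # fixed seed for reproducibility
path = simulate(1, 10, rng)  # a trajectory of 11 states starting from Sunny
```

Each step looks only at the current state, which is exactly the Markov property in executable form.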
112.2 Transition Probabilities
The probability of moving from state $i$ to state $j$ in one step is written

$$p_{ij} = P(X_{t+1} = j \mid X_t = i).$$

For a time-homogeneous Markov chain, this probability does not depend on $t$. The same transition rule is used at every step.

For each fixed starting state $i$, the probabilities satisfy

$$\sum_{j=1}^{n} p_{ij} = 1.$$

Each probability is nonnegative:

$$p_{ij} \ge 0.$$

These two conditions express a basic fact: from state $i$, the chain must move to some state in the state space.
112.3 Transition Matrices
The transition probabilities form an $n \times n$ matrix

$$P = (p_{ij}).$$

The entry in row $i$ and column $j$ is the probability of moving from state $i$ to state $j$.

Since every row sums to $1$, $P$ is a row-stochastic matrix. A finite-state Markov chain can be represented by such a transition matrix.

For example,

$$P = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix}$$

describes a chain with two states. From state $1$, the chain stays in state $1$ with probability $0.7$ and moves to state $2$ with probability $0.3$. From state $2$, it moves to state $1$ with probability $0.4$ and stays in state $2$ with probability $0.6$.
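Both defining conditions are easy to check numerically. A minimal sketch, using a two-state matrix:

```python
def is_row_stochastic(P, tol=1e-12):
    """Check that every entry is nonnegative and every row sums to 1
    (within a floating-point tolerance)."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in P
    )

# A valid two-state transition matrix: rows sum to 1, entries nonnegative.
P = [[0.7, 0.3],
     [0.4, 0.6]]
```

A matrix that fails either condition (a row not summing to 1, or a negative entry) is not a valid transition matrix.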
112.4 Probability Distributions as Vectors
A probability distribution on $n$ states can be written as a row vector

$$\pi = (\pi_1, \pi_2, \dots, \pi_n).$$

The entry $\pi_i$ is the probability that the chain is in state $i$.

The entries satisfy

$$\pi_i \ge 0$$

and

$$\sum_{i=1}^{n} \pi_i = 1.$$

If the initial distribution is $\pi^{(0)}$, then the distribution after one step is

$$\pi^{(1)} = \pi^{(0)} P.$$

After two steps,

$$\pi^{(2)} = \pi^{(1)} P = \pi^{(0)} P^2.$$

After $t$ steps,

$$\pi^{(t)} = \pi^{(0)} P^t.$$
Thus powers of the transition matrix describe the evolution of the Markov chain.
112.5 Multi-Step Transition Probabilities
The $(i, j)$ entry of $P^t$ is the probability that the chain moves from state $i$ to state $j$ in $t$ steps.

For $t = 2$,

$$(P^2)_{ij} = \sum_{k=1}^{n} p_{ik} \, p_{kj}.$$

This formula has a direct probabilistic interpretation. To go from $i$ to $j$ in two steps, the chain may pass through any intermediate state $k$. The term $p_{ik} p_{kj}$ is the probability of the path

$$i \to k \to j.$$

Summing over all $k$ gives the total probability.

The same reasoning extends to higher powers $P^t$. Matrix multiplication is exactly the algebra of composing transition probabilities.
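The two-step formula can be checked against an explicit matrix product; a small sketch in plain Python:

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

P = [[0.7, 0.3],
     [0.4, 0.6]]

P2 = matmul(P, P)  # (P^2)[i][j] = sum over k of P[i][k] * P[k][j]

# Entry (0, 0) by summing over paths: 1 -> 1 -> 1 and 1 -> 2 -> 1.
by_paths = P[0][0] * P[0][0] + P[0][1] * P[1][0]
```

The path sum and the matrix product agree entry by entry, and each row of `P2` is again a probability distribution.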
112.6 Example: Two-State Chain
Consider the transition matrix

$$P = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix}.$$

Suppose the chain starts in state $1$ with probability $1$. Then

$$\pi^{(0)} = (1, 0).$$

After one step,

$$\pi^{(1)} = \pi^{(0)} P = (0.7, 0.3).$$

After two steps,

$$\pi^{(2)} = \pi^{(1)} P = (0.7 \cdot 0.7 + 0.3 \cdot 0.4, \; 0.7 \cdot 0.3 + 0.3 \cdot 0.6).$$

Therefore

$$\pi^{(2)} = (0.61, 0.39).$$

The probability of state $2$ has increased from $0$ to $0.3$, then to $0.39$.
Further powers of continue the evolution.
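An evolution like this can be reproduced numerically; a sketch for a two-state chain:

```python
def step(pi, P):
    """One step of the chain: multiply the row vector pi by P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.7, 0.3],
     [0.4, 0.6]]

pi0 = [1.0, 0.0]     # start in state 1 with probability 1
pi1 = step(pi0, P)   # distribution after one step
pi2 = step(pi1, P)   # distribution after two steps
```

Repeated calls to `step` compute the distribution after any number of steps, which is the same as multiplying by powers of the matrix.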
112.7 Stationary Distributions
A stationary distribution is a probability vector $\pi$ that remains unchanged after applying the transition matrix:

$$\pi P = \pi.$$

This equation says that if the chain starts with distribution $\pi$, then it has the same distribution after one step.

In linear algebra terms, $\pi$ is a left eigenvector of $P$ with eigenvalue $1$, normalized so that its entries sum to $1$.
The stationary distribution describes equilibrium behavior.
It does not say that the state itself becomes fixed. It says that the probability distribution over states becomes fixed.
112.8 Computing a Stationary Distribution
To find a stationary distribution, solve

$$\pi P = \pi.$$

Equivalently,

$$\pi (P - I) = 0.$$

Using column vectors, this may be written as

$$P^{\mathsf{T}} \pi^{\mathsf{T}} = \pi^{\mathsf{T}}.$$

The solution must also satisfy

$$\sum_{i=1}^{n} \pi_i = 1.$$

For the matrix

$$P = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix},$$

let

$$\pi = (\pi_1, \pi_2).$$

The equation $\pi P = \pi$ gives

$$0.7\,\pi_1 + 0.4\,\pi_2 = \pi_1, \qquad 0.3\,\pi_1 + 0.6\,\pi_2 = \pi_2.$$

These equations are equivalent. From the second,

$$0.3\,\pi_1 = 0.4\,\pi_2,$$

so

$$\pi_2 = \tfrac{3}{4}\,\pi_1.$$

Since $\pi_1 + \pi_2 = 1$,

$$\pi_1 \left(1 + \tfrac{3}{4}\right) = 1,$$

so

$$\pi_1 = \tfrac{4}{7}, \qquad \pi_2 = \tfrac{3}{7}.$$

Thus

$$\pi = \left(\tfrac{4}{7}, \tfrac{3}{7}\right).$$

This distribution is unchanged by $P$.
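The same computation can be done exactly with rational arithmetic. The closed form used below, $\pi \propto (p_{21}, p_{12})$ for a two-state chain, is a standard fact rather than something stated in the text:

```python
from fractions import Fraction

P = [[Fraction(7, 10), Fraction(3, 10)],
     [Fraction(4, 10), Fraction(6, 10)]]

p12, p21 = P[0][1], P[1][0]

# Standard closed form for a two-state chain: pi is proportional to (p21, p12).
pi = [p21 / (p12 + p21), p12 / (p12 + p21)]

# Verify stationarity exactly: pi P == pi.
piP = [pi[0] * P[0][j] + pi[1] * P[1][j] for j in range(2)]
```

Using `Fraction` avoids floating-point rounding, so the check `piP == pi` is exact.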
112.9 Convergence to Equilibrium
A major question is whether

$$\pi^{(t)} = \pi^{(0)} P^t$$

converges to a stationary distribution as $t \to \infty$.
For many finite Markov chains, the answer is yes. In particular, if the chain is irreducible and aperiodic, then it has a unique stationary distribution and the distribution converges to it.
Irreducibility means that every state can eventually reach every other state.
Aperiodicity means that the chain does not move in a fixed repeating cycle.
Under these conditions, long-term behavior becomes independent of the initial distribution.
112.10 Eigenvalues and Long-Term Behavior
The long-term behavior of a Markov chain is controlled by the eigenvalues of its transition matrix.
Since $P$ is stochastic, $\lambda = 1$ is an eigenvalue. The corresponding left eigenvector is related to the stationary distribution.
Other eigenvalues determine the rate of convergence.
If the eigenvalues satisfy

$$1 = |\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|,$$

then the magnitude of $\lambda_2$ often controls the dominant rate of convergence toward equilibrium.

When $|\lambda_2|$ is close to $1$, convergence is slow.

When $|\lambda_2|$ is much smaller than $1$, convergence is fast.
This is why spectral methods are important in the study of Markov chains.
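For a $2 \times 2$ stochastic matrix this is easy to see concretely: the product of the eigenvalues is $\det P$ and one eigenvalue is $1$, so $\lambda_2 = \det P$, and the error at each step shrinks by the factor $|\lambda_2|$. A sketch:

```python
P = [[0.7, 0.3],
     [0.4, 0.6]]

# For a 2x2 stochastic matrix, the eigenvalues are 1 and det(P).
lam2 = P[0][0] * P[1][1] - P[0][1] * P[1][0]

pi_star = [4 / 7, 3 / 7]   # stationary distribution of this chain

def step(pi):
    return [pi[0] * P[0][j] + pi[1] * P[1][j] for j in range(2)]

# Track how fast the distribution approaches pi_star.
pi = [1.0, 0.0]
errors = []
for _ in range(5):
    errors.append(abs(pi[0] - pi_star[0]))
    pi = step(pi)

# Each step shrinks the error by the factor |lambda_2|.
ratios = [errors[t + 1] / errors[t] for t in range(len(errors) - 1)]
```

Every successive error ratio equals $|\lambda_2| = 0.3$ up to rounding, which is the geometric convergence the spectral picture predicts.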
112.11 Absorbing States
A state is absorbing if, once the chain enters it, the chain remains there forever.
This means

$$p_{ii} = 1.$$

Equivalently, row $i$ of $P$ has a $1$ in column $i$ and zeros elsewhere.

For example,

$$P = \begin{pmatrix} 1 & 0 \\ 0.5 & 0.5 \end{pmatrix}$$

has state $1$ absorbing.
Absorbing chains are used to model completion, failure, extinction, ruin, termination, and hitting events.
Linear algebra can compute absorption probabilities and expected times to absorption.
112.12 Canonical Form for Absorbing Chains
For an absorbing chain, reorder the states so that transient states come first and absorbing states come last.
Then the transition matrix can be written in block form:

$$P = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}.$$

Here:

| Block | Meaning |
|---|---|
| $Q$ | Transitions among transient states |
| $R$ | Transitions from transient to absorbing states |
| $I$ | Absorbing states remain fixed |
The matrix

$$N = (I - Q)^{-1}$$

is called the fundamental matrix.

The entry $N_{ij}$ gives the expected number of visits to transient state $j$, starting from transient state $i$, before absorption.

The matrix

$$B = N R$$

gives the absorption probabilities: $B_{ij}$ is the probability that the chain, started from transient state $i$, is eventually absorbed in absorbing state $j$.
Thus inverse matrices and block matrices provide exact formulas for absorbing Markov chains.
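As a check, consider symmetric gambler's ruin on states $\{0, 1, 2, 3\}$ with a fair coin, where $0$ and $3$ are absorbing (a standard example, not from the text). With exact fractions, the fundamental matrix and absorption probabilities come out in closed form:

```python
from fractions import Fraction

half = Fraction(1, 2)

# Transient states {1, 2}; absorbing states {0, 3}.
Q = [[0, half],
     [half, 0]]      # transitions among transient states
R = [[half, 0],
     [0, half]]      # transitions from transient to absorbing states

# N = (I - Q)^{-1}, computed for a 2x2 matrix via the adjugate formula.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
N = [[d / det, -b / det],
     [-c / det, a / det]]

# B = N R: absorption probabilities from each transient state.
B = [[sum(N[i][k] * R[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
```

The result matches the classical answer: starting with 1 unit out of 3, the gambler is ruined with probability $2/3$ and reaches the goal with probability $1/3$.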
112.13 Random Walks on Graphs
A random walk on a graph is a Markov chain whose states are vertices.
At each step, the process moves from the current vertex to a neighboring vertex.
For an unweighted graph with adjacency matrix $A$ and degree matrix $D$, the transition matrix is

$$P = D^{-1} A,$$

assuming every vertex has positive degree.

The entry is

$$p_{ij} = \frac{A_{ij}}{\deg(i)}.$$
This connects Markov chains with graph theory.
Random walks are used in ranking, sampling, diffusion, clustering, and network analysis.
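Constructing $P = D^{-1} A$ is direct; a sketch for the path graph $1 - 2 - 3$:

```python
from fractions import Fraction

# Adjacency matrix of the path graph 1 - 2 - 3.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

deg = [sum(row) for row in A]   # vertex degrees

# P = D^{-1} A: divide each row of A by the degree of its vertex.
P = [[Fraction(A[i][j], deg[i]) for j in range(3)] for i in range(3)]
```

The endpoints must move to the middle vertex, and the middle vertex moves to either endpoint with probability $1/2$; each row of `P` sums to $1$ as required.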
112.14 Reversibility
A Markov chain with stationary distribution $\pi$ is reversible if

$$\pi_i \, p_{ij} = \pi_j \, p_{ji}$$

for all states $i$ and $j$.
This condition is called detailed balance.
It says that, at equilibrium, the probability flow from $i$ to $j$ equals the probability flow from $j$ to $i$.

Reversible Markov chains have strong linear algebraic structure. In a weighted inner product determined by $\pi$, the transition operator is self-adjoint. This makes spectral analysis especially useful.
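A standard fact (not stated above) is that the random walk on a connected graph is reversible, with stationary probabilities proportional to vertex degrees. Detailed balance can be verified exactly for a small path graph:

```python
from fractions import Fraction

# Random walk on the path graph 1 - 2 - 3 (a standard example).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
deg = [sum(row) for row in A]
total = sum(deg)                 # twice the number of edges
P = [[Fraction(A[i][j], deg[i]) for j in range(3)] for i in range(3)]

# Known fact: pi_i = deg(i) / (2 * #edges) is stationary for this walk.
pi = [Fraction(d, total) for d in deg]

# Detailed balance: pi_i * p_ij == pi_j * p_ji for all pairs (i, j).
balanced = all(pi[i] * P[i][j] == pi[j] * P[j][i]
               for i in range(3) for j in range(3))
```

Because every flow check holds exactly, this walk is reversible with respect to the degree-proportional distribution.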
112.15 Doubly Stochastic Matrices
A transition matrix is doubly stochastic if every row and every column sums to $1$.

That is,

$$\sum_{j=1}^{n} p_{ij} = 1$$

for every row $i$, and

$$\sum_{i=1}^{n} p_{ij} = 1$$

for every column $j$.

For an $n \times n$ doubly stochastic matrix, the uniform distribution

$$\pi = \left(\tfrac{1}{n}, \tfrac{1}{n}, \dots, \tfrac{1}{n}\right)$$

is stationary.

This follows because column sums equal $1$. Thus

$$(\pi P)_j = \sum_{i=1}^{n} \tfrac{1}{n}\, p_{ij} = \tfrac{1}{n}.$$
Doubly stochastic chains arise in shuffling, averaging, consensus algorithms, and symmetric random processes.
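The claim can be verified on a small doubly stochastic matrix (a made-up circulant example):

```python
# A 3x3 doubly stochastic matrix: every row AND every column sums to 1.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]

n = len(P)
uniform = [1.0 / n] * n

# (uniform * P)_j = (1/n) * (column sum j) = 1/n, so uniform is stationary.
after = [sum(uniform[i] * P[i][j] for i in range(n)) for j in range(n)]
```

One step of the chain leaves the uniform distribution unchanged, exactly as the column-sum argument predicts.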
112.16 Periodicity
A state has period $d$ if returns to that state can occur only at times divisible by $d$, and $d$ is the greatest such divisor.

A chain is aperiodic if all relevant states have period $1$.
Periodicity can prevent convergence even when a stationary distribution exists.
For example,

$$P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

alternates between two states.

It has stationary distribution

$$\pi = \left(\tfrac{1}{2}, \tfrac{1}{2}\right),$$

but if the chain starts in state $1$, the distribution alternates between

$$(1, 0)$$

and

$$(0, 1).$$

Thus $\pi^{(t)}$ does not converge.
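The oscillation is easy to observe numerically; a minimal sketch:

```python
P = [[0.0, 1.0],
     [1.0, 0.0]]   # period-2 chain: the two states swap at every step

def step(pi):
    return [pi[0] * P[0][j] + pi[1] * P[1][j] for j in range(2)]

pi = [1.0, 0.0]      # start in state 1
history = [pi]
for _ in range(4):
    pi = step(pi)
    history.append(pi)
# history alternates [1, 0], [0, 1], [1, 0], ... and never settles.
```

The distribution keeps cycling with period 2, even though the stationary distribution $(1/2, 1/2)$ exists.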
112.17 Reducibility
A chain is reducible if its state space can be decomposed into classes that do not all communicate with each other.
In matrix terms, after a suitable permutation of states, the transition matrix can often be written in block triangular form.
For example, the block-triangular form

$$P = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$$

means that states in the second block cannot move back to states in the first block.
Reducibility affects stationary distributions and long-term behavior.
Some reducible chains have many stationary distributions. For example, the identity transition matrix leaves every probability distribution unchanged.
112.18 Markov Chains and Linear Operators
A transition matrix acts as a linear operator on probability distributions.
For row vectors,

$$\pi \mapsto \pi P.$$

For column vectors, one may instead use the transpose:

$$x \mapsto P^{\mathsf{T}} x.$$
The operator preserves total probability and nonnegativity.
This means it maps the probability simplex

$$\Delta = \left\{ \pi \in \mathbb{R}^n : \pi_i \ge 0, \ \sum_{i=1}^{n} \pi_i = 1 \right\}$$

to itself.
The fixed points of this operator are stationary distributions.
Thus Markov chains combine geometry, probability, and linear algebra.
112.19 Markov Chains in Applications
Markov chains appear whenever a system changes state probabilistically over time.
| Field | Example |
|---|---|
| Web search | Random surfer models and PageRank |
| Statistics | Markov chain Monte Carlo |
| Physics | Random walks and diffusion |
| Biology | Population genetics |
| Finance | Credit rating transitions |
| Queueing theory | System state evolution |
| Computer science | Randomized algorithms |
| Machine learning | Sampling and sequence models |
| Chemistry | Molecular state transitions |
| Operations research | Inventory and reliability models |
The matrix viewpoint makes these applications computable.
Large Markov chains are often sparse. Efficient computation then depends on sparse matrix multiplication, iterative eigenvalue methods, and numerical stability.
112.20 Summary
A finite Markov chain is a probability model whose state changes according to a transition matrix.
The matrix stores one-step transition probabilities. Its powers store $t$-step transition probabilities. A probability distribution evolves by

$$\pi^{(t)} = \pi^{(0)} P^t.$$

A stationary distribution satisfies

$$\pi P = \pi.$$

This is a left eigenvector equation with eigenvalue $1$.
The long-term behavior of the chain depends on irreducibility, aperiodicity, absorbing structure, and the spectrum of the transition matrix.
Markov chains show how linear algebra describes random processes. A stochastic system becomes a sequence of matrix multiplications, and its equilibrium becomes an eigenvector problem.