Chapter 93. Power Iteration

Power iteration is an iterative method for approximating a dominant eigenvalue and a corresponding eigenvector of a square matrix. It is one of the simplest eigenvalue algorithms in numerical linear algebra.

The method applies the matrix repeatedly to a vector:

v, \quad Av, \quad A^2v, \quad A^3v, \quad \ldots

If one eigenvalue has strictly larger magnitude than all the others, repeated multiplication by A amplifies the component in that dominant eigendirection. After normalization, the vector sequence tends toward a dominant eigenvector. The eigenvalue can then be estimated by a scalar quotient such as the Rayleigh quotient. Power iteration is simple and may converge slowly, but it is effective for large sparse or matrix-free problems because each step mainly requires one matrix-vector product.

93.1 Dominant Eigenvalue

Let

A \in \mathbb{R}^{n\times n}.

An eigenpair of A is a scalar \lambda and a nonzero vector v satisfying

Av = \lambda v.

An eigenvalue \lambda_1 is called dominant if

|\lambda_1| > |\lambda_j|

for every other eigenvalue \lambda_j.

The power method is designed to approximate this dominant eigenvalue and an associated eigenvector.

93.2 Basic Idea

Suppose A has eigenvectors

v_1, v_2, \ldots, v_n

with eigenvalues

\lambda_1, \lambda_2, \ldots, \lambda_n,

ordered so that

|\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_n|.

Suppose the initial vector x^{(0)} has a nonzero component in the direction of v_1. Then

x^{(0)} = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n, \qquad c_1 \ne 0.

Applying A^k gives

A^k x^{(0)} = c_1\lambda_1^k v_1 + c_2\lambda_2^k v_2 + \cdots + c_n\lambda_n^k v_n.

Factor out the dominant term:

A^k x^{(0)} = \lambda_1^k \left( c_1 v_1 + c_2\left(\frac{\lambda_2}{\lambda_1}\right)^k v_2 + \cdots + c_n\left(\frac{\lambda_n}{\lambda_1}\right)^k v_n \right).

Since

\left|\frac{\lambda_j}{\lambda_1}\right| < 1

for j > 1, the non-dominant terms decay relative to the dominant term.

This is the mechanism behind power iteration.

93.3 Normalization

Directly computing

A^k x^{(0)}

is numerically unsafe. The vector may grow without bound or underflow toward zero.

Power iteration normalizes after each multiplication.

A standard form is:

y(k+1)=Ax(k), y^{(k+1)}=Ax^{(k)}, x(k+1)=y(k+1)y(k+1). x^{(k+1)} = \frac{y^{(k+1)}}{\|y^{(k+1)}\|}.

The normalization keeps the vector at manageable scale.

It does not change the direction of the vector.

Since eigenvectors are defined only up to nonzero scalar multiples, preserving direction is sufficient.

93.4 Basic Algorithm

Choose an initial nonzero vector

x^{(0)}.

For

k = 0, 1, 2, \ldots

compute

y^{(k+1)} = A x^{(k)}.

Normalize:

x^{(k+1)} = \frac{y^{(k+1)}}{\|y^{(k+1)}\|_2}.

Estimate the eigenvalue using the Rayleigh quotient:

\mu_{k+1} = \frac{(x^{(k+1)})^T A x^{(k+1)}}{(x^{(k+1)})^T x^{(k+1)}}.

If x^{(k+1)} is normalized, this reduces to

\mu_{k+1} = (x^{(k+1)})^T A x^{(k+1)}.

The Rayleigh quotient converges to the dominant eigenvalue when the vector converges to the corresponding eigenvector.

93.5 Pseudocode

power_iteration(A, x, tol, max_iter):
    x = x / norm(x)
    lambda_new = dot(x, A * x)    # initial Rayleigh quotient (x has unit norm)

    for k = 0 to max_iter - 1:
        y = A * x

        if norm(y) == 0:
            error "zero vector encountered"

        x_new = y / norm(y)

        lambda_new = dot(x_new, A * x_new)    # Rayleigh quotient; x_new has unit norm

        r = A * x_new - lambda_new * x_new    # eigenpair residual

        if norm(r) <= tol:
            return lambda_new, x_new

        x = x_new

    warn "maximum iterations reached without convergence"
    return lambda_new, x

The residual

r = Ax - \mu x

measures how close (\mu, x) is to an eigenpair.
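The pseudocode above translates directly to NumPy. This is a minimal sketch; the helper name `power_iteration` and the default tolerances are illustrative choices, not fixed by the text:

```python
import numpy as np

def power_iteration(A, x0, tol=1e-10, max_iter=1000):
    """Power iteration with a Rayleigh-quotient eigenvalue estimate."""
    x = x0 / np.linalg.norm(x0)
    lam = x @ A @ x                      # initial Rayleigh quotient
    for _ in range(max_iter):
        y = A @ x
        ny = np.linalg.norm(y)
        if ny == 0:
            raise ValueError("zero vector encountered")
        x = y / ny
        lam = x @ A @ x                  # x has unit norm, so no denominator needed
        r = A @ x - lam * x              # eigenpair residual
        if np.linalg.norm(r) <= tol:
            break
    return lam, x
```

For instance, on the symmetric matrix [[2, 1], [1, 2]] with eigenvalues 3 and 1, the iteration returns an estimate near 3.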

93.6 Example

Consider

A = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}.

The eigenvalues are

2 \qquad \text{and} \qquad 1.

The dominant eigenvalue is

\lambda_1 = 2.

Let

x^{(0)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.

Then

A x^{(0)} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}.

Next,

A^2 x^{(0)} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}.

Then

A^3 x^{(0)} = \begin{bmatrix} 8 \\ 1 \end{bmatrix}.

After normalization, the direction approaches

\begin{bmatrix} 1 \\ 0 \end{bmatrix},

which is the eigenvector associated with \lambda_1 = 2.

The second component becomes negligible relative to the first because

\left(\frac{1}{2}\right)^k

decays to zero.
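These iterates are easy to verify numerically; a small NumPy check (not part of the text's worked example):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
x = np.array([1.0, 1.0])
for _ in range(3):
    x = A @ x                  # unnormalized powers: (2,1), (4,1), (8,1)
print(x)                       # [8. 1.]
x = x / np.linalg.norm(x)      # direction approaches (1, 0)
```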

93.7 Convergence Rate

The convergence rate is controlled by the ratio

\left|\frac{\lambda_2}{\lambda_1}\right|.

If this ratio is small, convergence is fast.

If this ratio is close to 1, convergence is slow.

For example:

Ratio   Behavior
0.1     fast convergence
0.5     moderate convergence
0.95    slow convergence
1       no spectral separation

Power iteration converges geometrically under the usual assumptions, with rate governed by the dominant eigenvalue gap.
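The geometric rate can be observed directly. This sketch uses diag(2, 1), for which the ratio is exactly 0.5, and tracks the off-dominant component of the normalized iterate:

```python
import numpy as np

A = np.diag([2.0, 1.0])          # |lambda_2 / lambda_1| = 0.5
x = np.array([1.0, 1.0])
errs = []
for _ in range(10):
    x = A @ x
    x = x / np.linalg.norm(x)
    errs.append(abs(x[1]))       # distance from the dominant direction e_1
ratios = [errs[k + 1] / errs[k] for k in range(9)]
# successive errors shrink by a factor approaching 0.5
```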

93.8 Spectral Gap

The spectral gap is the separation between the dominant eigenvalue and the remaining eigenvalues.

For power iteration, the important quantity is often not the difference

|\lambda_1| - |\lambda_2|,

but the ratio

\frac{|\lambda_2|}{|\lambda_1|}.

A large gap means

\frac{|\lambda_2|}{|\lambda_1|}

is small.

A small gap means the ratio is close to 1.

The method becomes slow when the dominant eigenvalue is only slightly larger in magnitude than the next eigenvalue.

93.9 Sign and Phase Behavior

If the dominant eigenvalue is positive, the normalized iterates often converge directly to a dominant eigenvector.

If the dominant eigenvalue is negative, signs may alternate:

x^{(k)} \approx (-1)^k v_1.

The direction still converges, but the vector may alternate between v_1 and -v_1.

For complex eigenvalues, a phase factor may appear. The direction may converge in a projective sense even when the raw vector sequence does not converge in the ordinary sense.

93.10 Rayleigh Quotient

The Rayleigh quotient of a nonzero vector x is

\rho(x) = \frac{x^T A x}{x^T x}.

If x is an eigenvector, then

Ax = \lambda x,

so

\rho(x) = \frac{x^T \lambda x}{x^T x} = \lambda.

Thus the Rayleigh quotient gives an eigenvalue estimate from an approximate eigenvector.

For symmetric matrices, the Rayleigh quotient has especially strong approximation properties.

93.11 Residual Test

Given an approximate eigenpair (\mu, x), the residual is

r = Ax - \mu x.

If

r = 0,

then (\mu, x) is an exact eigenpair.

A practical stopping rule is

\|Ax - \mu x\|_2 \le \tau.

A relative version is

\frac{\|Ax - \mu x\|_2}{\|A\|_2 \|x\|_2} \le \tau.

Residual-based stopping is more meaningful than stopping only when successive iterates change little.

93.12 Choice of Initial Vector

The initial vector must have a nonzero component in the dominant eigendirection.

If

x^{(0)}

is orthogonal to the dominant eigenvector in a way that removes the dominant component, power iteration cannot recover that component.

In practice, a random initial vector is often used. For ordinary finite-dimensional problems, a random vector has nonzero component in any fixed direction with probability one.

93.13 Failure Cases

Power iteration may fail or behave poorly in several cases.

Case                                     Behavior
No unique dominant eigenvalue            may not converge to one direction
\lambda_1 < 0 (negative dominant)        iterates alternate in sign
Initial vector lacks dominant component  wrong invariant subspace
Defective matrix                         convergence may be altered
Dominant complex pair                    real iteration may oscillate or rotate
Non-normal matrix                        eigenvalue behavior may be misleading

The method is reliable only when its assumptions match the matrix.

93.14 Defective and Non-Diagonalizable Matrices

The cleanest convergence analysis assumes a diagonalizable matrix.

If A is defective, the Jordan form contains nontrivial Jordan blocks. The power A^k may contain polynomial factors in k, not only powers of eigenvalues.

When a unique dominant eigenvalue exists, power iteration may still converge under suitable conditions, but the analysis is more delicate.

The diagonalizable case gives the main intuition.

93.15 Symmetric Matrices

For real symmetric matrices, power iteration has a simpler theory.

A symmetric matrix has an orthonormal eigenbasis. Therefore any initial vector can be expanded as

x^{(0)} = c_1 q_1 + c_2 q_2 + \cdots + c_n q_n.

Then

A^k x^{(0)} = c_1\lambda_1^k q_1 + c_2\lambda_2^k q_2 + \cdots + c_n\lambda_n^k q_n.

Orthogonality removes many complications present in non-normal matrices.

Symmetric power iteration is therefore easier to analyze and often more numerically predictable.

93.16 Cost per Iteration

Each power iteration step mainly requires one matrix-vector product.

For a dense matrix,

Ax

costs

O(n^2).

For a sparse matrix, the cost is approximately

O(\operatorname{nnz}(A)).

The storage requirement is small:

Object                 Storage
Matrix A               problem-dependent
Current vector x       n
Product vector y = Ax  n
Optional residual      n

This low memory usage is one of the main strengths of the method.

93.17 Matrix-Free Power Iteration

Power iteration does not require direct access to all entries of A.

It only requires a routine that computes

x \mapsto Ax.

This is useful when A is an operator rather than an explicitly stored matrix.

Examples include:

Setting           Matrix-vector operation
graph algorithms  multiply by adjacency or transition matrix
PDE solvers       apply discretized differential operator
statistics        apply covariance matrix implicitly
machine learning  apply Hessian-vector product
web ranking       multiply by link transition matrix

This is why power iteration remains relevant despite its simplicity.
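As a sketch of the matrix-free idea, the routine below runs power iteration given only a function x ↦ Ax. The operator `laplacian_matvec` is a hypothetical example: it applies the tridiagonal second-difference matrix tridiag(−1, 2, −1) without ever storing it.

```python
import numpy as np

def power_iteration_matfree(matvec, n, iters=1000, seed=0):
    """Power iteration given only a routine computing x -> Ax."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = matvec(x)
        x = y / np.linalg.norm(y)
    return x @ matvec(x), x          # Rayleigh quotient and eigenvector estimate

def laplacian_matvec(x):
    """Apply tridiag(-1, 2, -1) to x without storing the matrix."""
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y
```

For n = 20 the dominant eigenvalue of this operator is 2 + 2cos(π/21) ≈ 3.978.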

93.18 PageRank Connection

PageRank can be interpreted as an eigenvector problem for a large transition matrix.

The target vector is a stationary distribution satisfying an equation of the form

p = Gp,

where G is a stochastic matrix modified by damping.

Power iteration is natural here because the matrix is enormous and sparse, and the main operation is repeated multiplication by the transition operator. Power iteration is historically associated with large sparse web matrices and PageRank-style computations.
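A toy version of the damped iteration, on a hypothetical four-page link graph (the graph and damping factor are illustrative, not from the text):

```python
import numpy as np

# Hypothetical 4-page web: links[j] lists the pages that page j links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n, d = 4, 0.85

# Column-stochastic transition matrix: P[i, j] = 1/outdeg(j) if j links to i.
P = np.zeros((n, n))
for j, outs in links.items():
    for i in outs:
        P[i, j] = 1.0 / len(outs)

p = np.full(n, 1.0 / n)
for _ in range(100):
    p = d * (P @ p) + (1.0 - d) / n   # damped power step; p stays a distribution
```

Each step costs one multiplication by the sparse transition structure, and p converges to the stationary distribution of the damped matrix G = dP + ((1−d)/n)·ee^T.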

93.19 Inverse Power Iteration

Power iteration finds the eigenvalue of largest magnitude.

Inverse power iteration applies power iteration to

A^{-1}.

The eigenvalues of A^{-1} are

\frac{1}{\lambda_i}.

Thus the largest eigenvalue of A^{-1} corresponds to the smallest-magnitude eigenvalue of A.

Each step requires solving

A y^{(k+1)} = x^{(k)}

rather than multiplying by A.

This is more expensive, but it targets small eigenvalues.
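A minimal sketch, assuming a dense invertible A; in a serious implementation one would factor A once (e.g. an LU factorization) and reuse the factors in every solve:

```python
import numpy as np

def inverse_iteration(A, x0, iters=50):
    """Inverse power iteration: converges to the smallest-magnitude eigenvalue."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.linalg.solve(A, x)    # solve A y = x instead of forming A^{-1}
        x = y / np.linalg.norm(y)
    return x @ A @ x, x
```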

93.20 Shifted Inverse Iteration

To find an eigenvalue near a chosen shift \sigma, apply inverse iteration to

A - \sigma I.

The eigenvalues of

(A - \sigma I)^{-1}

are

\frac{1}{\lambda_i - \sigma}.

The largest magnitude corresponds to the eigenvalue \lambda_i closest to \sigma.

Thus shifted inverse iteration can target interior eigenvalues, not only extremal ones. This shift idea is a standard extension of the power method.
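The same sketch with a shift; the matrix A − σI is formed once and solved against at each step (again, factoring once would be preferable in practice):

```python
import numpy as np

def shifted_inverse_iteration(A, sigma, x0, iters=50):
    """Find the eigenvalue of A closest to the shift sigma."""
    M = A - sigma * np.eye(A.shape[0])
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.linalg.solve(M, x)     # one solve with A - sigma*I per step
        x = y / np.linalg.norm(y)
    return x @ A @ x, x               # Rayleigh quotient of the original A
```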

93.21 Rayleigh Quotient Iteration

Rayleigh quotient iteration updates the shift dynamically using the Rayleigh quotient.

At step k, set

\sigma_k = \rho(x^{(k)}).

Then solve

(A - \sigma_k I)\, y^{(k+1)} = x^{(k)}.

Normalize:

x^{(k+1)} = \frac{y^{(k+1)}}{\|y^{(k+1)}\|}.

For symmetric matrices, this method can converge very rapidly when started sufficiently close to an eigenvector.

It is more expensive per iteration because it requires solving shifted linear systems.
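A sketch of the iteration. Since the shift approaches an eigenvalue, the solve becomes nearly singular by design; this version simply stops if the factorization fails outright:

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, iters=10):
    """Inverse iteration with a shift updated from the current iterate."""
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        sigma = x @ A @ x                  # current Rayleigh quotient
        try:
            y = np.linalg.solve(A - sigma * np.eye(n), x)
        except np.linalg.LinAlgError:
            break                          # shift hit an eigenvalue exactly
        x = y / np.linalg.norm(y)
    return x @ A @ x, x
```

Which eigenpair it converges to depends on the starting vector, so a sensible check is the eigenpair residual rather than a particular eigenvalue.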

93.22 Deflation

Power iteration gives one eigenpair.

To compute additional eigenpairs, one may use deflation.

For a symmetric matrix with normalized dominant eigenvector q_1, define

B = A - \lambda_1 q_1 q_1^T.

Then B removes the contribution of the first eigenpair.

Power iteration can then be applied to B to find the next eigenpair.

Deflation must be used carefully because numerical errors in the first eigenvector can affect later eigenpairs.
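A small numerical sketch of deflation on a symmetric 3×3 matrix whose eigenvalues are 3 ± √3 and 3 (the matrix is chosen for illustration):

```python
import numpy as np

def power_iter(A, x, iters=500):
    """Plain power iteration returning a Rayleigh-quotient estimate."""
    for _ in range(iters):
        y = A @ x
        x = y / np.linalg.norm(y)
    return x @ A @ x, x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
lam1, q1 = power_iter(A, np.array([1.0, 0.0, 0.0]))
B = A - lam1 * np.outer(q1, q1)          # deflate the dominant eigenpair
lam2, q2 = power_iter(B, np.array([1.0, 0.0, 0.0]))
# lam1 approximates 3 + sqrt(3); lam2 approximates the next eigenvalue, 3
```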

93.23 Relation to Krylov Subspaces

Power iteration generates the sequence

x^{(0)}, \quad A x^{(0)}, \quad A^2 x^{(0)}, \quad \ldots

These vectors span a Krylov subspace:

\mathcal{K}_k(A, x^{(0)}) = \operatorname{span}\{x^{(0)}, A x^{(0)}, \ldots, A^{k-1} x^{(0)}\}.

Power iteration uses only the newest vector. More advanced Krylov methods retain and orthogonalize several vectors from this sequence.

Thus Arnoldi and Lanczos may be viewed as more sophisticated descendants of power iteration.

93.24 Block Power Iteration

Block power iteration uses several starting vectors at once.

Let

X^{(0)} \in \mathbb{R}^{n\times m}.

Compute

Y^{(k+1)} = A X^{(k)}.

Then orthonormalize the columns:

X^{(k+1)} = \operatorname{orth}(Y^{(k+1)}).

Block power iteration approximates an invariant subspace associated with the m largest eigenvalues in magnitude.

It is useful when several leading eigenvectors are needed.
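A sketch of block (subspace) iteration with QR-based orthonormalization, assuming a symmetric A so that the projected eigenvalues can be read off with `eigvalsh`:

```python
import numpy as np

def block_power_iteration(A, m, iters=200, seed=0):
    """Subspace iteration for the m largest-magnitude eigenvalues."""
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((A.shape[0], m)))
    for _ in range(iters):
        Y = A @ X
        X, _ = np.linalg.qr(Y)                 # orthonormalize the block
    return np.linalg.eigvalsh(X.T @ A @ X), X  # Rayleigh-Ritz values (ascending)
```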

93.25 Practical Stopping Criteria

Common stopping rules include:

Criterion           Form
Eigenpair residual  \|Ax - \mu x\| \le \tau
Relative residual   \|Ax - \mu x\| / (\|A\|\,\|x\|) \le \tau
Eigenvalue change   |\mu_{k+1} - \mu_k| \le \tau
Direction change    \|x^{(k+1)} \pm x^{(k)}\| \le \tau
Maximum iterations  k \le k_{\max}

The sign ambiguity in eigenvectors explains the use of

x^{(k+1)} \pm x^{(k)}.

For negative dominant eigenvalues, consecutive normalized vectors may alternate signs.

93.26 Numerical Considerations

A practical implementation should account for:

Issue                          Handling
Overflow or underflow          normalize every step
Zero product Ax = 0            stop or change initial vector
Sign ambiguity                 compare directions, not raw vectors
Slow convergence               use shift, inverse iteration, or Krylov methods
Sparse matrix                  use sparse matrix-vector product
Non-normality                  check residual, not only eigenvalue change
Multiple dominant eigenvalues  use block methods or Arnoldi

The simplicity of power iteration makes it attractive, but its diagnostics must be chosen carefully.

93.27 Summary

Power iteration approximates the dominant eigenpair of a matrix by repeated multiplication and normalization:

y^{(k+1)} = A x^{(k)}, \qquad x^{(k+1)} = \frac{y^{(k+1)}}{\|y^{(k+1)}\|}.

The eigenvalue is commonly estimated by the Rayleigh quotient:

\mu_k = \frac{(x^{(k)})^T A x^{(k)}}{(x^{(k)})^T x^{(k)}}.

The method converges when there is a unique dominant eigenvalue and the starting vector has a nonzero component in its eigendirection. Its convergence rate is controlled by

\left|\frac{\lambda_2}{\lambda_1}\right|.

Power iteration is simple, memory-light, and suitable for sparse or matrix-free problems. Its main weakness is slow convergence when the dominant eigenvalue is poorly separated from the rest of the spectrum.