
Chapter 77. Cholesky Decomposition

Cholesky decomposition is a triangular factorization for symmetric positive definite matrices. It is a specialized form of matrix decomposition, more restrictive than LU decomposition, but faster and simpler when its hypotheses are satisfied.

For a real symmetric positive definite matrix A, the Cholesky decomposition has the form

A = LL^T,

where L is lower triangular with positive diagonal entries. For a complex Hermitian positive definite matrix, the corresponding form is

A = LL^*,

where L^* denotes the conjugate transpose of L. Every Hermitian positive definite matrix has such a decomposition, and the factor is unique when the diagonal entries of L are required to be positive.

77.1 Symmetric Positive Definite Matrices

A real square matrix A is symmetric if

A^T = A.

It is positive definite if

x^T A x > 0

for every nonzero vector x.

Positive definiteness is a strong condition. It says that the quadratic form associated with A is always positive away from the origin. This condition appears naturally in geometry, optimization, statistics, least squares, energy methods, covariance matrices, and numerical partial differential equations.

For Cholesky decomposition, symmetry alone is insufficient. Positive definiteness is also required. A symmetric matrix with negative or zero curvature in some direction may fail to have a standard Cholesky factorization.

77.2 The Factorization

The Cholesky factorization of a real symmetric positive definite matrix is

A = LL^T.

Here

L = \begin{bmatrix} \ell_{11} & 0 & 0 & \cdots & 0 \\ \ell_{21} & \ell_{22} & 0 & \cdots & 0 \\ \ell_{31} & \ell_{32} & \ell_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \ell_{n1} & \ell_{n2} & \ell_{n3} & \cdots & \ell_{nn} \end{bmatrix},

with

\ell_{ii} > 0

for every i.

The transpose L^T is upper triangular. Thus Cholesky writes A as the product of a lower triangular matrix and its transpose.

This is different from ordinary LU decomposition. In LU decomposition, the two triangular factors are generally unrelated:

A = LU.

In Cholesky decomposition, the upper factor is exactly the transpose of the lower factor:

U = L^T.

This symmetry is the source of Cholesky's efficiency.

77.3 Relation to LU Decomposition

Cholesky decomposition may be viewed as LU decomposition adapted to symmetric positive definite matrices.

If

A = LL^T,

then A is automatically symmetric:

A^T = (LL^T)^T = LL^T = A.

It is also positive definite when L is invertible. Indeed, for any nonzero vector x,

x^T A x = x^T LL^T x = (L^T x)^T (L^T x) = \|L^T x\|^2.

Since L is invertible, L^T x \ne 0 whenever x \ne 0. Hence

x^T A x > 0.

Thus every matrix of the form LL^T, with L nonsingular, is symmetric positive definite.

The converse is the Cholesky theorem: every real symmetric positive definite matrix can be written in this way.

77.4 A Two by Two Example

Let

A = \begin{bmatrix} 4 & 6 \\ 6 & 13 \end{bmatrix}.

We seek

A = LL^T,

where

L = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix}, \qquad a > 0, \qquad c > 0.

Then

LL^T = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} = \begin{bmatrix} a^2 & ab \\ ab & b^2 + c^2 \end{bmatrix}.

Equating entries with A, we get

a^2 = 4,

so

a = 2.

Next,

ab = 6,

so

2b = 6,

and therefore

b = 3.

Finally,

b^2 + c^2 = 13.

Thus

9 + c^2 = 13,

so

c = 2.

Therefore

L = \begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix}.

Indeed,

\begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 6 \\ 6 & 13 \end{bmatrix}.

77.5 Entry Formulas

Let A be an n \times n real symmetric positive definite matrix. Suppose

A = LL^T.

The entries of L can be computed column by column.

For a diagonal entry,

\ell_{jj} = \sqrt{ a_{jj} - \sum_{k=1}^{j-1} \ell_{jk}^2 }.

For an entry below the diagonal, where i > j,

\ell_{ij} = \frac{ a_{ij} - \sum_{k=1}^{j-1} \ell_{ik}\ell_{jk} }{ \ell_{jj} }.

These formulas depend only on entries of L that have already been computed. The square root is well defined because positive definiteness guarantees that the quantity under the radical is positive in exact arithmetic.

77.6 A Three by Three Example

Let

A = \begin{bmatrix} 4 & 12 & -16 \\ 12 & 37 & -43 \\ -16 & -43 & 98 \end{bmatrix}.

This standard example has Cholesky factor

L = \begin{bmatrix} 2 & 0 & 0 \\ 6 & 1 & 0 \\ -8 & 5 & 3 \end{bmatrix}.

Check the product:

LL^T = \begin{bmatrix} 2 & 0 & 0 \\ 6 & 1 & 0 \\ -8 & 5 & 3 \end{bmatrix} \begin{bmatrix} 2 & 6 & -8 \\ 0 & 1 & 5 \\ 0 & 0 & 3 \end{bmatrix}.

The entries are

(LL^T)_{11} = 2^2 = 4,
(LL^T)_{12} = 2 \cdot 6 = 12,
(LL^T)_{13} = 2 \cdot (-8) = -16,
(LL^T)_{22} = 6^2 + 1^2 = 37,
(LL^T)_{23} = 6(-8) + 1(5) = -48 + 5 = -43,
(LL^T)_{33} = (-8)^2 + 5^2 + 3^2 = 64 + 25 + 9 = 98.

Thus

LL^T = A.
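As a sanity check, a built-in routine reproduces this factor directly (a minimal sketch, assuming NumPy is available):

```python
import numpy as np

# The 3x3 example matrix from the text.
A = np.array([[4.0, 12.0, -16.0],
              [12.0, 37.0, -43.0],
              [-16.0, -43.0, 98.0]])

# numpy returns the lower triangular factor with positive diagonal.
L = np.linalg.cholesky(A)
print(L)                            # matches the factor derived above
print(np.allclose(L @ L.T, A))      # True: the product reproduces A
```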

77.7 Solving Linear Systems

Suppose

Ax = b

and

A = LL^T.

Then

LL^T x = b.

Introduce

y = L^T x.

Then solve the two triangular systems

Ly = b

and

L^T x = y.

The first system is solved by forward substitution. The second system is solved by backward substitution.

The solution procedure is therefore:

| Step | Operation |
| --- | --- |
| 1 | Factor A = LL^T |
| 2 | Solve Ly = b |
| 3 | Solve L^T x = y |

This is analogous to solving by LU decomposition, but the two factors are transposes of one another.
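The two substitution steps can be sketched in plain Python. The helper names `forward_substitution` and `backward_substitution` are illustrative; the factor L here is the one computed for the two by two example in Section 77.4:

```python
def forward_substitution(L, b):
    """Solve L y = b for lower triangular L by forward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][k] * y[k] for k in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y

def backward_substitution(L, y):
    """Solve L^T x = y, reading the upper factor as the transpose of L."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x

# The system from Section 77.8: A = [[4, 6], [6, 13]], b = [10, 19].
L = [[2.0, 0.0], [3.0, 2.0]]
y = forward_substitution(L, [10.0, 19.0])
x = backward_substitution(L, y)
print(y, x)  # [5.0, 2.0] [1.0, 1.0]
```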

77.8 Example: Solving a System

Let

A = \begin{bmatrix} 4 & 6 \\ 6 & 13 \end{bmatrix}, \qquad b = \begin{bmatrix} 10 \\ 19 \end{bmatrix}.

From the previous example,

A = LL^T,

where

L = \begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix}.

First solve

Ly = b.

That is,

\begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 10 \\ 19 \end{bmatrix}.

The first equation gives

2y_1 = 10,

so

y_1 = 5.

The second equation gives

3y_1 + 2y_2 = 19.

Substitute y_1 = 5:

15 + 2y_2 = 19,

so

y_2 = 2.

Now solve

L^T x = y.

That is,

\begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \end{bmatrix}.

The second equation gives

2x_2 = 2,

so

x_2 = 1.

The first equation gives

2x_1 + 3x_2 = 5.

Therefore

2x_1 + 3 = 5,

so

x_1 = 1.

Hence

x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.

77.9 Determinants from Cholesky

Cholesky decomposition gives a simple determinant formula.

If

A = LL^T,

then

\det(A) = \det(L)\det(L^T).

Since

\det(L^T) = \det(L),

we have

\det(A) = \det(L)^2.

Because L is triangular,

\det(L) = \prod_{i=1}^{n} \ell_{ii}.

Therefore

\det(A) = \left(\prod_{i=1}^{n} \ell_{ii}\right)^2.

Equivalently,

\det(A) = \prod_{i=1}^{n} \ell_{ii}^2.

This formula is useful in statistics, optimization, Gaussian models, and numerical methods, especially when the logarithm of the determinant is needed:

\log \det(A) = 2 \sum_{i=1}^{n} \log \ell_{ii}.
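A short sketch of these formulas, applied to the diagonal (2, 1, 3) of the Cholesky factor from the three by three example in Section 77.6:

```python
import math

# Diagonal of the Cholesky factor from Section 77.6.
diag = [2.0, 1.0, 3.0]

det_A = math.prod(d * d for d in diag)          # det(A) = product of l_ii^2
log_det_A = 2.0 * sum(math.log(d) for d in diag)  # log det(A) = 2 * sum log l_ii

print(det_A)      # 36.0
print(log_det_A)  # equals log(36)
```

The log-determinant form avoids overflow when the determinant itself is astronomically large or small, which is why it is preferred in Gaussian likelihood computations.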

77.10 Positive Definiteness Test

Cholesky decomposition can be used as a test for positive definiteness.

When the algorithm attempts to compute

\ell_{jj} = \sqrt{ a_{jj} - \sum_{k=1}^{j-1} \ell_{jk}^2 },

the expression under the square root must be positive.

If, in exact arithmetic, this expression becomes zero or negative before the factorization is complete, then A is not positive definite.

Thus Cholesky factorization is both a decomposition method and a diagnostic. For a symmetric matrix, successful Cholesky factorization with positive diagonal entries certifies positive definiteness.

In floating-point arithmetic, care is needed. A nearly positive definite or ill-conditioned matrix may produce small negative pivots because of rounding error. In such cases, the failure may reflect numerical sensitivity rather than genuine indefiniteness.
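One way to package this diagnostic, assuming NumPy is available (the helper name and symmetry tolerance are illustrative choices):

```python
import numpy as np

def is_positive_definite(A, sym_tol=1e-12):
    """Test positive definiteness of a symmetric matrix by attempting Cholesky."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T, atol=sym_tol):
        return False  # the test only certifies PD for symmetric input
    try:
        np.linalg.cholesky(A)   # raises LinAlgError on a nonpositive pivot
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite([[4, 6], [6, 13]]))  # True
print(is_positive_definite([[1, 2], [2, 1]]))   # False: eigenvalues 3 and -1
```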

77.11 Operation Count

For a dense n \times n matrix, Cholesky decomposition requires about

\frac{1}{3} n^3

arithmetic operations.

Dense LU decomposition requires about

\frac{2}{3} n^3

operations.

Thus Cholesky is roughly twice as efficient as LU when it applies. This efficiency comes from exploiting symmetry and computing only one triangular factor.

Solving the two triangular systems after factorization costs on the order of

2n^2

operations for one right-hand side.

77.12 Cholesky and Least Squares

Cholesky decomposition often appears in least squares problems through the normal equations.

Given an overdetermined system

Ax \approx b,

the least squares solution satisfies

A^T A x = A^T b.

If A has full column rank, then A^T A is symmetric positive definite. Therefore one may compute

A^T A = LL^T

and solve

LL^T x = A^T b.

This method is simple, but it can be numerically less stable than QR decomposition because forming A^T A squares the condition number. Cholesky is therefore efficient, but not always the preferred method for least squares when numerical accuracy is critical.
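A minimal sketch of the normal-equations route, assuming NumPy; the random data and seed are arbitrary, and the QR-based `np.linalg.lstsq` serves only as a reference for comparison:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))  # overdetermined, full column rank w.h.p.
b = rng.standard_normal(20)

# Normal equations solved through Cholesky: A^T A = L L^T.
G = A.T @ A
c = A.T @ b
L = np.linalg.cholesky(G)
y = np.linalg.solve(L, c)      # forward solve:  L y = A^T b
x = np.linalg.solve(L.T, y)    # backward solve: L^T x = y

# On this well-conditioned example it agrees with a QR-based routine.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ref))   # True
```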

77.13 Cholesky and Covariance Matrices

Covariance matrices are symmetric positive semidefinite, and positive definite when the variables have no exact linear dependence.

If

\Sigma = LL^T

is the Cholesky factorization of a covariance matrix, then L can be used to transform uncorrelated standard variables into correlated variables.

Let

z \sim N(0, I).

Define

x = Lz.

Then

\operatorname{Cov}(x) = \operatorname{Cov}(Lz) = L \operatorname{Cov}(z) L^T = L I L^T = LL^T = \Sigma.

This is one reason Cholesky decomposition is widely used in statistics and simulation. It gives a direct way to generate random vectors with a prescribed covariance matrix.
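The construction can be sketched with NumPy; the covariance matrix, seed, and sample size below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(42)

# A hypothetical 2x2 covariance matrix (variances 4 and 13, covariance 6).
Sigma = np.array([[4.0, 6.0], [6.0, 13.0]])
L = np.linalg.cholesky(Sigma)

# Draw standard normal vectors z ~ N(0, I) and map them through L.
z = rng.standard_normal((2, 100_000))
x = L @ z

# The empirical covariance of x approaches Sigma as the sample grows.
print(np.cov(x))
```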

77.14 Complex Cholesky Decomposition

For complex matrices, the proper analogue of symmetry is Hermitian symmetry.

A complex matrix A is Hermitian if

A^* = A.

It is positive definite if

x^* A x > 0

for every nonzero complex vector x.

The Cholesky factorization is

A = LL^*.

The diagonal entries of L are real and positive. The factor L^* is the conjugate transpose of L, not merely the transpose.

For real matrices, conjugation has no effect, so this reduces to

A = LL^T.
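NumPy's `cholesky` accepts complex Hermitian input directly; the matrix below is a hand-picked illustration:

```python
import numpy as np

# A small Hermitian positive definite matrix (determinant 40 - 8 = 32 > 0).
A = np.array([[4.0 + 0j, 2.0 - 2.0j],
              [2.0 + 2.0j, 10.0 + 0j]])

L = np.linalg.cholesky(A)

print(np.allclose(L @ L.conj().T, A))    # True: A = L L*
print(np.allclose(np.diag(L).imag, 0))   # True: diagonal entries are real
```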

77.15 LDL Transpose Decomposition

A related factorization is

A = LDL^T,

where L is unit lower triangular and D is diagonal.

This form avoids square roots. It separates the diagonal scaling into the matrix D. For a positive definite matrix, the diagonal entries of D are positive.

If

D = \operatorname{diag}(d_1, \ldots, d_n)

with d_i > 0, then

D = D^{1/2} D^{1/2}.

Hence

A = LDL^T = (LD^{1/2})(LD^{1/2})^T.

This recovers an ordinary Cholesky factor with

C = LD^{1/2}.

The LDL^T form is often useful in theoretical work and in numerical algorithms where square roots are undesirable.
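A square-root-free LDL^T factorization can be sketched as follows. This is a minimal unpivoted version with an illustrative function name; production codes add symmetric pivoting for indefinite matrices:

```python
import math

def ldl_decomposition(A):
    """LDL^T factorization of a symmetric positive definite matrix.

    Returns (L, d): L unit lower triangular (list of rows), d the diagonal of D.
    No square roots are taken during the factorization itself.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] * d[k] for k in range(j))) / d[j]
    return L, d

# The 2x2 example from Section 77.4.
A = [[4.0, 6.0], [6.0, 13.0]]
L, d = ldl_decomposition(A)
print(L, d)  # unit lower triangular factor, with d = [4.0, 4.0]

# Recover the ordinary Cholesky factor C = L D^{1/2}.
C = [[L[i][j] * math.sqrt(d[j]) for j in range(2)] for i in range(2)]
print(C)  # [[2.0, 0.0], [3.0, 2.0]]
```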

77.16 Algorithm

The following algorithm computes the lower triangular Cholesky factor of a real symmetric positive definite matrix.

cholesky_decomposition(A):
    n = number of rows of A
    L = zero matrix of size n by n

    for j = 1 to n:
        s = 0
        for k = 1 to j - 1:
            s = s + L[j,k] * L[j,k]

        d = A[j,j] - s

        if d <= 0:
            stop: matrix is not positive definite

        L[j,j] = sqrt(d)

        for i = j + 1 to n:
            s = 0
            for k = 1 to j - 1:
                s = s + L[i,k] * L[j,k]

            L[i,j] = (A[i,j] - s) / L[j,j]

    return L

The algorithm computes one column of L at a time. Since A is symmetric, only the lower triangular part of A is needed.
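A direct Python translation of the pseudocode (switched to 0-based indexing, plain lists; a teaching sketch rather than an optimized implementation):

```python
import math

def cholesky_decomposition(A):
    """Lower triangular Cholesky factor of a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (A[i][j] - s) / L[j][j]
    return L

# The 3x3 example from Section 77.6.
A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
print(cholesky_decomposition(A))
# [[2.0, 0.0, 0.0], [6.0, 1.0, 0.0], [-8.0, 5.0, 3.0]]
```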

77.17 Storage in Practice

A dense Cholesky implementation usually stores the result in the same array that originally stored A.

Since A is symmetric, only one triangular half must be stored. The lower triangular half may be overwritten by L. The upper triangular half is either unused or treated as implicit.

| Stored object | Practical representation |
| --- | --- |
| A | Lower or upper triangular half |
| L | Same triangular half after factorization |
| L^T | Not stored separately |
| Symmetric entries | Implied by symmetry |

This storage model cuts memory use and improves cache behavior.

77.18 Failure Modes

Cholesky decomposition can fail for several reasons.

| Cause | Meaning |
| --- | --- |
| Matrix is not symmetric | The factorization A = LL^T cannot represent it |
| Matrix is indefinite | Some x^T A x is negative |
| Matrix is singular positive semidefinite | A zero pivot appears |
| Matrix is ill-conditioned | Floating-point roundoff may create a small negative pivot |
| Matrix has modeling error | The intended covariance or Hessian is not positive definite |

A common numerical repair is to add a small diagonal term:

A_\lambda = A + \lambda I, \qquad \lambda > 0.

This operation is called diagonal regularization or jitter in statistical computing. It can make a nearly positive definite matrix safely positive definite, at the cost of changing the original problem.
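A jitter loop might be sketched as follows, assuming NumPy; the helper name and the schedule of \lambda values (start at 1e-10, grow by a factor of 10) are illustrative choices, not a standard API:

```python
import numpy as np

def cholesky_with_jitter(A, initial=1e-10, max_tries=10):
    """Attempt Cholesky, adding a growing diagonal term lambda*I on failure."""
    lam = 0.0
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(A + lam * np.eye(len(A))), lam
        except np.linalg.LinAlgError:
            lam = initial if lam == 0.0 else lam * 10.0
    raise np.linalg.LinAlgError("could not repair matrix")

# A rank-deficient positive semidefinite matrix: plain Cholesky fails on it.
A = np.array([[1.0, 1.0], [1.0, 1.0]])
L, lam = cholesky_with_jitter(A)
print(lam > 0)  # True: a nonzero jitter was needed
```

Note that the returned factor belongs to the perturbed matrix A + \lambda I, not to A itself, which is the trade-off the text describes.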

77.19 Comparison with LU and QR

| Decomposition | Main form | Applies to | Typical use |
| --- | --- | --- | --- |
| LU | PA = LU | General square matrices | Linear systems |
| Cholesky | A = LL^T | Symmetric positive definite matrices | Fast SPD solves |
| QR | A = QR | Rectangular or square matrices | Least squares, orthogonality |
| SVD | A = U\Sigma V^T | General matrices | Rank, conditioning, approximation |

Cholesky is the preferred factorization when the matrix is known to be symmetric positive definite. LU is more general. QR is more stable for many least squares problems. SVD is the most informative but usually more expensive.

77.20 Summary

Cholesky decomposition factors a real symmetric positive definite matrix as

A = LL^T.

For complex Hermitian positive definite matrices, the form is

A = LL^*.

The factor L is lower triangular with positive diagonal entries. The factorization is unique under that convention. It is used to solve linear systems, compute determinants, test positive definiteness, generate correlated random variables, and exploit the structure of covariance and Hessian matrices.

Cholesky is faster and more memory-efficient than general LU decomposition when its assumptions are satisfied. Its limitation is also its strength: it applies only to matrices with the symmetry and positivity needed to make the triangular square-root structure valid.