
Chapter 137. Compressed Sensing

137.1 Introduction

Compressed sensing is a method for recovering sparse or compressible signals from fewer linear measurements than ordinary sampling theory might suggest.

The central idea is simple. If a signal has only a small amount of essential information, then it may be possible to recover it from a small number of carefully chosen measurements. The recovery problem is underdetermined, but sparsity makes the problem solvable.

Compressed sensing studies systems of the form

y = Ax,

where:

| Symbol | Meaning |
| --- | --- |
| x | Unknown signal |
| A | Measurement or sensing matrix |
| y | Observed measurements |

Usually,

A \in \mathbb{R}^{m\times n}

with

m < n.

Thus there are fewer equations than unknowns. Without additional structure, the system has many solutions. Compressed sensing uses sparsity to select the correct one. This is why the subject belongs naturally to linear algebra, optimization, probability, and signal processing. Compressed sensing is commonly described as recovery from underdetermined linear systems by exploiting sparsity and incoherence.

137.2 Sparse Signals

A vector

x \in \mathbb{R}^n

is called k-sparse if it has at most k nonzero entries.

The number of nonzero entries is written

\|x\|_0.

Thus x is k-sparse if

\|x\|_0 \le k.

The notation \|\cdot\|_0 is common but slightly misleading. It is not a norm, since it is not homogeneous: scaling a vector leaves the count unchanged. It simply counts nonzero coordinates.

For example,

x = \begin{bmatrix} 0\\ 3\\ 0\\ 0\\ -2\\ 0 \end{bmatrix}

is 2-sparse.
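The sparsity count is easy to compute directly. A minimal NumPy sketch, using the vector from the example (the helper name `is_k_sparse` is illustrative, not from the text):

```python
import numpy as np

# The example vector: only two entries are nonzero.
x = np.array([0.0, 3.0, 0.0, 0.0, -2.0, 0.0])

# The "ell_0 norm": the number of nonzero coordinates.
l0 = np.count_nonzero(x)

# x is k-sparse when the count is at most k.
def is_k_sparse(v, k):
    return np.count_nonzero(v) <= k

print(l0)                 # 2
print(is_k_sparse(x, 2))  # True
print(is_k_sparse(x, 1))  # False
```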

Sparse vectors appear when most coordinates carry no information. In applications, many signals are not sparse in their original coordinates but become sparse after a change of basis.

137.3 Compressible Signals

A signal is compressible if it is well approximated by a sparse signal.

For example, a vector may have many nonzero entries, but most of them may be small. Keeping only the largest k entries gives a sparse approximation.

Let x_k denote the best k-sparse approximation to x. Then the approximation error is

\|x - x_k\|.

Compressed sensing is useful when this error is small.

Compressibility is important because real data is rarely exactly sparse. Images, audio, physical signals, and scientific data usually have many small coefficients rather than exactly zero coefficients.
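The best k-sparse approximation is obtained by keeping the k largest-magnitude entries. A minimal sketch (the vector is made up to illustrate a compressible signal):

```python
import numpy as np

def best_k_sparse(x, k):
    """Keep the k largest-magnitude entries of x, zero out the rest."""
    xk = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]   # indices of the k largest magnitudes
    xk[idx] = x[idx]
    return xk

# A compressible vector: a few large entries, many small ones.
x = np.array([5.0, -4.0, 0.3, 0.1, -0.2, 0.05])
xk = best_k_sparse(x, 2)

# The approximation error is small because the discarded entries are small.
err = np.linalg.norm(x - xk)
print(xk)   # [ 5. -4.  0.  0.  0.  0.]
print(err)
```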

137.4 Sparse Bases

A signal may be sparse in one basis and dense in another.

Let

\Psi

be a basis matrix. A signal x can be written as

x = \Psi s,

where s is the coefficient vector.

If s is sparse, then x is sparse in the basis \Psi.

Common sparsifying systems include:

| System | Typical signal type |
| --- | --- |
| Fourier basis | Periodic or oscillatory signals |
| Wavelet basis | Images and piecewise smooth signals |
| Discrete cosine basis | Image compression |
| Gradient basis | Piecewise constant signals |
| Learned dictionaries | Data-adaptive sparse models |

Thus compressed sensing often recovers sparse coefficients rather than raw signal coordinates.
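The change-of-basis picture can be illustrated numerically. A sketch using a random orthonormal basis standing in for, say, a wavelet basis (the sizes and support are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16

# An orthonormal basis Psi (random here, as a stand-in for a sparsifying basis).
Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))

# A 3-sparse coefficient vector s.
s = np.zeros(n)
s[[2, 7, 11]] = [1.0, -2.0, 0.5]

# The signal x = Psi s is dense in the standard coordinates...
x = Psi @ s
# ...but the coefficients s = Psi^T x reveal the sparsity.
s_back = Psi.T @ x

print(np.count_nonzero(np.abs(x) > 1e-12))      # typically n: dense
print(np.count_nonzero(np.abs(s_back) > 1e-9))  # 3: sparse in the basis
```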

137.5 Linear Measurements

Compressed sensing takes measurements of the form

y_i = \langle a_i, x\rangle,

where a_i is the i-th sensing vector.

Collecting all measurements gives

y = Ax.

Each row of A defines one linear measurement.

If m < n, the system is underdetermined. The null space of A has positive dimension, so many vectors satisfy the same measurement equation.

The goal is not to recover every vector. The goal is to recover sparse vectors.
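A quick numerical illustration of the underdetermined setting, with a random 4×8 matrix (sizes chosen only for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 8
A = rng.standard_normal((m, n))

# The null space has dimension n - rank(A) > 0.
rank = np.linalg.matrix_rank(A)
print(n - rank)  # 4

# Any null-space vector h is invisible to the measurements:
# x and x + h produce exactly the same y.
x = rng.standard_normal(n)
_, _, Vt = np.linalg.svd(A)
h = Vt[-1]                              # a basis vector of the null space
print(np.allclose(A @ x, A @ (x + h)))  # True
```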

137.6 The Recovery Problem

The ideal sparse recovery problem is:

\min_z \|z\|_0

subject to

Az = y.

This asks for the sparsest vector consistent with the measurements.

However, \ell^0-minimization is computationally difficult in general. The problem is combinatorial because it searches over possible support sets.

A common relaxation replaces \|z\|_0 with the \ell^1-norm:

\min_z \|z\|_1

subject to

Az = y.

Here

\|z\|_1 = |z_1| + \cdots + |z_n|.

The \ell^1-problem is convex and can be solved by standard optimization methods. The use of \ell^1-relaxation is a central technique in compressed sensing.

137.7 Basis Pursuit

The problem

\min_z \|z\|_1 \quad \text{subject to} \quad Az = y

is called basis pursuit.

It is the standard noiseless recovery method.

The geometric reason basis pursuit works is that the \ell^1-ball has corners aligned with the coordinate axes. Sparse vectors lie on these corners. When the affine space

\{z : Az = y\}

intersects the \ell^1-ball, the first point of contact is often sparse.

This is a linear-algebraic and convex-geometric phenomenon.
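Basis pursuit can be posed as a linear program by introducing auxiliary variables t with -t \le z \le t and minimizing \sum t_i. A minimal sketch using `scipy.optimize.linprog` (the sizes, seed, and ground truth are illustrative assumptions, not from the text):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 12, 24

A = rng.standard_normal((m, n))
x = np.zeros(n)
x[[3, 17]] = [2.0, -1.5]        # a 2-sparse ground truth
y = A @ x

# Variables w = [z, t]; minimize sum(t) subject to -t <= z <= t and Az = y.
c = np.concatenate([np.zeros(n), np.ones(n)])
A_ub = np.block([[np.eye(n), -np.eye(n)],
                 [-np.eye(n), -np.eye(n)]])
b_ub = np.zeros(2 * n)
A_eq = np.hstack([A, np.zeros((m, n))])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * n + [(0, None)] * n)
z_hat = res.x[:n]

# z_hat fits the measurements, and its l1 norm cannot exceed ||x||_1,
# because x itself is feasible for the program.
print(np.linalg.norm(A @ z_hat - y))
print(np.abs(z_hat).sum(), np.abs(x).sum())
```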

137.8 Noisy Measurements

In practice, measurements contain noise:

y = Ax + e,

where e is an error vector.

The recovery problem becomes:

\min_z \|z\|_1

subject to

\|Az - y\|_2 \le \varepsilon.

This is called basis pursuit denoising.

The constraint allows the reconstructed signal to fit the measurements up to the expected noise level.

Compressed sensing theory distinguishes two properties:

| Property | Meaning |
| --- | --- |
| Exact recovery | Perfect recovery in the noiseless sparse case |
| Stable recovery | Small noise gives small reconstruction error |

Stable recovery is essential for real applications. Standard compressed sensing theory treats both noisy measurements and signals that are only approximately sparse.

137.9 The Null Space View

The null space of A determines whether sparse recovery is possible.

Suppose

Ax = y.

If h \in \ker A, then

A(x + h) = Ax.

Thus x and x + h produce the same measurements.

For recovery to succeed, no nonzero vector in the null space should allow one sparse vector to be confused with another sparse vector.

The null space property gives a necessary and sufficient condition for exact sparse recovery by \ell^1-minimization. It is important theoretically, although it may be difficult to verify directly.

137.10 Restricted Isometry Property

The restricted isometry property, or RIP, is one of the main sufficient conditions for compressed sensing.

A matrix A satisfies the restricted isometry property of order k if there exists a number \delta_k \in (0,1) such that

(1 - \delta_k)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_k)\|x\|_2^2

for every k-sparse vector x.

This says that A approximately preserves the lengths of all k-sparse vectors.

If A preserves distances between sparse vectors, then sparse vectors remain distinguishable after measurement.

RIP connects compressed sensing with geometry. The sensing matrix acts almost like an isometry on sparse subspaces.
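For small sizes, the best restricted isometry constant can be estimated by brute force: for every support of size k, the extreme singular values of the corresponding column submatrix bound \|Ax\|_2^2 / \|x\|_2^2. A sketch, with sizes chosen small enough to enumerate all supports (the scaling 1/\sqrt{m} normalizes column norms in expectation):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 10, 15, 2

# Gaussian sensing matrix, scaled so columns have unit norm in expectation.
A = rng.standard_normal((m, n)) / np.sqrt(m)

delta = 0.0
for S in itertools.combinations(range(n), k):
    s = np.linalg.svd(A[:, list(S)], compute_uv=False)
    # On this support, ||A_S x||^2 / ||x||^2 lies in [s_min^2, s_max^2].
    delta = max(delta, s[0] ** 2 - 1.0, 1.0 - s[-1] ** 2)

# The smallest delta_k making the RIP inequalities hold for this matrix
# (a poor matrix can give a value of 1 or more).
print(delta)
```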

137.11 Incoherence

Incoherence measures how poorly aligned the measurement system is with the sparsifying system.

If the signal is sparse in a basis \Psi, and measurements are taken in another system \Phi, then compressed sensing works best when \Phi and \Psi are incoherent.

Informally, each measurement should see many coordinates of the sparse representation rather than only one.

Random measurements are often incoherent with fixed sparse bases.

This explains why random sensing matrices appear frequently in compressed sensing theory.
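Incoherence can be quantified. A standard measure for orthonormal \Phi and \Psi is the mutual coherence \mu(\Phi, \Psi) = \sqrt{n} \max_{i,j} |\langle \varphi_i, \psi_j \rangle|, which ranges from 1 (maximally incoherent) to \sqrt{n}. The spike basis and a normalized Hadamard basis attain the minimum, as a quick check shows:

```python
import numpy as np

# Sylvester construction of an 8x8 Hadamard matrix.
H = np.array([[1.0]])
for _ in range(3):
    H = np.block([[H, H], [H, -H]])
n = H.shape[0]

Phi = np.eye(n)          # spike (identity) basis
Psi = H / np.sqrt(n)     # orthonormal Hadamard basis

# Mutual coherence: sqrt(n) times the largest inner product between basis vectors.
mu = np.sqrt(n) * np.max(np.abs(Phi.T @ Psi))
print(mu)  # 1.0 -- maximally incoherent
```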

137.12 Random Sensing Matrices

Random matrices are common sensing matrices.

Examples include:

| Matrix type | Entries |
| --- | --- |
| Gaussian | Independent normal entries |
| Bernoulli | Independent \pm 1 entries |
| Partial Fourier | Randomly selected Fourier rows |
| Subsampled orthogonal | Random rows of orthogonal transforms |

Random matrices often satisfy RIP with high probability when

m

is proportional to

k\log(n/k).

This means that the number of measurements depends mainly on the sparsity k, not the ambient dimension n.

Many random designs give near-optimal sample complexity for sparse recovery.

137.13 Measurement Complexity

A guiding estimate in compressed sensing is:

m \gtrsim k\log(n/k).

Here:

| Symbol | Meaning |
| --- | --- |
| m | Number of measurements |
| k | Sparsity level |
| n | Ambient dimension |

The logarithmic factor appears because the support of a sparse vector is unknown.

There are

\binom{n}{k}

possible support sets.

The measurements must identify both the values on the support and the support itself.

This estimate explains the power of compressed sensing. When k \ll n, far fewer than n measurements may be enough.
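The numbers make the point concrete. With n = 1000 and k = 10, the count of possible supports is astronomical while k\log(n/k) is tiny; a quick check (the specific n and k are illustrative):

```python
import math

n, k = 1000, 10

supports = math.comb(n, k)      # number of possible support sets
estimate = k * math.log(n / k)  # guiding measurement count, up to a constant

print(supports)         # about 2.6e23 possible supports
print(round(estimate))  # about 46 -- far fewer than n = 1000 measurements
```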

137.14 Support Recovery

The support of a vector is the set of indices where it is nonzero:

\operatorname{supp}(x) = \{i : x_i \neq 0\}.

Recovering a sparse vector involves two tasks:

  1. Identify the support,
  2. Estimate the nonzero values.

If the support were known, recovery would reduce to an ordinary least squares problem using the corresponding columns of A.

The difficulty is that the support is unknown.

Many algorithms can be understood as different strategies for finding or approximating the support.
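The known-support case reduces to least squares on the selected columns, as a minimal sketch shows (the support S and values are made up for the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 20
A = rng.standard_normal((m, n))

# Ground truth: 3-sparse with a known support S.
S = [2, 9, 14]
x = np.zeros(n)
x[S] = [1.0, -2.0, 0.5]
y = A @ x

# With the support known, solve least squares on the columns A[:, S].
coef, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)
x_hat = np.zeros(n)
x_hat[S] = coef

print(np.allclose(x_hat, x))  # True: exact recovery when the support is known
```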

137.15 Greedy Algorithms

Greedy algorithms recover sparse signals by selecting columns of A step by step.

Important examples include:

| Algorithm | Main idea |
| --- | --- |
| Matching Pursuit | Select one correlated atom at a time |
| Orthogonal Matching Pursuit | Refit after each selected atom |
| CoSaMP | Select multiple candidates and prune |
| Iterative Hard Thresholding | Gradient step followed by sparsification |

Greedy methods are often faster than convex optimization methods.

They are useful when large-scale computation is more important than solving a convex program exactly.

137.16 Orthogonal Matching Pursuit

Orthogonal Matching Pursuit, or OMP, builds a support set iteratively.

At each step:

  1. Compute the residual r = y - Ax_k.
  2. Find the column of A most correlated with r.
  3. Add that column to the active set.
  4. Solve a least squares problem on the active set.
  5. Update the residual.

The algorithm stops when the residual is small or when a desired sparsity level is reached.

OMP is simple, interpretable, and often effective.
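The steps above translate almost directly into code. A minimal NumPy sketch, with no stopping rule beyond a fixed sparsity level k (the test instance is an illustrative assumption):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily select k columns of A."""
    m, n = A.shape
    support = []
    residual = y.copy()
    x_hat = np.zeros(n)
    for _ in range(k):
        # Steps 1-2: column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        correlations[support] = 0.0          # do not reselect an atom
        j = int(np.argmax(correlations))
        support.append(j)                    # Step 3: grow the active set.
        # Step 4: least squares on the active set.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x_hat = np.zeros(n)
        x_hat[support] = coef
        # Step 5: update the residual.
        residual = y - A @ x_hat
    return x_hat, support

rng = np.random.default_rng(0)
m, n, k = 15, 30, 3
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)               # unit-norm columns
x = np.zeros(n)
x[[4, 11, 25]] = [1.5, -1.0, 2.0]
y = A @ x

x_hat, support = omp(A, y, k)
print(sorted(support))
print(np.linalg.norm(y - A @ x_hat))         # small if the support was found
```

On well-conditioned instances like this one, OMP typically identifies the true support; in general its success depends on the coherence of A.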

137.17 Lasso

The Lasso solves the penalized problem

\min_z \frac{1}{2}\|Az - y\|_2^2 + \lambda\|z\|_1.

The first term measures data fit. The second term promotes sparsity.

The parameter

\lambda > 0

controls the tradeoff.

Large \lambda produces sparser solutions. Small \lambda fits the data more closely.

Lasso is widely used in statistics and machine learning.

It is closely related to basis pursuit denoising.
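One simple way to solve the Lasso is proximal gradient descent (ISTA): a gradient step on the quadratic term followed by soft thresholding at level \lambda/L. A minimal sketch, not a production solver (the instance and iteration count are illustrative):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||Az - y||^2 + lam * ||z||_1 by ISTA."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    z = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        z = soft_threshold(z - grad / L, lam / L)
    return z

rng = np.random.default_rng(0)
m, n = 20, 40
A = rng.standard_normal((m, n))
x = np.zeros(n)
x[[5, 18]] = [3.0, -2.0]
y = A @ x

z = lasso_ista(A, y, lam=0.1)

def objective(v):
    return 0.5 * np.linalg.norm(A @ v - y) ** 2 + 0.1 * np.abs(v).sum()

# ISTA with step 1/L is a descent method, so the objective only decreases.
print(objective(z) <= objective(np.zeros(n)))  # True
```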

137.18 Geometry of \ell^1-Minimization

The \ell^1-norm promotes sparsity because its unit ball has sharp corners.

In two dimensions, the \ell^1-unit ball is a diamond:

|x_1| + |x_2| \le 1.

In higher dimensions, the corners correspond to sparse coordinate patterns.

When minimizing the \ell^1-norm over an affine space, the solution often occurs at a low-dimensional face of the \ell^1-ball. Such faces correspond to sparse vectors.

This geometric fact explains why a convex norm can recover sparse solutions.

137.19 Dictionaries

A dictionary is a collection of atoms used to represent signals.

Let

D \in \mathbb{R}^{n\times p}

be a dictionary matrix. A signal is represented as

x = D\alpha.

The coefficient vector \alpha may be sparse even when D has more columns than rows.

If

p > n,

the dictionary is overcomplete.

Compressed sensing with dictionaries solves

y = AD\alpha

and attempts to recover a sparse \alpha.

Dictionaries may be fixed, such as wavelets, or learned from data.

137.20 Applications

Compressed sensing appears in many areas.

| Application | Role of compressed sensing |
| --- | --- |
| MRI | Fewer measurements during acquisition |
| Imaging | Reconstruction from incomplete samples |
| Radar | Sparse target recovery |
| Seismology | Sparse reflection recovery |
| Communications | Channel estimation |
| Astronomy | Image reconstruction |
| Machine learning | Sparse feature selection |
| Sensor networks | Reduced measurement and transmission |

MRI is one of the standard application areas, since many medical images are sparse or compressible in transform domains and the measurements are naturally related to Fourier samples.

137.21 Summary

Compressed sensing studies recovery of sparse signals from undersampled linear measurements.

The main concepts are:

| Concept | Meaning |
| --- | --- |
| Sparse signal | Vector with few nonzero entries |
| Compressible signal | Well approximated by a sparse vector |
| Sensing matrix | Matrix defining linear measurements |
| Basis pursuit | \ell^1-minimization for noiseless recovery |
| Basis pursuit denoising | Recovery with measurement noise |
| Null space property | Exact recovery condition |
| Restricted isometry property | Approximate isometry on sparse vectors |
| Incoherence | Low alignment between sensing and sparsifying systems |
| Random sensing | Random measurements for uniform recovery |
| Greedy recovery | Iterative support selection |
| Lasso | Penalized sparse recovery |
| Dictionary | Sparse representation system |

Compressed sensing shows that linear systems with fewer equations than unknowns can still be solvable when the unknown vector has sparse structure. It combines linear algebra, convex geometry, probability, and optimization into a theory of efficient acquisition and reconstruction.