138.1 Introduction
Tensor decompositions generalize matrix decompositions to multidimensional arrays.
A matrix has two modes: rows and columns. A tensor may have three, four, or more modes. For example, an image dataset may have modes for height, width, color channel, and image index. A recommender system may have modes for user, item, and time. A knowledge graph may have modes for subject, relation, and object.
Tensor decompositions express a large tensor using smaller factors. They are used for compression, denoising, latent factor modeling, feature extraction, and multidimensional data analysis. The two classical models are CP decomposition and Tucker decomposition, often viewed as higher-order analogues of matrix factorization methods such as SVD and PCA.
138.2 Tensors
A tensor is a multidimensional array.
A vector is a first-order tensor:

$$\mathbf{v} \in \mathbb{R}^{I}, \qquad \text{entries } v_i.$$

A matrix is a second-order tensor:

$$\mathbf{A} \in \mathbb{R}^{I \times J}, \qquad \text{entries } a_{ij}.$$

A third-order tensor has entries

$$x_{ijk}, \qquad \mathcal{X} \in \mathbb{R}^{I \times J \times K}.$$

More generally, an $N$-th order tensor has entries

$$x_{i_1 i_2 \cdots i_N}, \qquad \mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}.$$
The number of indices is called the order of the tensor. Each index direction is called a mode.
For example, if

$$\mathcal{X} \in \mathbb{R}^{I \times J \times K},$$

then $\mathcal{X}$ is a third-order tensor with three modes of sizes $I$, $J$, and $K$.
138.3 Why Tensors Are Different from Matrices
Matrices already support many powerful decompositions:
| Matrix decomposition | Purpose |
|---|---|
| LU | Solving linear systems |
| QR | Orthogonalization and least squares |
| Eigendecomposition | Spectral analysis |
| SVD | Low-rank approximation |
Tensors do not inherit all matrix properties directly.
For matrices, rank is well behaved. Every matrix has a best rank-$r$ approximation under the Frobenius norm, given by the truncated SVD.
For tensors, rank is more complicated. A best low-rank approximation may fail to exist in some models. Rank can be hard to compute. Decompositions may be unique in some cases and ill-conditioned in others.
Thus tensor decompositions are not merely matrix decompositions with more indices. They form a distinct theory.
138.4 Fibers and Slices
A fiber is obtained by fixing all tensor indices except one.
For a third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ with entries $x_{ijk}$, examples include:
| Fiber type | Fixed indices | Varying index |
|---|---|---|
| Mode-1 fiber $\mathbf{x}_{:jk}$ | $j, k$ fixed | $i$ |
| Mode-2 fiber $\mathbf{x}_{i:k}$ | $i, k$ fixed | $j$ |
| Mode-3 fiber $\mathbf{x}_{ij:}$ | $i, j$ fixed | $k$ |
A slice is obtained by fixing one index.
For a third-order tensor:
| Slice type | Fixed index | Result |
|---|---|---|
| Frontal slice $\mathbf{X}_{::k}$ | $k$ fixed | Matrix in $\mathbb{R}^{I \times J}$ |
| Lateral slice $\mathbf{X}_{:j:}$ | $j$ fixed | Matrix in $\mathbb{R}^{I \times K}$ |
| Horizontal slice $\mathbf{X}_{i::}$ | $i$ fixed | Matrix in $\mathbb{R}^{J \times K}$ |
Fibers generalize rows and columns. Slices generalize matrix subarrays.
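As a concrete illustration, here is a minimal NumPy sketch of fibers and slices of a small third-order tensor; the array sizes are arbitrary and chosen only for the example.

```python
import numpy as np

# A small 3 x 4 x 5 tensor for illustration.
X = np.arange(3 * 4 * 5, dtype=float).reshape(3, 4, 5)

# Fibers: fix all indices except one.
mode1_fiber = X[:, 1, 2]   # x_{:jk}, length 3
mode2_fiber = X[0, :, 2]   # x_{i:k}, length 4
mode3_fiber = X[0, 1, :]   # x_{ij:}, length 5

# Slices: fix one index.
horizontal_slice = X[0, :, :]  # X_{i::}, shape (4, 5)
lateral_slice    = X[:, 1, :]  # X_{:j:}, shape (3, 5)
frontal_slice    = X[:, :, 2]  # X_{::k}, shape (3, 4)
```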
138.5 Tensor Unfolding
Tensor unfolding, also called matricization or flattening, converts a tensor into a matrix.
For a third-order tensor

$$\mathcal{X} \in \mathbb{R}^{I \times J \times K},$$

the mode-1 unfolding is a matrix

$$\mathbf{X}_{(1)} \in \mathbb{R}^{I \times JK}.$$

It arranges all mode-1 fibers as columns.

Similarly,

$$\mathbf{X}_{(2)} \in \mathbb{R}^{J \times IK}$$

and

$$\mathbf{X}_{(3)} \in \mathbb{R}^{K \times IJ}.$$
Unfolding is important because many tensor algorithms reduce parts of the problem to matrix operations. It allows SVD, least squares, and rank computations to be applied mode by mode.
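A minimal NumPy sketch of one common unfolding convention (move the chosen mode to the front and flatten the remaining modes into columns; other column orderings are also in use). The helper name `unfold` is illustrative, not taken from any particular library.

```python
import numpy as np

def unfold(X, mode):
    """Mode-`mode` unfolding: the chosen mode becomes the rows,
    the remaining modes are flattened into columns (NumPy C order)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

X = np.random.rand(3, 4, 5)
X1 = unfold(X, 0)  # shape (3, 20)
X2 = unfold(X, 1)  # shape (4, 15)
X3 = unfold(X, 2)  # shape (5, 12)
```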
138.6 Mode Products
The mode-$n$ product multiplies a tensor by a matrix along one mode.

Let

$$\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$$

and let

$$\mathbf{U} \in \mathbb{R}^{J \times I_n}.$$

The mode-$n$ product is written

$$\mathcal{Y} = \mathcal{X} \times_n \mathbf{U}.$$

The resulting tensor has size

$$I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N.$$

In coordinates,

$$y_{i_1 \cdots i_{n-1} \, j \, i_{n+1} \cdots i_N} = \sum_{i_n = 1}^{I_n} x_{i_1 i_2 \cdots i_N} \, u_{j i_n}.$$
Mode products are the basic algebraic operation behind Tucker decompositions and multilinear transformations.
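A short sketch of the mode-$n$ product, implemented through the unfolding identity $\mathbf{Y}_{(n)} = \mathbf{U}\,\mathbf{X}_{(n)}$; the helpers `unfold`, `fold`, and `mode_product` are illustrative names, not library functions.

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of `unfold` for a tensor of the given full shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def mode_product(X, U, mode):
    """Mode-n product X x_n U, computed as U @ X_(n) and folded back."""
    new_shape = list(X.shape)
    new_shape[mode] = U.shape[0]
    return fold(U @ unfold(X, mode), mode, new_shape)

X = np.random.rand(3, 4, 5)
U = np.random.rand(7, 4)    # acts along mode 1 (size 4)
Y = mode_product(X, U, 1)   # shape (3, 7, 5)
```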
138.7 Rank-One Tensors
A rank-one tensor is an outer product of vectors.
For vectors

$$\mathbf{a} \in \mathbb{R}^{I}, \quad \mathbf{b} \in \mathbb{R}^{J}, \quad \mathbf{c} \in \mathbb{R}^{K},$$

their outer product is the tensor

$$\mathcal{X} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c}$$

with entries

$$x_{ijk} = a_i \, b_j \, c_k.$$

This generalizes the rank-one matrix

$$\mathbf{A} = \mathbf{a}\mathbf{b}^{\top}, \qquad a_{ij} = a_i \, b_j.$$
Rank-one tensors are the building blocks of CP decomposition.
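For instance, a rank-one tensor can be formed as an `einsum` outer product, and each of its unfoldings then has matrix rank one:

```python
import numpy as np

a, b, c = np.random.rand(3), np.random.rand(4), np.random.rand(5)

# Rank-one tensor a ∘ b ∘ c with entries x_{ijk} = a_i b_j c_k.
X = np.einsum('i,j,k->ijk', a, b, c)

# Every unfolding of a rank-one tensor is a rank-one matrix.
assert np.linalg.matrix_rank(X.reshape(3, -1)) == 1
```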
138.8 CP Decomposition
The CP decomposition writes a tensor as a sum of rank-one tensors.
For a third-order tensor,

$$\mathcal{X} \approx \sum_{r=1}^{R} \lambda_r \, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r.$$
Here:
| Symbol | Meaning |
|---|---|
| $R$ | Number of components |
| $\lambda_r$ | Component weight |
| $\mathbf{a}_r, \mathbf{b}_r, \mathbf{c}_r$ | Factor vectors |
The CP model is also called CANDECOMP/PARAFAC.
In coordinates,

$$x_{ijk} \approx \sum_{r=1}^{R} \lambda_r \, a_{ir} \, b_{jr} \, c_{kr}.$$

The factor vectors are often collected into factor matrices:

$$\mathbf{A} = [\mathbf{a}_1, \ldots, \mathbf{a}_R] \in \mathbb{R}^{I \times R}, \quad
\mathbf{B} = [\mathbf{b}_1, \ldots, \mathbf{b}_R] \in \mathbb{R}^{J \times R}, \quad
\mathbf{C} = [\mathbf{c}_1, \ldots, \mathbf{c}_R] \in \mathbb{R}^{K \times R}.$$
CP decomposition is useful when one wants a small number of latent components that jointly explain all modes.
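A small sketch of evaluating the CP model from hypothetical weights and factor matrices $(\boldsymbol{\lambda}, \mathbf{A}, \mathbf{B}, \mathbf{C})$ with `einsum`; the sizes are arbitrary.

```python
import numpy as np

# Hypothetical factor matrices A (I x R), B (J x R), C (K x R) and weights lam (R,).
I, J, K, R = 3, 4, 5, 2
A, B, C = np.random.rand(I, R), np.random.rand(J, R), np.random.rand(K, R)
lam = np.random.rand(R)

# CP model: x_{ijk} = sum_r lam_r * a_{ir} * b_{jr} * c_{kr}.
X = np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)
```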
138.9 Tensor Rank
The CP rank of a tensor $\mathcal{X}$ is the smallest number $R$ such that

$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r.$$
This generalizes matrix rank.
However, tensor rank is much more difficult than matrix rank.
For matrices, rank can be computed by Gaussian elimination or SVD. For tensors, determining rank is generally hard. Tensor rank also depends more subtly on the scalar field. A tensor may have different rank over $\mathbb{R}$ and over $\mathbb{C}$.
The complexity of tensor rank is one reason tensor decomposition theory is technically difficult.
138.10 CP Uniqueness
CP decomposition can be unique under conditions where matrix factorization would not be.
This is one of the attractive features of CP models. A matrix factorization

$$\mathbf{X} = \mathbf{A}\mathbf{B}^{\top}$$

has a rotational ambiguity:

$$\mathbf{X} = (\mathbf{A}\mathbf{M})(\mathbf{B}\mathbf{M}^{-\top})^{\top}$$

for any invertible matrix $\mathbf{M}$.
CP decompositions can avoid this ambiguity because three or more modes constrain the components more strongly.
Uniqueness is usually meant up to:
- Permutation of components,
- Rescaling of factors within each component.
For example, by Kruskal's theorem, a rank-$R$ CP decomposition with factor matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ is essentially unique whenever

$$k_{\mathbf{A}} + k_{\mathbf{B}} + k_{\mathbf{C}} \ge 2R + 2,$$

where $k_{\mathbf{A}}$ denotes the Kruskal rank of $\mathbf{A}$, the largest $k$ such that every set of $k$ columns of $\mathbf{A}$ is linearly independent.
138.11 Tucker Decomposition
The Tucker decomposition writes a tensor as a smaller core tensor multiplied by factor matrices along each mode.
For a third-order tensor,

$$\mathcal{X} \approx \mathcal{G} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C}.$$
Here:
| Symbol | Meaning |
|---|---|
| $\mathcal{G} \in \mathbb{R}^{P \times Q \times S}$ | Core tensor |
| $\mathbf{A} \in \mathbb{R}^{I \times P}, \; \mathbf{B} \in \mathbb{R}^{J \times Q}, \; \mathbf{C} \in \mathbb{R}^{K \times S}$ | Factor matrices |
| $\times_n$ | Mode-$n$ product |
In coordinates,

$$x_{ijk} \approx \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{s=1}^{S} g_{pqs} \, a_{ip} \, b_{jq} \, c_{ks}.$$
The Tucker model separates mode-specific bases from the interactions among components. The core tensor records how components from different modes interact. Tucker decomposition is commonly described as a multilinear generalization of PCA, with a core tensor and factor matrices for the modes.
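Analogously to the CP sketch above, the Tucker model can be evaluated directly from a hypothetical core and factor matrices:

```python
import numpy as np

# Hypothetical core G (P x Q x S) and factor matrices A (I x P), B (J x Q), C (K x S).
I, J, K = 6, 7, 8
P, Q, S = 2, 3, 4
G = np.random.rand(P, Q, S)
A, B, C = np.random.rand(I, P), np.random.rand(J, Q), np.random.rand(K, S)

# Tucker model: x_{ijk} = sum_{pqs} g_{pqs} a_{ip} b_{jq} c_{ks}.
X = np.einsum('pqs,ip,jq,ks->ijk', G, A, B, C)
```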
138.12 Tucker Rank
The Tucker rank of a tensor is the tuple of ranks of its unfoldings.
For a third-order tensor,

$$\operatorname{rank}_{\mathrm{Tucker}}(\mathcal{X}) = (R_1, R_2, R_3),$$

where

$$R_n = \operatorname{rank}\bigl(\mathbf{X}_{(n)}\bigr), \qquad n = 1, 2, 3.$$
Unlike CP rank, Tucker rank is a tuple rather than a single number.
This reflects the fact that each mode may have a different intrinsic dimension.
For example, an image dataset may require many spatial components but fewer lighting or viewpoint components.
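Numerically, the multilinear rank can be read off from the ranks of the unfoldings, as in this sketch (the `unfold` helper follows the convention used earlier):

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

X = np.random.rand(6, 7, 8)
multilinear_rank = tuple(np.linalg.matrix_rank(unfold(X, n)) for n in range(X.ndim))
print(multilinear_rank)  # (6, 7, 8) for a generic random tensor
```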
138.13 CP versus Tucker
CP and Tucker decompositions serve different purposes.
| Feature | CP decomposition | Tucker decomposition |
|---|---|---|
| Basic form | Sum of rank-one tensors | Core tensor times factor matrices |
| Rank parameter | Single integer | Tuple |
| Core tensor | Diagonal or implicit | Full smaller tensor |
| Interpretability | Components often directly interpretable | Factors plus interactions |
| Flexibility | More constrained | More flexible |
| Compression | Strong when CP rank is small | Strong when multilinear rank is small |
CP is often preferred when one wants identifiable latent components. Tucker is often preferred when one wants compression, dimensional reduction, or mode-wise bases.
138.14 Higher-Order SVD
The higher-order singular value decomposition, or HOSVD, is a Tucker-type decomposition computed from SVDs of tensor unfoldings.
For each mode $n$, compute an SVD of the unfolding

$$\mathbf{X}_{(n)} = \mathbf{U}^{(n)} \boldsymbol{\Sigma}^{(n)} \mathbf{V}^{(n)\top}.$$

Use the leading left singular vectors as the columns of the factor matrix $\mathbf{U}^{(n)}$, truncated to the desired rank $R_n$.

Then define the core by projecting the tensor onto these mode bases:

$$\mathcal{G} = \mathcal{X} \times_1 \mathbf{U}^{(1)\top} \times_2 \mathbf{U}^{(2)\top} \times_3 \mathbf{U}^{(3)\top}.$$

The approximation is then

$$\hat{\mathcal{X}} = \mathcal{G} \times_1 \mathbf{U}^{(1)} \times_2 \mathbf{U}^{(2)} \times_3 \mathbf{U}^{(3)}.$$
HOSVD is a direct and stable way to compute Tucker approximations.
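A compact NumPy sketch of a truncated HOSVD under the unfolding convention above; the function names are illustrative, and production code would typically use an established tensor library instead.

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_product(X, U, mode):
    return np.moveaxis(np.tensordot(U, X, axes=(1, mode)), 0, mode)

def hosvd(X, ranks):
    """Truncated HOSVD: factor matrices from SVDs of the unfoldings,
    core from projecting X onto the leading left singular vectors."""
    factors = []
    for n, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, n), full_matrices=False)
        factors.append(U[:, :r])
    G = X
    for n, Un in enumerate(factors):
        G = mode_product(G, Un.T, n)
    return G, factors

X = np.random.rand(10, 12, 14)
G, factors = hosvd(X, ranks=(3, 4, 5))

# Reconstruct the rank-(3, 4, 5) approximation and check the relative error.
X_hat = G
for n, Un in enumerate(factors):
    X_hat = mode_product(X_hat, Un, n)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```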
138.15 Higher-Order Orthogonal Iteration
Higher-order orthogonal iteration, or HOOI, improves a Tucker approximation iteratively.
The method alternates over modes. For each mode, it fixes all other factor matrices and updates the current factor matrix using an SVD of a projected unfolding.
This is analogous to alternating least squares.
HOOI usually starts from HOSVD and then refines the factor matrices.
It is often more accurate than plain HOSVD for a fixed multilinear rank, but it is iterative and may converge to a local optimum.
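A corresponding HOOI sketch, initialized from the truncated HOSVD; the helper functions repeat those from the HOSVD sketch, and the fixed iteration count stands in for a proper convergence test.

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_product(X, U, mode):
    return np.moveaxis(np.tensordot(U, X, axes=(1, mode)), 0, mode)

def hooi(X, ranks, n_iter=10):
    """Higher-order orthogonal iteration, initialized from truncated HOSVD."""
    factors = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
               for n, r in enumerate(ranks)]
    for _ in range(n_iter):
        for n in range(X.ndim):
            # Project X onto all current bases except mode n ...
            Y = X
            for m in range(X.ndim):
                if m != n:
                    Y = mode_product(Y, factors[m].T, m)
            # ... and update the mode-n basis from the projected unfolding.
            factors[n] = np.linalg.svd(unfold(Y, n),
                                       full_matrices=False)[0][:, :ranks[n]]
    core = X
    for n in range(X.ndim):
        core = mode_product(core, factors[n].T, n)
    return core, factors

X = np.random.rand(10, 12, 14)
core, factors = hooi(X, ranks=(3, 4, 5))
```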
138.16 Alternating Least Squares
Alternating least squares, or ALS, is a standard method for computing tensor decompositions.
For CP decomposition, ALS fixes all factor matrices except one, then solves a least squares problem for the remaining factor matrix. It cycles through the modes until convergence.
For a third-order CP model:
- Fix $\mathbf{B}$ and $\mathbf{C}$, solve for $\mathbf{A}$,
- Fix $\mathbf{A}$ and $\mathbf{C}$, solve for $\mathbf{B}$,
- Fix $\mathbf{A}$ and $\mathbf{B}$, solve for $\mathbf{C}$,
- Repeat.
ALS is simple and widely used, but it can be slow, sensitive to initialization, and affected by degeneracy.
Recent algorithms also use randomized sketching to reduce the cost of least squares subproblems for large tensors.
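A bare-bones CP-ALS sketch for a third-order tensor, without the weight normalization, convergence checks, or regularization a practical implementation would include; `khatri_rao` here matches the unfolding convention used earlier in this chapter.

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(U, V):
    """Column-wise Kronecker product of U (m x R) and V (n x R) -> (mn x R)."""
    R = U.shape[1]
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, R)

def cp_als(X, R, n_iter=50, seed=0):
    """Plain CP-ALS for a third-order tensor (no weights, no normalization)."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((s, R)) for s in X.shape)
    for _ in range(n_iter):
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Fit a rank-2 model to a tensor that is exactly rank 2.
A0, B0, C0 = np.random.rand(5, 2), np.random.rand(6, 2), np.random.rand(7, 2)
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, R=2)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # small if ALS converges
```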
138.17 Tensor Train Decomposition
Tensor train decomposition represents a high-order tensor as a chain of smaller core tensors.
For an $N$-th order tensor,

$$x_{i_1 i_2 \cdots i_N} = \mathbf{G}_1(i_1) \, \mathbf{G}_2(i_2) \cdots \mathbf{G}_N(i_N),$$

where each $\mathbf{G}_n(i_n)$ is a matrix and the product is matrix multiplication over hidden rank indices.

More explicitly, each core $\mathcal{G}_n$ has size

$$r_{n-1} \times I_n \times r_n,$$

with boundary ranks

$$r_0 = r_N = 1.$$
Tensor trains are useful for very high-order tensors, where CP and Tucker may become too large.
Tensor train methods are implemented in modern tensor libraries, including SVD-based and cross-approximation variants.
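A sketch of the SVD-based construction (often called TT-SVD), using a single maximum rank for simplicity; the function names are illustrative rather than any library's API.

```python
import numpy as np

def tt_svd(X, max_rank):
    """TT-SVD sketch: peel off one mode at a time with truncated SVDs.
    Returns cores G_n of shape (r_{n-1}, I_n, r_n) with r_0 = r_N = 1."""
    dims = X.shape
    cores, r_prev = [], 1
    C = X.copy()
    for n in range(len(dims) - 1):
        C = C.reshape(r_prev * dims[n], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[n], r))
        C = s[:r, None] * Vt[:r]          # carry the remainder forward
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    X = cores[0]
    for G in cores[1:]:
        X = np.tensordot(X, G, axes=(-1, 0))
    return X.reshape([G.shape[1] for G in cores])

X = np.random.rand(4, 5, 6, 7)
cores = tt_svd(X, max_rank=20)
X_hat = tt_reconstruct(cores)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # ~0 when no rank is truncated
```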
138.18 Hierarchical Tucker Decomposition
Hierarchical Tucker decomposition organizes modes in a tree.
Instead of using one large core tensor as in Tucker decomposition, it recursively groups modes and represents interactions through smaller transfer tensors.
This avoids the exponential growth of the Tucker core in high order.
Hierarchical tensor formats are important in scientific computing, numerical PDEs, quantum many-body simulation, and high-dimensional approximation.
They trade a single global core for a hierarchy of smaller cores.
138.19 Nonnegative Tensor Factorization
Nonnegative tensor factorization imposes nonnegativity constraints on the factors.
If

$$\mathcal{X} \ge 0 \quad \text{entrywise},$$

one may seek a CP decomposition with

$$\lambda_r \ge 0, \qquad \mathbf{a}_r \ge 0, \quad \mathbf{b}_r \ge 0, \quad \mathbf{c}_r \ge 0.$$
This is useful when the data represents counts, intensities, probabilities, or other nonnegative quantities.
Examples include:
| Data type | Interpretation |
|---|---|
| Documents by words by time | Topics over time |
| Users by items by contexts | Recommender patterns |
| Images by pixels by conditions | Parts-based features |
| Brain signals | Nonnegative activations |
Nonnegativity often improves interpretability.
138.20 Symmetric Tensor Decomposition
A symmetric tensor is invariant under permutations of its indices.
For example, a third-order symmetric tensor satisfies

$$x_{ijk} = x_{ikj} = x_{jik} = x_{jki} = x_{kij} = x_{kji}$$

for all index permutations.

A symmetric CP decomposition has the form

$$\mathcal{X} = \sum_{r=1}^{R} \lambda_r \, \mathbf{a}_r \circ \mathbf{a}_r \circ \mathbf{a}_r.$$
Symmetric tensors arise from homogeneous polynomials, moments, cumulants, and higher-order derivatives.
For example, a cubic form

$$p(\mathbf{x}) = \sum_{i,j,k} t_{ijk} \, x_i x_j x_k$$

corresponds to a symmetric third-order tensor $\mathcal{T}$ with entries $t_{ijk}$.
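A small numerical check of this correspondence in the symmetric rank-one case, where the cubic form reduces to $(\mathbf{a}^{\top}\mathbf{x})^3$:

```python
import numpy as np

# A symmetric rank-one third-order tensor a ∘ a ∘ a and its cubic form.
a = np.array([1.0, 2.0, -1.0])
T = np.einsum('i,j,k->ijk', a, a, a)        # symmetric under all index permutations

x = np.array([0.5, -0.3, 2.0])
cubic_via_tensor = np.einsum('ijk,i,j,k->', T, x, x, x)
cubic_direct = (a @ x) ** 3
assert np.isclose(cubic_via_tensor, cubic_direct)
```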
138.21 Tensor Completion
Tensor completion estimates missing entries of a tensor.
Suppose only some entries of a tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ are observed. The goal is to recover the full tensor by assuming low-rank structure.
This generalizes matrix completion.
Applications include:
| Application | Missing entries |
|---|---|
| Recommender systems | User-item-time ratings |
| Sensor networks | Missing measurements |
| Image and video | Damaged pixels or frames |
| Medical data | Incomplete patient measurements |
| Knowledge graphs | Missing relations |
Tensor completion uses the idea that multidimensional structure can constrain missing values more strongly than matrix structure alone.
138.22 Applications
Tensor decompositions appear in many areas.
| Area | Tensor modes |
|---|---|
| Chemometrics | Sample, wavelength, time |
| Recommender systems | User, item, context |
| Computer vision | Image, illumination, pose, identity |
| Neuroscience | Channel, time, trial |
| Natural language processing | Word, context, document |
| Knowledge graphs | Subject, relation, object |
| Scientific computing | Spatial and parameter dimensions |
The key advantage is that tensors preserve multiway structure. Flattening a tensor into a matrix can destroy relationships among modes.
138.23 Numerical Issues
Tensor decomposition algorithms face several numerical issues.
| Issue | Description |
|---|---|
| Initialization | Different starts can give different solutions |
| Local minima | Objective functions are nonconvex |
| Degeneracy | Components may diverge while approximation improves |
| Scaling ambiguity | Factors can be rescaled without changing the tensor |
| Permutation ambiguity | Components can be reordered |
| Rank selection | Correct rank may be unknown |
| Missing data | Observed entries may be sparse or biased |
These issues make diagnostics important. One usually checks reconstruction error, stability across runs, factor interpretability, and sensitivity to rank.
138.24 Relation to Linear Algebra
Tensor decompositions extend matrix decompositions but do not simply copy them.
| Matrix concept | Tensor analogue |
|---|---|
| Matrix rank | CP rank, Tucker rank |
| SVD | HOSVD, Tucker decomposition |
| Rank-one matrix | Rank-one tensor |
| Low-rank approximation | CP, Tucker, tensor train approximation |
| Matrix completion | Tensor completion |
| PCA | Multilinear PCA |
| Matrix factorization | Multiway factorization |
The central new feature is interaction among more than two modes. This creates richer models and more difficult mathematics.
138.25 Summary
Tensor decompositions express multidimensional arrays using smaller structured factors.
The main concepts are:
| Concept | Meaning |
|---|---|
| Tensor | Multidimensional array |
| Mode | Index direction |
| Fiber | One-dimensional slice |
| Slice | Lower-dimensional section |
| Unfolding | Matrix view of a tensor |
| Mode product | Multiplication along one mode |
| Rank-one tensor | Outer product of vectors |
| CP decomposition | Sum of rank-one tensors |
| Tucker decomposition | Core tensor times factor matrices |
| Tucker rank | Rank tuple of tensor unfoldings |
| HOSVD | SVD-based Tucker decomposition |
| ALS | Alternating least squares fitting |
| Tensor train | Chain of small tensor cores |
| Tensor completion | Recovery from missing entries |
Tensor decompositions provide the multilinear extension of matrix factorization. They preserve multiway structure and give tools for compression, latent modeling, denoising, completion, and high-dimensional numerical computation.