11.17 Dense Graph Handling
Sparse graphs reward algorithms that touch only existing edges.
11.17 Dense Graph Handling
Problem
Sparse graphs reward algorithms that touch only existing edges. Dense graphs are different. When most vertex pairs have an edge, avoiding non-existent edges no longer helps much because there are few non-existent edges to avoid.
A dense graph has edge count close to:
V²
Examples include:
complete distance graphs
small fully connected networks
correlation graphs
similarity graphs
dense cost matrices
In these settings, representation choices change. An adjacency matrix may become practical. Heap operations may add overhead. Sorting every edge may dominate runtime.
The MST problem is the same, but the implementation strategy should change.
Solution
For dense graphs, consider:
- Prim's algorithm with an adjacency matrix.
- Compact cost matrices when all pairwise weights exist.
- Avoid sorting all edges unless you already have an edge list.
- Use
O(V²)algorithms whenEis close toV². - Treat memory layout as part of the algorithm.
The usual dense-graph default is Prim's algorithm with a simple array scan.
Why Prim Fits Dense Graphs
Kruskal sorts all edges.
In a dense graph:
E = O(V²)
So sorting costs:
O(V² log V)
Prim with an adjacency matrix can run in:
O(V²)
That removes the logarithmic sorting or heap factor.
For dense graphs, the simpler algorithm is often faster.
Adjacency Matrix Representation
Store edge weights in a matrix:
weight[u][v]
For an undirected graph:
weight[u][v] = weight[v][u]
Example:
A B C D
A 0 2 5 4
B 2 0 1 3
C 5 1 0 6
D 4 3 6 0
This representation allows constant-time edge lookup.
Matrix-Based Prim
Maintain three arrays:
visited[v]
best[v]
parent[v]
best[v] stores the cheapest known edge connecting v to the growing tree.
parent[v] stores the tree vertex that provides that cheapest connection.
At each step:
- Pick the unvisited vertex with the smallest
bestvalue. - Mark it visited.
- Scan all possible neighbors.
- Update
bestandparent.
Pseudocode
PrimDense(weight, n):
best = [infinity] * n
parent = [-1] * n
visited = [false] * n
best[0] = 0
for i in 0 .. n-1:
u = unvisited vertex with smallest best[u]
if best[u] == infinity:
return no spanning tree
visited[u] = true
for v in 0 .. n-1:
if not visited[v] and weight[u][v] < best[v]:
best[v] = weight[u][v]
parent[v] = u
return parent
The result is encoded by parent.
For every vertex except the start vertex:
(parent[v], v)
is an MST edge.
Example
Matrix:
| A | B | C | D | |
|---|---|---|---|---|
| A | 0 | 2 | 5 | 4 |
| B | 2 | 0 | 1 | 3 |
| C | 5 | 1 | 0 | 6 |
| D | 4 | 3 | 6 | 0 |
Start at A.
Initial:
best[A] = 0
best[B] = ∞
best[C] = ∞
best[D] = ∞
Visit A, update:
best[B] = 2
best[C] = 5
best[D] = 4
Visit B, update:
best[C] = 1
best[D] = 3
Visit C, no better updates.
Visit D.
Selected edges:
A-B = 2
B-C = 1
B-D = 3
Total MST cost:
6
Complexity
Matrix-based Prim performs:
V
iterations.
Each iteration scans:
V
vertices to select the next vertex and another:
V
vertices to update candidates.
Total:
O(V²)
Space:
O(V²)
This is acceptable only when V² memory is acceptable.
When Not to Use a Matrix
An adjacency matrix is poor for graphs such as:
V = 1,000,000
E = 3,000,000
because the matrix would require:
10^12
entries.
Use adjacency lists for sparse graphs.
A matrix is appropriate only when:
Vis moderate.- Most edges exist.
- Constant-time lookup is useful.
- Input is naturally a cost matrix.
Complete Graphs from Distances
Dense graphs often arise from geometric or similarity data.
Example:
each point connects to every other point
edge weight = Euclidean distance
For n points, the complete graph has:
n(n - 1) / 2
edges.
Materializing all edges may be expensive. If distances can be computed on demand, matrix-based Prim can avoid storing an explicit edge list.
Instead of reading:
weight[u][v]
from memory, compute:
distance(points[u], points[v])
during the update loop.
This trades CPU for memory.
Cost Matrix Inputs
Some problems provide a full matrix directly.
Example:
cost[i][j] = cost to connect site i to site j
In that case, matrix-based Prim is the natural solution.
No edge-list conversion is required.
Avoid transforming the matrix into all O(V²) edge objects unless you need Kruskal specifically.
Dense Graphs with Missing Edges
Some matrices use a sentinel for no edge:
infinity
-1
null
Use a clear rule:
if weight[u][v] == no_edge:
skip
Do not treat missing edges as zero-cost edges.
That mistake silently corrupts MST construction.
Kruskal on Dense Graphs
Kruskal still works on dense graphs.
It may be appropriate when:
- Edges are already sorted.
- Edge weights are small integers and radix/counting sort is available.
- You need a minimum spanning forest from an edge stream.
- The matrix is not available.
But if you must build and sort all edge objects, Kruskal often loses to dense Prim.
Borůvka on Dense Graphs
Borůvka scans all edges once per phase.
For dense graphs:
O(E log V) = O(V² log V)
It can still be useful in parallel settings, but the simple O(V²) Prim algorithm is often preferable for single-machine dense inputs.
Memory Engineering
For dense graphs, memory is the limiting factor.
If weights are integers with known bounds, use smaller types where safe:
uint16
uint32
float32
rather than defaulting to large objects.
Avoid nested object arrays if performance matters. A flat matrix can improve locality:
weight[u * n + v]
instead of:
weight[u][v]
The algorithm is unchanged, but cache behavior improves.
Symmetry Optimization
For undirected graphs, the matrix is symmetric.
You can store only the upper triangle:
u < v
This halves memory.
However, lookup becomes slightly more complex:
if u < v:
read upper[u, v]
else:
read upper[v, u]
For clarity, full matrices are often easier. For very large dense graphs, triangular storage may matter.
Disconnected Dense Graphs
Dense does not necessarily mean connected.
If missing edges are allowed, the graph may still be disconnected.
Matrix-based Prim detects this when the next selected vertex has:
best[u] = infinity
At that point, no edge connects the current tree to the remaining vertices.
Return:
no spanning tree
or compute a spanning forest if the problem permits it.
Common Mistakes
Using Heap Prim Automatically
Heap Prim is excellent for sparse adjacency lists. In dense graphs, simple array scanning can be faster and asymptotically better.
Building an Edge List from a Matrix
This creates O(V²) objects and then sorts them. Usually avoid it.
Treating Diagonal Values as Edges
Ignore weight[v][v].
Self-loops are irrelevant to MSTs.
Mishandling Missing Edges
A missing edge is not a zero-cost edge.
Ignoring Memory
An O(V²) algorithm is only viable when the matrix fits comfortably in memory.
Recipe
When handling dense MST problems:
- Check whether the input is naturally a matrix.
- Use Prim with array scans for
O(V²)time. - Store
best,parent, andvisitedarrays. - Avoid sorting all edges unless there is a specific reason.
- Treat missing edges explicitly.
- Watch memory layout and numeric types.
- Return failure if the graph is disconnected.
Dense graphs change the performance picture. In sparse graphs, we avoid touching absent edges. In dense graphs, nearly every edge exists, so the best implementation is often the direct one: maintain the cheapest connection to the current tree and scan arrays with predictable memory access.