11.24 Common Bugs

Minimum spanning tree code often fails for reasons that are easy to miss in clean textbook examples.

11.24 Common Bugs

Problem

Minimum spanning tree code often fails for reasons that are easy to miss in clean textbook examples.

The graph is disconnected. The edge list has parallel edges. The total cost overflows. The implementation forgets to store reverse edges. The priority queue contains stale entries. The union-find structure compares vertices instead of roots.

These bugs are common because MST algorithms combine several moving pieces:

graph representation
edge ordering
component tracking
tree reconstruction
cost accumulation

This section lists the most frequent failure modes and how to prevent them.

Bug: Assuming the Graph Is Connected

Many inputs are not connected.

Example:

A -- B

C -- D

There is no spanning tree for the whole graph.

Kruskal will return:

A-B
C-D

which is a forest, not a tree.

If the API expects an MST, check:

len(mstEdges) == V - 1

If this condition fails, report:

graph disconnected

or return a minimum spanning forest intentionally.

Bug: Forgetting Reverse Edges

For an undirected adjacency list, every edge must be stored twice.

Wrong:

adj[u].append((v, w))

Correct:

adj[u].append((v, w))
adj[v].append((u, w))

This bug is especially common in Prim implementations. The algorithm may appear to work on some graphs but fail when the missing direction is needed to reach a vertex.

Bug: Treating Directed Graphs as MST Inputs

Minimum spanning trees are defined for undirected graphs.

If the input is directed, you are solving a different problem. The directed analogue is usually a minimum arborescence problem, not an MST problem.

Do not silently ignore edge direction unless the problem explicitly says the graph should be treated as undirected.

Bug: Comparing Vertices Instead of Components

In Kruskal, this is wrong:

if u != v:
    accept edge

The correct check is:

if find(u) != find(v):
    accept edge

Two different vertices may already belong to the same component.

Accepting the edge would create a cycle.

Bug: Broken Union

A common incorrect union operation is:

parent[v] = u

This may attach a non-root to another non-root and corrupt the structure.

Use roots:

rootU = find(u)
rootV = find(v)

if rootU != rootV:
    parent[rootV] = rootU

With union by size or rank, attach the smaller tree under the larger one.

Bug: Naive Union-Find on Large Inputs

Naive union-find can form chains:

1 -> 2 -> 3 -> 4 -> 5

Then find() becomes slow.

Use:

path compression
union by size or rank

by default.

This is not premature optimization. It is the standard implementation.

Bug: Sorting Edges Incorrectly

Kruskal requires nondecreasing edge weight order.

Wrong sort key:

sort by source vertex

Correct sort key:

sort by weight

For deterministic output, use:

(weight, edgeId)

but never replace weight ordering with endpoint ordering.

Bug: Assuming Edge Weights Are Positive

MST algorithms work with negative weights.

Example:

A-B = -10
B-C = 2
A-C = 5

The MST uses:

A-B
B-C

If the implementation rejects negative edges, it is wrong unless the problem domain explicitly forbids them.

Bug: Accepting Self-Loops

A self-loop:

A-A

does not connect two components.

In Kruskal:

find(A) == find(A)

so it should be rejected automatically.

In Prim or graph construction code, ignore self-loops unless the surrounding API needs to preserve them for other algorithms.

Bug: Mishandling Parallel Edges

Parallel edges connect the same endpoints.

Example:

A-B = 10
A-B = 1

The cheaper edge may belong to the MST.

Do not deduplicate edges by endpoint unless you keep the minimum weight.

If edge identity matters, assign unique IDs.

Bug: Integer Overflow in Total Cost

Edge selection can be correct while the total cost is wrong.

Example:

3 edges
each weight = 1,500,000,000

Total:

4,500,000,000

This exceeds signed 32-bit integer range.

Use a wider type for accumulated cost.

A common rule:

edge weight type may be int32
total cost type should often be int64

Bug: Incorrect Early Termination

Kruskal can stop when:

acceptedEdges == V - 1

for a connected graph.

But do not stop earlier based on:

processedEdges == V - 1

Processed edges include rejected edges.

Use accepted edge count.

Bug: Missing Visited Check in Prim

Priority queues often contain stale entries.

Example:

push(10, A)
push(3, A)

When the larger old entry is popped later, it must be ignored.

Correct pattern:

cost, v = heap.pop()

if visited[v]:
    continue

visited[v] = true

Without this check, Prim may add a vertex twice and corrupt the MST.

Bug: Confusing Prim with Dijkstra

Prim chooses the cheapest edge connecting a new vertex to the tree.

Dijkstra chooses the smallest accumulated distance from the source.

Wrong for Prim:

newCost = dist[u] + weight(u, v)

Correct for Prim:

newCost = weight(u, v)

This is one of the most common implementation mistakes because the two algorithms look similar.

Bug: Parent Array Not Updated

In Prim, when a better edge is found:

best[v] = w
parent[v] = u

Both assignments matter.

If best is updated but parent is not, the final cost may be correct while the reconstructed edge list is wrong.

Bug: Treating Missing Matrix Edges as Zero

Dense graph inputs often use sentinels:

-1
infinity
null

for missing edges.

Do not treat these as zero.

A zero-weight edge is a real edge with cost zero.

A missing edge is not an edge.

Bug: Including Diagonal Matrix Entries

In an adjacency matrix:

weight[v][v]

is usually zero.

Those are self-loops.

Ignore them for MST construction.

Otherwise the algorithm may repeatedly select meaningless zero-cost edges.

Bug: Not Handling Single-Vertex Graphs

For:

V = 1
E = 0

the MST has:

0 edges
0 total cost

Do not return disconnected or error unless the problem explicitly disallows empty trees.

Bug: Failing on Empty Graphs

For:

V = 0

the correct behavior is API-dependent.

Some libraries return an empty forest.

Some reject the input.

Choose one behavior and document it.

Do not allow undefined behavior.

Bug: Ignoring Equal-Weight MSTs

Multiple MSTs may exist.

If tests require one exact edge set, they may fail even when the algorithm is correct.

For equal-weight graphs, test:

valid spanning tree
correct total cost

rather than exact edge identity.

Bug: Broken Borůvka Phases

In Borůvka, each component should select its cheapest outgoing edge based on the component structure at the start of the phase.

A common mistake is to merge components while still computing cheapest edges for the same phase.

Safer pattern:

scan all edges and record cheapest outgoing edge per component
then perform unions

Keep these two steps separate.

Bug: Rollback DSU with Path Compression

Rollback union-find needs to undo changes.

Path compression changes many parent pointers during find().

If those changes are not recorded, rollback becomes incorrect.

For rollback DSU, usually use:

union by size
no path compression

This gives predictable undo behavior.

Bug: Not Verifying Output

Even if the algorithm seems correct, validate the result.

For connected graphs:

edges = V - 1
no cycles
all vertices connected

For disconnected graphs:

forest has V - componentCount edges
no cycles
components match original graph connectivity

Validation catches many defects before they reach production.

Debugging Checklist

When MST output is wrong, inspect:

input edges
sorted edge order
accepted edges
rejected edges
union-find roots
visited array
parent array
total cost type
connectivity result

Do not debug only from the final cost.

The final cost is usually too little information.

Common Bug Table

Symptom Likely Cause
Too many edges returned Cycle detection broken
Too few edges returned Graph disconnected or traversal bug
Cost too small Self-loop, missing-edge, or overflow bug
Cost too large Sort order or update bug
Prim misses vertices Missing reverse edges or disconnected graph
Kruskal accepts cycle Bad union-find
Different valid MST edges Equal weights, not necessarily a bug
Crash on small graph Single-vertex or empty-graph handling

Recipe

Use this checklist when hardening MST code:

  1. Verify graph direction assumptions.
  2. Store undirected edges correctly.
  3. Sort by weight for Kruskal.
  4. Use optimized union-find.
  5. Ignore self-loops.
  6. Preserve parallel edges or safely reduce them.
  7. Use a wide type for total cost.
  8. Check connectivity explicitly.
  9. Handle equal weights in tests.
  10. Validate the output structure.

Most MST bugs are preventable. The safest implementations make assumptions visible: connected or disconnected, directed or undirected, tree or forest, exact edge set or just cost. Once those contracts are explicit, the algorithms are straightforward to test and maintain.