11.22 Testing MST Code

Minimum spanning tree implementations can look correct while failing on small edge cases.

11.22 Testing MST Code

Problem

Minimum spanning tree implementations can look correct while failing on small edge cases. The usual failures are not in the theorem. They are in details such as input handling, duplicate edges, disconnected graphs, integer overflow, and incorrect component updates.

A good test suite should verify more than the happy path.

It should answer:

Does the algorithm return a tree?
Does it have minimum cost?
Does it handle disconnected graphs?
Does it behave correctly with equal weights?
Does it match another independent implementation?

This section gives a practical testing strategy for MST code.

Solution

Test MST implementations at three levels:

  1. Structural tests
  2. Cost tests
  3. Cross-check tests

Structural tests verify that the output is a valid spanning tree or forest.

Cost tests verify that the total weight is correct for known graphs.

Cross-check tests compare independent algorithms, such as Kruskal and Prim, on generated graphs.

Structural Validation

For a connected graph with V vertices, a valid MST must contain:

V - 1

edges.

It must also be:

connected
acyclic

A simple validation function can use union-find.

ValidateTree(V, mstEdges):

    if len(mstEdges) != V - 1:
        return false

    dsu = UnionFind(V)

    for edge(u, v, w) in mstEdges:

        if dsu.connected(u, v):
            return false

        dsu.union(u, v)

    return true

This catches cycles and wrong edge counts.

Validating a Forest

For disconnected graphs, the output should be a minimum spanning forest.

If the graph has C connected components, the forest should contain:

V - C

edges.

You can compute C from the original graph using DFS, BFS, or union-find.

expected_edges = V - component_count

Then verify that the returned forest has exactly that many edges and contains no cycles.

Known Small Tests

Start with graphs where the answer is obvious.

Single Vertex

Input:

V = 1
E = 0

Expected:

cost = 0
edges = []

Two Vertices

Input:

A-B = 7

Expected:

cost = 7
edges = [A-B]

Triangle

Input:

A-B = 1
B-C = 2
A-C = 10

Expected:

cost = 3

The MST uses:

A-B
B-C

Square with Diagonal

Input:

A-B = 1
B-C = 2
C-D = 3
D-A = 4
A-C = 5

Expected:

cost = 6

The MST uses:

A-B
B-C
C-D

Equal Weights

Equal weights test determinism and correctness.

Input:

A-B = 1
B-C = 1
A-C = 1

Expected:

cost = 2

Any two edges are valid.

Do not assert a specific edge set unless your implementation intentionally uses deterministic tie-breaking.

Assert cost and structural validity instead.

Parallel Edges

Parallel edges are a common source of bugs.

Input:

A-B = 10
A-B = 1
B-C = 2

Expected:

cost = 3

The MST must use the cheaper A-B edge.

This test requires edge IDs if the implementation tracks accepted edges.

Negative Weights

MST algorithms support negative weights.

Input:

A-B = -5
B-C = 2
A-C = 10

Expected:

cost = -3

If negative edges break the implementation, the problem is likely in sorting or assumptions imported from shortest-path code.

Disconnected Graph

Input:

A-B = 1

C-D = 2

Expected forest:

A-B
C-D

Expected total cost:

3

If the API promises an MST only for connected graphs, expected output may instead be:

error: graph disconnected

Choose one behavior and test it explicitly.

Self-Loops

Self-loops must be ignored.

Input:

A-A = -100
A-B = 5

Expected:

cost = 5

A self-loop never helps connect two different vertices, even when its weight is negative.

Duplicate Weights with Different Edges

Input:

A-B = 2
B-C = 2
C-D = 2
A-D = 2

Expected:

cost = 6

Any three edges that connect all four vertices are valid.

This checks that the implementation does not accidentally reject equal-weight edges globally.

Overflow Tests

Use weights large enough to exceed 32-bit integer totals.

Example:

V = 4
A-B = 1_500_000_000
B-C = 1_500_000_000
C-D = 1_500_000_000

Expected cost:

4_500_000_000

If the implementation returns a negative number or truncated value, use a wider integer type for total cost.

Randomized Cross-Checking

A strong test strategy is to implement two MST algorithms independently.

For each generated graph:

  1. Run Kruskal.
  2. Run Prim.
  3. Compare total costs.
  4. Validate both outputs structurally.

Example:

for seed in 1..1000:

    G = randomGraph(seed)

    mst1 = Kruskal(G)
    mst2 = Prim(G)

    assert cost(mst1) == cost(mst2)
    assert valid(mst1)
    assert valid(mst2)

This finds many bugs quickly.

Brute Force for Tiny Graphs

For very small graphs, you can compute the true MST by enumerating edge subsets.

If V = 5, a spanning tree has:

4

edges.

Try every subset of four edges, keep the connected acyclic ones, and compute the minimum cost.

This is too slow for large graphs but excellent for testing.

Brute-Force Validator

BruteForceMST(G):

    best = infinity

    for each subset S of edges with size V - 1:

        if S is a spanning tree:

            best = min(best, cost(S))

    return best

Use this only for tiny graphs.

Then compare:

algorithm_cost == brute_force_cost

Property-Based Tests

Instead of testing only fixed examples, generate many random cases and check invariants.

Useful properties:

Property Check
Edge count V - 1 for connected graph
Acyclic no edge connects already-connected vertices
Connected all vertices share one component
Cost agreement Kruskal cost equals Prim cost
Cut property spot check selected edges are safe under known cuts
Forest size V - C for disconnected graph

Property-based testing is especially useful for graph algorithms.

Test Graph Families

Use different graph shapes.

Path Graph

0-1-2-3-4

The MST is the graph itself.

Cycle Graph

0-1-2-3-0

The MST removes one edge.

Complete Graph

Every vertex pair has an edge.

Useful for dense behavior.

Star Graph

0 connected to all vertices

Tests hub-heavy topology.

Grid Graph

Tests structured sparse graphs.

Random Sparse Graph

Tests general sparse cases.

Random Dense Graph

Tests matrix or heavy-edge-list behavior.

Testing Prim

For Prim specifically:

  • Test starting from different vertices.
  • Test disconnected graphs.
  • Test duplicate heap entries.
  • Verify visited checks.
  • Verify parent array reconstruction.

The MST cost should not depend on the start vertex.

The actual tree may differ.

Testing Kruskal

For Kruskal:

  • Test edge sorting.
  • Test parallel edges.
  • Test union-find behavior.
  • Test early termination.
  • Test disconnected graphs.

A useful assertion:

accepted_edges <= V - 1

for connected graph mode.

If more edges are accepted, cycle detection is broken.

Testing Borůvka

For Borůvka:

  • Test duplicate selected edges.
  • Test components choosing the same outgoing edge.
  • Test equal weights.
  • Test disconnected graphs.
  • Verify component count decreases correctly.

Borůvka bugs often involve merging components during a phase and corrupting choices made earlier in that same phase.

Deterministic Tests

Random tests should use fixed seeds.

Example:

seed = 12345

When a failure occurs, print the seed and graph.

This makes bugs reproducible.

Debug Output

When a test fails, useful output includes:

vertices
edges sorted by weight
accepted edges
rejected edges
component roots
total cost

Do not print only:

expected != actual

Graph bugs need context.

Common Mistakes

Asserting Exact Edges with Equal Weights

Multiple MSTs may exist. Assert cost and validity instead.

Forgetting Disconnected Cases

Real sparse graphs are often disconnected.

Ignoring Self-Loops

Self-loops should not enter the MST.

Skipping Overflow Tests

Cost accumulation can fail even when edge selection is correct.

Trusting One Algorithm

Cross-checking independent implementations is stronger.

Recipe

Use this testing checklist for MST code:

  1. Test tiny known graphs.
  2. Test equal weights, negative weights, self-loops, and parallel edges.
  3. Test disconnected graphs according to the API contract.
  4. Validate tree or forest structure.
  5. Compare Kruskal and Prim on random graphs.
  6. Use brute force for very small graphs.
  7. Use fixed random seeds.
  8. Print enough graph state to reproduce failures.
  9. Include overflow tests for total cost.

MST testing should verify both structure and optimality. A low cost is meaningless if the output is disconnected. A connected tree is insufficient if a cheaper tree exists. Strong tests cover both sides.