10.7 Computing All-Pairs Shortest Paths with Floyd-Warshall

The algorithms covered so far solve the single-source shortest path problem. Given a source vertex, they compute the shortest distance to every other vertex.

Sometimes that is not enough.

A routing system may need distances between every pair of cities. A graph analytics engine may need the shortest distance between every pair of users. A database optimizer may repeatedly ask shortest-path questions between arbitrary vertices.

Running Dijkstra once for every vertex is often possible, but there is another approach.

The Floyd-Warshall algorithm computes the shortest distance between every pair of vertices simultaneously.

Unlike Dijkstra and Bellman-Ford, Floyd-Warshall is based on dynamic programming rather than edge relaxation. It is remarkably compact, handles negative edge weights, and provides one of the cleanest examples of dynamic programming on graphs.

Problem

Given a weighted graph, compute the shortest distance between every pair of vertices.

Consider the graph:

A --3--> B
|        |
8        2
|        |
v        v
C --1--> D

We want to know:

A -> B
A -> C
A -> D
B -> A
B -> C
...

for every possible pair of vertices.

Solution

Represent the graph as a distance matrix.

Initially:

distance[i][j]

contains:

Edge weight if an edge exists
Zero if i == j
Infinity otherwise

The algorithm then considers each vertex as a possible intermediate point.

package main

import "fmt"

func floydWarshall(dist [][]int) {
	n := len(dist)

	for k := 0; k < n; k++ {
		for i := 0; i < n; i++ {
			for j := 0; j < n; j++ {
				if dist[i][k] == INF || dist[k][j] == INF {
					continue
				}

				candidate := dist[i][k] + dist[k][j]

				if candidate < dist[i][j] {
					dist[i][j] = candidate
				}
			}
		}
	}
}

const INF = int(1e9)

func main() {
	dist := [][]int{
		{0, 3, 8, INF},
		{INF, 0, INF, 2},
		{INF, INF, 0, 1},
		{INF, INF, INF, 0},
	}

	floydWarshall(dist)

	for _, row := range dist {
		fmt.Println(row)
	}
}

Output:

[0 3 8 5]
[INF 0 INF 2]
[INF INF 0 1]
[INF INF INF 0]

The shortest distance from A to D becomes:

A -> B -> D

with total cost:

3 + 2 = 5

Understanding the Dynamic Programming State

The key insight is surprisingly elegant.

Suppose we number vertices:

0 1 2 3

Define:

dp[i][j]

as the shortest path from i to j.

Now imagine processing vertices one at a time.

When considering vertex:

there are only two possibilities:

Case 1: Do Not Use k

The existing path remains best.

dp[i][j]

does not change.

Case 2: Use k

The path becomes:

i -> k -> j

with cost:

dp[i][k] + dp[k][j]

Therefore:

dp[i][j] =
min(
    dp[i][j],
    dp[i][k] + dp[k][j]
)

This single recurrence completely defines the algorithm.

Visualizing the Update

Suppose the current matrix contains:

From	To	Cost
A	C	10
A	B	3
B	C	2

Currently:

distance[A][C] = 10

When processing vertex B:

A -> B -> C

has cost:

3 + 2 = 5

Since:

5 < 10

the matrix is updated.

From	To	Old	New
A	C	10	5

This process repeats for every pair of vertices and every possible intermediate vertex.

Why the Triple Loop Works

The structure:

for k := 0; k < n; k++ {
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			...
		}
	}
}

is not arbitrary.

The outer loop defines the set of vertices allowed as intermediate nodes.

After processing:

k = 0

all shortest paths using vertex 0 are known.

After processing:

k = 1

all shortest paths using vertices:

0, 1

are known.

After processing:

k = n - 1

all vertices are available as intermediates.

At that point every shortest path has been discovered.

Path Reconstruction

The distance matrix provides costs but not routes.

To reconstruct paths, maintain a second matrix:

next[i][j]

which stores the next vertex on the shortest path.

Initialization:

if edge exists {
	next[i][j] = j
}

Whenever a better path is discovered:

next[i][j] = next[i][k]

After the algorithm completes, paths can be reconstructed by repeatedly following the next matrix.

For example:

A -> B -> D

becomes:

next[A][D] = B
next[B][D] = D

Detecting Negative Cycles

Floyd-Warshall can also detect negative cycles.

Consider:

A -> B = 1
B -> C = -4
C -> A = 2

The cycle cost is:

1 + (-4) + 2 = -1

After the algorithm finishes, examine the diagonal.

distance[i][i]

Normally:

distance[i][i] = 0

If a negative cycle exists:

distance[i][i] < 0

for some vertex i.

Detection is therefore simple:

for i := 0; i < n; i++ {
	if dist[i][i] < 0 {
		fmt.Println("negative cycle")
	}
}

Discussion

Floyd-Warshall differs significantly from Dijkstra and Bellman-Ford.

Dijkstra explores vertices.

Bellman-Ford relaxes edges.

Floyd-Warshall improves pairs of vertices.

This perspective makes the algorithm unusually flexible.

Because it works directly on a matrix, many graph problems can be solved by modifying the recurrence rather than redesigning the algorithm.

Variants exist for:

Reachability
Transitive closure
Path counting
Maximum bottleneck paths
Longest paths in DAG-derived systems
Graph similarity analysis

The algorithm's structure remains largely unchanged.

Complexity

The algorithm performs:

n × n × n

updates.

Operation	Complexity
Matrix initialization	O(V²)
Dynamic programming	O(V³)
Memory usage	O(V²)
Path reconstruction	O(path length)

Compared with repeated Dijkstra:

Algorithm	Complexity
Floyd-Warshall	O(V³)
Dijkstra from every vertex	O(V(E log V))

For dense graphs, Floyd-Warshall is often competitive or simpler.

For sparse graphs, repeated Dijkstra is usually faster.

Common Mistakes

Do not forget to initialize:

dist[i][i] = 0

for every vertex.

Do not perform arithmetic involving infinity values without checking reachability.

if dist[i][k] == INF || dist[k][j] == INF {
	continue
}

Otherwise overflow can occur.

Do not assume Floyd-Warshall returns paths automatically. A separate reconstruction matrix is required.

Do not use Floyd-Warshall on very large sparse graphs. The cubic runtime becomes prohibitive.

Practical Applications

Floyd-Warshall appears in:

Transportation networks
Routing tables
Network latency analysis
Social network analytics
Database query optimization
Dependency analysis
Reachability systems
Geographic information systems

Its simplicity often outweighs its theoretical cost when graph sizes remain moderate.

Takeaway

Floyd-Warshall computes shortest paths between every pair of vertices using a remarkably compact dynamic programming recurrence. By considering each vertex as a potential intermediate node, the algorithm gradually improves a distance matrix until all shortest paths are known. Although its O(V³) runtime limits scalability, its simplicity, support for negative edge weights, and natural all-pairs formulation make it one of the most elegant graph algorithms ever developed.