9.4 Edge Lists
You need a graph representation that is simple to construct, compact to store, and efficient for algorithms that process edges directly.
9.4 Edge Lists
Problem
You need a graph representation that is simple to construct, compact to store, and efficient for algorithms that process edges directly.
Not every graph algorithm spends most of its time traversing neighbors. Some algorithms repeatedly sort edges, scan edges, filter edges, or process edges as independent objects. In these situations, adjacency lists and adjacency matrices may introduce unnecessary complexity.
You need a representation that treats the graph as a collection of edges.
Solution
Store the graph as a list of edge records.
For a directed graph:
A → B
A → C
B → D
C → D
Represent the graph as:
edges = [
("A", "B"),
("A", "C"),
("B", "D"),
("C", "D"),
]
Each tuple represents one edge.
For weighted graphs:
edges = [
("A", "B", 5),
("A", "C", 2),
("B", "D", 7),
("C", "D", 1),
]
The third value stores the edge weight.
This representation requires only one record per edge.
Discussion
An edge list is the simplest graph representation.
Unlike adjacency lists, it does not organize edges by source vertex.
Unlike adjacency matrices, it does not allocate storage for nonexistent edges.
Instead, it records exactly the information present in the graph and nothing more.
For a graph containing:
V vertices
E edges
the memory requirement is:
O(E)
plus any storage needed for vertex metadata.
This efficiency makes edge lists attractive for graph loading, serialization, storage systems, file formats, and edge-oriented algorithms.
Many real-world graph datasets are distributed as edge lists:
source,target
A,B
A,C
B,D
C,D
or:
source,target,weight
A,B,5
A,C,2
B,D,7
Importing such data often requires no transformation.
Edge Lists as a Database Table
An edge list resembles a relational table.
| Source | Target | Weight |
|---|---|---|
| A | B | 5 |
| A | C | 2 |
| B | D | 7 |
| C | D | 1 |
Many graph-processing systems internally use variations of this structure.
Operations such as filtering become straightforward:
heavy_edges = [
edge
for edge in edges
if edge[2] > 5
]
Result:
[
("B", "D", 7)
]
This style works naturally with data-processing pipelines.
Directed and Undirected Graphs
For directed graphs, each edge appears once.
edges = [
(0, 1),
(1, 2),
]
The edge:
0 → 1
is distinct from:
1 → 0
For undirected graphs, two approaches are common.
Store each edge once:
edges = [
(0, 1),
(0, 2),
]
and interpret every edge as bidirectional.
Or store both directions explicitly:
edges = [
(0, 1),
(1, 0),
(0, 2),
(2, 0),
]
The first approach is more compact.
The second approach simplifies algorithms that assume directed edges.
The correct choice depends on the processing pipeline.
Recipe: Building an Edge List
Suppose input arrives as pairs:
connections = [
("A", "B"),
("A", "C"),
("B", "D"),
]
Store them directly:
edges = list(connections)
Nothing else is required.
For weighted edges:
connections = [
("A", "B", 5),
("A", "C", 2),
("B", "D", 7),
]
edges = list(connections)
Construction cost is:
O(E)
where E is the number of edges.
Recipe: Extracting Vertices
An edge list does not automatically store vertices.
You often need to reconstruct them.
def vertices(edges):
result = set()
for source, target in edges:
result.add(source)
result.add(target)
return result
Example:
edges = [
("A", "B"),
("A", "C"),
("B", "D"),
]
Result:
{"A", "B", "C", "D"}
For weighted edges:
def vertices(edges):
result = set()
for source, target, _ in edges:
result.add(source)
result.add(target)
return result
This operation requires scanning every edge.
Recipe: Convert Edge List to Adjacency List
Many algorithms prefer adjacency lists.
Conversion is straightforward.
def build_adjacency_list(edges):
graph = {}
for source, target in edges:
graph.setdefault(source, []).append(target)
graph.setdefault(target, [])
return graph
Example:
edges = [
("A", "B"),
("A", "C"),
("B", "D"),
]
Produces:
{
"A": ["B", "C"],
"B": ["D"],
"C": [],
"D": []
}
This conversion runs in:
O(E)
time.
Many graph-processing pipelines load data as edge lists and then transform them into adjacency structures before traversal.
Algorithms That Naturally Use Edge Lists
Several important algorithms operate directly on edge collections.
Bellman-Ford
Bellman-Ford repeatedly relaxes every edge.
for source, target, weight in edges:
relax(source, target, weight)
No adjacency structure is required.
The algorithm naturally processes edges one by one.
Kruskal's Algorithm
Kruskal begins by sorting edges.
edges.sort(key=lambda edge: edge.weight)
An edge list is already the ideal representation.
No conversion is necessary.
Union-Find Workloads
Offline connectivity problems frequently process edges in sequence:
for u, v in edges:
union(u, v)
Again, the edge list directly matches the algorithm.
Streaming Graph Processing
When graphs arrive continuously:
edge 1
edge 2
edge 3
...
an edge list supports incremental processing without rebuilding adjacency structures.
Traversal Limitations
Suppose you want the neighbors of vertex A.
With an adjacency list:
graph["A"]
returns immediately.
With an edge list, every edge must be inspected:
neighbors = []
for source, target in edges:
if source == "A":
neighbors.append(target)
Cost:
O(E)
This becomes expensive for large graphs.
For this reason, DFS and BFS almost always convert edge lists into adjacency lists first.
Repeated neighbor queries are a warning sign that another representation may be more appropriate.
Complexity Characteristics
Let:
V= number of verticesE= number of edges
An edge list has the following properties:
| Operation | Complexity |
|---|---|
| Memory usage | O(E) |
| Add edge | O(1) amortized |
| Iterate all edges | O(E) |
| Find neighbors | O(E) |
| Check edge existence | O(E) |
| DFS traversal | O(VE) naive |
| Edge sorting | O(E log E) |
The key observation is that edge-centric operations are efficient, while vertex-centric operations are expensive.
Common Pitfalls
Do not assume the edge list contains all vertices. Isolated vertices may not appear at all.
Consider:
vertices = ["A", "B", "C"]
edges = [
("A", "B")
]
Vertex C exists even though no edge references it.
Many implementations accidentally lose isolated vertices.
Do not perform repeated neighbor lookups directly on large edge lists.
Do not forget directionality when processing directed graphs.
Do not duplicate undirected edges unless the algorithm expects both directions.
Do not assume edge ordering has meaning unless the input specification guarantees it.
Recipe: Edge Object Representation
As graphs become larger, explicit edge objects often improve readability.
from dataclasses import dataclass
@dataclass
class Edge:
source: int
target: int
weight: int = 1
Usage:
edges = [
Edge(0, 1, 5),
Edge(0, 2, 2),
Edge(1, 3, 7),
]
This approach makes algorithms such as Kruskal and Bellman-Ford easier to read.
for edge in edges:
print(edge.source, edge.target, edge.weight)
Many production graph libraries use structures similar to this.
Choosing Between Graph Representations
At this point, three major representations have appeared.
| Requirement | Best Choice |
|---|---|
| DFS and BFS | Adjacency list |
| Sparse graph | Adjacency list |
| Fast edge lookup | Adjacency matrix |
| Dense graph | Adjacency matrix |
| Edge sorting | Edge list |
| Bellman-Ford | Edge list |
| Kruskal | Edge list |
| Graph storage | Edge list |
| Graph serialization | Edge list |
| Lowest memory overhead | Edge list |
No single representation dominates every workload.
Most practical systems use more than one representation during different stages of processing.
Takeaway
An edge list stores a graph as a collection of edge records and provides the simplest possible representation of graph structure. It is compact, easy to serialize, and naturally supports edge-oriented algorithms such as Bellman-Ford, Kruskal, and streaming graph processing. Its main weakness is neighbor lookup, which requires scanning the edge collection. Whenever graph traversal becomes the primary workload, converting the edge list into an adjacency list is usually the next step.