8.1 Tree Representation

Trees are usually introduced with drawings: circles connected by lines, one circle at the top, several below it, then more below those. That picture is useful, but an algorithm cannot run on a picture. The first practical question is how to represent the tree in memory.

A good representation makes the common operations cheap. If the algorithm needs to visit every child of a node, children should be easy to enumerate. If it needs to move from a node to its parent, parent links should be stored or computed. If it needs subtree ranges, an Euler tour index may be better than explicit pointer chasing.

This recipe shows the main tree representations and when to use each one.

Problem

You need to store a tree so that algorithms such as traversal, subtree queries, parent lookup, or path processing can run efficiently.

Solution

Choose the representation that matches the operations you need.

For a rooted tree with nodes numbered from 0 to n - 1, the most common representation is an adjacency list:

type Tree struct {
    Children [][]int
}

func NewTree(n int) *Tree {
    children := make([][]int, n)
    return &Tree{Children: children}
}

func (t *Tree) AddChild(parent, child int) {
    t.Children[parent] = append(t.Children[parent], child)
}

This form is compact, easy to traverse, and works well when every edge is directed from parent to child.

For an undirected tree, store each edge both ways:

type GraphTree struct {
    Adj [][]int
}

func NewGraphTree(n int) *GraphTree {
    adj := make([][]int, n)
    return &GraphTree{Adj: adj}
}

func (t *GraphTree) AddEdge(a, b int) {
    t.Adj[a] = append(t.Adj[a], b)
    t.Adj[b] = append(t.Adj[b], a)
}

When traversing an undirected tree, pass the parent node so that the traversal does not immediately walk back along the edge it came from:

func DFS(adj [][]int, node, parent int) {
    for _, next := range adj[node] {
        if next == parent {
            continue
        }
        DFS(adj, next, node)
    }
}

For binary trees, a pointer-based representation is often clearer:

type Node struct {
    Value int
    Left  *Node
    Right *Node
}

This representation works well for recursive algorithms on binary search trees, expression trees, heaps described as linked nodes, and interview-style tree problems.

For array-backed binary trees, especially heaps, use implicit indexing:

func Left(i int) int {
    return 2*i + 1
}

func Right(i int) int {
    return 2*i + 2
}

func Parent(i int) int {
    return (i - 1) / 2
}

With zero-based indexing, the children of index i are 2*i + 1 and 2*i + 2. The parent of a non-root node is (i - 1) / 2.

Discussion

A tree has n nodes and n - 1 edges, assuming it is connected and acyclic. This fact makes adjacency lists a natural representation. They store exactly the edges that exist, so the memory cost is linear.

An adjacency matrix almost always wastes memory for trees:

matrix := make([][]bool, n)
for i := range matrix {
    matrix[i] = make([]bool, n)
}

This uses O(n²) space even though a tree has only O(n) edges. It may be acceptable for very small n, but it is rarely the right default.

The rooted and unrooted versions differ in an important way. In a rooted tree, every edge has a direction from parent to child. Traversal is straightforward because there is no edge back to the parent unless you explicitly store one. In an undirected tree, the edges have no built-in direction, so traversal needs a parent parameter or a visited array.

Here is a complete example that builds an undirected tree and computes the size of every subtree after choosing node 0 as the root:

package main

import "fmt"

func subtreeSizes(adj [][]int, root int) []int {
    size := make([]int, len(adj))

    var dfs func(node, parent int)
    dfs = func(node, parent int) {
        size[node] = 1

        for _, next := range adj[node] {
            if next == parent {
                continue
            }

            dfs(next, node)
            size[node] += size[next]
        }
    }

    dfs(root, -1)
    return size
}

func main() {
    n := 7
    adj := make([][]int, n)

    addEdge := func(a, b int) {
        adj[a] = append(adj[a], b)
        adj[b] = append(adj[b], a)
    }

    addEdge(0, 1)
    addEdge(0, 2)
    addEdge(1, 3)
    addEdge(1, 4)
    addEdge(2, 5)
    addEdge(2, 6)

    fmt.Println(subtreeSizes(adj, 0))
}

The output is:

[7 3 3 1 1 1 1]

The root has subtree size 7, nodes 1 and 2 each have subtree size 3, and the leaves have subtree size 1.

Choosing a Representation

Use adjacency lists when the tree is large, sparse, or general. Most graph-style tree algorithms use this representation.

Use child lists when the tree is already rooted and the algorithm only moves downward.

Use parent arrays when the main operation is moving upward:

parent := []int{-1, 0, 0, 1, 1, 2, 2}

In this array, parent[i] gives the parent of node i, and -1 marks the root.

Use pointer nodes when the tree is naturally recursive and each node owns direct references to its children.

Use array indexing when the tree has a compact complete-tree shape, as in binary heaps.

Edge Cases

A tree with one node has no edges. Traversal code must still handle it.

An empty tree needs an explicit convention. Some APIs use nil, some use n = 0, and some disallow empty trees.

A deep tree can overflow the call stack in languages with limited recursion depth. Go can grow stacks, but extremely deep recursion can still be expensive. For adversarial input, prefer an iterative traversal.

A malformed input may contain cycles, duplicate edges, disconnected components, or node IDs outside the valid range. If the tree comes from an untrusted source, validate it before running tree-specific algorithms.

A node may have many children. Do not assume binary structure unless the problem states it.

Common Mistakes

The most common bug in undirected tree traversal is forgetting to skip the parent. That turns a simple DFS into infinite recursion.

Another common bug is mixing rooted and unrooted assumptions. If you store only children, you cannot move upward unless you also store parents. If you store undirected edges, the algorithm must impose a root during traversal.

Off-by-one indexing appears frequently when input nodes are numbered from 1 to n, but arrays are indexed from 0 to n - 1. Normalize the input at the boundary:

a--
b--

Another subtle error is using an adjacency matrix because it feels simple. For large trees, this changes a linear-memory algorithm into a quadratic-memory program.

Takeaway

Tree representation is an algorithmic choice, not just an implementation detail. Store the tree in the shape your algorithm needs: adjacency lists for general trees, child lists for rooted downward traversal, parent arrays for upward movement, pointers for recursive binary structures, and arrays for compact complete trees. A good representation keeps traversal simple, prevents accidental quadratic costs, and makes later algorithms easier to state and prove.