8.16 Persistent Trees

Most data structures answer questions about the present.

8.16 Persistent Trees

Most data structures answer questions about the present.

A value changes, the old value disappears, and the structure reflects only the latest state.

Many real systems, however, need access to the past.

Examples include:

  • Version control systems
  • Database snapshots
  • Financial ledgers
  • Historical analytics
  • Time-travel debugging
  • Undo functionality
  • Immutable functional data structures

Suppose a value changes:

Version 1:
[1 2 3 4]

Version 2:
[1 2 10 4]

Version 3:
[1 2 10 7]

A traditional Segment Tree stores only Version 3.

A Persistent Tree preserves all versions simultaneously.

Queries can be executed against any historical state without copying the entire structure.

This capability is surprisingly powerful and appears frequently in advanced algorithmic systems.

Problem

You need to support updates while preserving access to previous versions of the data.

Copying the entire structure after every modification is too expensive.

Solution

Reuse unchanged portions of the tree.

Consider a Segment Tree:

                 A
              /     \\n             B       C
           /  \     /  \\n          D    E   F    G

Suppose an update affects only node:

F

A naive copy duplicates the entire tree.

A Persistent Tree duplicates only the nodes along the update path.

Version 1

                 A
              /     \\n             B       C
           /  \     /  \\n          D    E   F    G

Version 2

                 A'
              /      \\n             B        C'
           /  \      /  \\n          D    E    F'   G

The unchanged nodes:

B
D
E
G

are shared.

Only:

A
C
F

are duplicated.

This technique is called path copying.

Discussion

Persistence relies on immutability.

Instead of modifying a node:

node.Value = 42

create a new node:

newNode := &Node{
    Value: 42,
}

and reuse every unaffected subtree.

Each update produces a new root.

The old root remains valid.

The structure naturally forms a version tree.

Persistent Segment Tree Node

A pointer-based representation is common.

type Node struct {
    Sum   int
    Left  *Node
    Right *Node
}

Unlike array-based Segment Trees, persistence benefits from explicit pointers because individual nodes are copied selectively.

Building the Initial Version

Construction resembles an ordinary Segment Tree.

func build(
    values []int,
    left int,
    right int,
) *Node {
    if left == right {
        return &Node{
            Sum: values[left],
        }
    }

    mid := (left + right) / 2

    leftChild :=
        build(values, left, mid)

    rightChild :=
        build(
            values,
            mid+1,
            right,
        )

    return &Node{
        Sum:
            leftChild.Sum +
            rightChild.Sum,
        Left:  leftChild,
        Right: rightChild,
    }
}

This produces Version 0.

Every future version begins from this root.

Persistent Updates

The key operation is update.

Instead of modifying existing nodes, create replacements along the update path.

func update(
    node *Node,
    left int,
    right int,
    index int,
    value int,
) *Node {
    if left == right {
        return &Node{
            Sum: value,
        }
    }

    mid := (left + right) / 2

    if index <= mid {
        newLeft := update(
            node.Left,
            left,
            mid,
            index,
            value,
        )

        return &Node{
            Sum:
                newLeft.Sum +
                node.Right.Sum,
            Left:  newLeft,
            Right: node.Right,
        }
    }

    newRight := update(
        node.Right,
        mid+1,
        right,
        index,
        value,
    )

    return &Node{
        Sum:
            node.Left.Sum +
            newRight.Sum,
        Left:  node.Left,
        Right: newRight,
    }
}

Notice the sharing.

One child remains unchanged and is reused directly.

Only the path from root to leaf is recreated.

Storing Versions

Each update produces a new root.

versions := []*Node{}

root0 := build(values, 0, n-1)

versions =
    append(versions, root0)

root1 :=
    update(
        root0,
        0,
        n-1,
        3,
        100,
    )

versions =
    append(versions, root1)

Now:

versions[0]

and:

versions[1]

represent different historical states.

Both remain available.

Historical Queries

Queries work exactly as before.

The only difference is choosing the version.

func query(
    node *Node,
    left int,
    right int,
    ql int,
    qr int,
) int

Example:

query(
    versions[0],
    0,
    n-1,
    2,
    5,
)

queries the original state.

query(
    versions[7],
    0,
    n-1,
    2,
    5,
)

queries Version 7.

No special logic is required.

The version is determined entirely by the chosen root.

Visualizing Sharing

Suppose:

Version 0:
[1 2 3 4]

Update:

Index 2 = 10

Version 1:

[1 2 10 4]

Only one path changes.

Root
 └── Right
      └── Left

Everything else is shared.

Memory usage grows much more slowly than full copying.

This efficiency is the reason persistence is practical.

Why Persistence Works

A traditional update modifies:

old state

into:

new state

destroying historical information.

Persistence changes the model.

old state
        \\n         new state

Both versions continue to exist.

Updates become:

Create new path
Reuse old subtrees

rather than:

Overwrite existing nodes

This distinction is fundamental.

Persistent Arrays

Conceptually, a Persistent Segment Tree acts like an immutable array.

Example:

Version 0:
[1 2 3 4]

Version 1:
[1 2 5 4]

Version 2:
[1 2 5 9]

Each update creates a new array version.

The difference is efficiency.

Naively copying arrays costs:

O(n)

per update.

Persistence costs:

O(log n)

per update.

K-th Smallest Queries

One of the most famous applications uses persistent frequency trees.

Suppose:

Array:
5 2 7 1 9

Build versions:

Version 0:
empty

Version 1:
5

Version 2:
5 2

Version 3:
5 2 7

...

Each version represents a prefix.

Comparing two versions reveals frequencies inside a range.

This enables:

k-th smallest element
in interval [L,R]

queries.

The technique appears frequently in competitive programming and database indexing.

Persistent Trees and Functional Programming

Persistence is natural in functional languages.

Languages such as:

  • Haskell
  • OCaml
  • F#
  • Clojure

favor immutable structures.

Updating data naturally creates new versions.

Persistent trees fit perfectly into this model.

Many functional collections use similar ideas internally.

Time Travel Queries

Persistence enables queries that would otherwise be difficult.

Examples:

What was the value
before yesterday's update?

What did the hierarchy
look like last month?

What was the state
before commit 500?

No reconstruction is necessary.

The historical version already exists.

The query simply starts from the corresponding root.

Partial Persistence vs Full Persistence

Most persistent segment trees are partially persistent.

Meaning:

Old versions:
read only

Newest version:
modifiable

Full persistence allows updates to any historical version.

Version 3
     \\n      Version 8

Version 3
     \\n      Version 9

The versions form a branching tree.

This resembles source-control systems such as Git.

The implementation becomes slightly more complex, but the underlying ideas remain the same.

Persistence and Memory

Persistence trades memory for historical access.

Each update copies:

O(log n)

nodes.

After:

m updates

memory usage becomes:

O(n + m log n)

This is dramatically smaller than:

O(mn)

which would result from copying the entire structure.

The sharing of unchanged subtrees is essential.

Complexity Analysis

Let:

  • n = number of elements
  • m = number of updates

Building:

Operation Complexity
Initial tree O(n)

Updates:

Operation Complexity
Persistent update O(log n)

Queries:

Operation Complexity
Range query O(log n)

Memory:

Structure Complexity
Initial tree O(n)
Each update O(log n)
Total O(n + m log n)

This efficiency makes persistence practical even for large systems.

Common Mistakes

A common mistake is accidentally modifying shared nodes.

For example:

node.Left = ...

changes every version that references that node.

Persistent structures must treat existing nodes as immutable.

Another mistake is copying entire subtrees unnecessarily.

Only nodes on the update path require duplication.

Everything else should be shared.

A third mistake is losing version roots.

Without the root pointer:

Version 17

becomes unreachable.

Persistent systems typically maintain an explicit version list.

Finally, many implementations underestimate memory usage. While persistence is efficient, millions of updates still create millions of additional nodes.

Takeaway

Persistent Trees preserve historical versions while supporting efficient updates and queries. By copying only the nodes along an update path and sharing unchanged subtrees, they provide time-travel capabilities with logarithmic update costs. This technique forms the foundation of immutable data structures, historical analytics, version-control systems, and many advanced range-query algorithms.