7.18 Skip Lists

Balanced trees provide logarithmic performance through carefully maintained structure. AVL trees track heights. Red-black trees track colors. B-trees manage node occupancy. All of these structures require specialized balancing logic.

Skip lists achieve similar performance with a radically different idea.

Instead of maintaining a balanced tree, a skip list maintains multiple linked lists layered on top of one another. Higher levels contain progressively fewer elements, allowing searches to skip large portions of the data.

The result is a surprisingly simple probabilistic data structure that offers expected logarithmic performance while being much easier to implement than many balanced trees.

Skip lists are widely used in databases, caches, distributed systems, concurrent data structures, and storage engines.

Problem

Suppose values are stored in a sorted linked list.

10 -> 20 -> 30 -> 40 -> 50 -> 60 -> 70

Searching for:

requires:

Six comparisons.

Complexity:

O(n)

The list preserves ordering but provides no efficient way to skip ahead.

We need a structure that maintains linked-list simplicity while improving search performance.

Solution

Add shortcut layers.

Level 0 contains every value.

10 -> 20 -> 30 -> 40 -> 50 -> 60 -> 70

Level 1 contains selected values.

10 ------> 30 ------> 50 ------> 70

Level 2 contains even fewer values.

10 -----------------> 50

Searching starts at the highest level.

Large portions of the list are skipped immediately.

The structure resembles express lanes on a highway.

Visualizing a Skip List

Example:

Level 2:
10 ----------------------> 50

Level 1:
10 ------> 30 ------> 50 ------> 70

Level 0:
10 -> 20 -> 30 -> 40 -> 50 -> 60 -> 70

Search for:

Path:

10
50
60

Rather than visiting every node, the search jumps across large sections.

This produces logarithmic expected behavior.

Node Structure

A skip-list node stores multiple forward pointers.

type Node struct {
	Key     int
	Forward []*Node
}

Example:

Node 50

Level 2 -> nil
Level 1 -> 70
Level 0 -> 60

Each node participates in one or more levels.

The number of levels varies per node.

Levels and Probability

When a new node is inserted, its height is chosen randomly.

Example probabilities:

Level 1: 100%
Level 2: 50%
Level 3: 25%
Level 4: 12.5%

Each additional level is obtained by flipping a coin.

Heads -> add another level
Tails -> stop

This creates a pyramid-like structure naturally.

Most nodes appear only at the bottom.

A few appear at higher levels.

Very few reach the top.

Why Random Heights Work

Suppose:

1,000,000 nodes

Expected counts:

Level 1: 1,000,000
Level 2:   500,000
Level 3:   250,000
Level 4:   125,000

and so on.

Each level contains roughly half as many nodes as the previous one.

This mirrors the search-space reduction achieved by balanced trees.

The structure emerges automatically from randomness rather than explicit balancing.

Searching

Search begins at the highest level.

Consider:

Level 2:
10 -----------------> 50

Level 1:
10 ------> 30 ------> 50 ------> 70

Level 0:
10 -> 20 -> 30 -> 40 -> 50 -> 60 -> 70

Search for:

Start at:

Move right until moving further would exceed the target.

Drop one level.

Move right.

Found.

The search repeatedly moves:

right
down
right
down

until the target is located.

Search Algorithm

Simplified implementation:

func Search(
	head *Node,
	key int,
	maxLevel int,
) bool {

	current := head

	for level := maxLevel - 1;
		level >= 0;
		level-- {

		for current.Forward[level] != nil &&
			current.Forward[level].Key < key {

			current =
				current.Forward[level]
		}
	}

	current = current.Forward[0]

	return current != nil &&
		current.Key == key
}

The algorithm resembles traversing a grid.

Horizontal movement skips values.

Vertical movement refines the search.

Insertion

Insertion has two phases.

Phase 1

Locate insertion positions at every level.

Example:

10 -> 20 -> 40 -> 50

Insert:

Search determines where the new node belongs.

Phase 2

Generate a random level.

Example:

Level 2

Insert the node into:

Level 0
Level 1

Pointer updates connect the new node into each level.

No rotations are required.

No balance factors are computed.

No colors are tracked.

Example Insertion

Initial:

Level 1:
10 ------> 50

Level 0:
10 -> 20 -> 50

Insert:

Randomly assigned:

Level 2

Result:

Level 1:
10 -> 30 -> 50

Level 0:
10 -> 20 -> 30 -> 50

The structure remains ordered automatically.

Deletion

Deletion is similarly straightforward.

Locate the node.

Update pointers at every level where the node appears.

Example:

Level 1:
10 -> 30 -> 50

Level 0:
10 -> 20 -> 30 -> 50

Delete:

Result:

Level 1:
10 ------> 50

Level 0:
10 -> 20 -> 50

No rebalancing is required.

Randomization handles long-term structure.

Skip Lists and Binary Search

A useful intuition is that each higher level acts as a progressively coarser index.

Level 3
Level 2
Level 1
Level 0

This resembles repeatedly halving the search space.

A balanced tree does this vertically.

A skip list does it horizontally.

Both achieve:

O(log n)

expected search performance.

Concurrent Systems

One reason skip lists are popular is concurrency.

Rotations in balanced trees can be difficult to coordinate across multiple threads.

Skip lists require mostly local pointer updates.

This makes lock-free and highly concurrent implementations easier.

Several high-performance systems exploit this property.

Examples include:

Concurrent key-value stores
In-memory databases
Scheduling systems
Distributed coordination services

Skip Lists in Databases

Several storage engines use skip lists as in-memory indexes.

Reasons include:

Simple implementation
Good cache behavior
Easy concurrency support
Expected logarithmic performance

A skip list often serves as the mutable in-memory layer while persistent data resides in B-trees or sorted files.

This combination appears in modern LSM-tree storage engines.

Skip Lists Versus AVL Trees

Feature	Skip List	AVL Tree
Balance	Randomized	Deterministic
Rotations	None	Required
Search	Expected O(log n)	O(log n)
Insert	Expected O(log n)	O(log n)
Delete	Expected O(log n)	O(log n)
Implementation	Simple	Moderate

AVL trees provide stronger guarantees.

Skip lists provide simpler code.

Skip Lists Versus Red-Black Trees

Feature	Skip List	Red-Black Tree
Balancing	Randomized	Color rules
Height maintenance	Automatic	Explicit
Worst-case guarantee	No	Yes
Concurrency	Excellent	More difficult
Code complexity	Lower	Higher

Many engineers choose skip lists specifically because of implementation simplicity.

Skip Lists Versus B-Trees

Feature	Skip List	B-Tree
Storage optimization	Moderate	Excellent
Memory usage	Higher	Lower
Disk access efficiency	Moderate	Excellent
Range scans	Good	Excellent
In-memory workloads	Excellent	Good

B-trees dominate storage indexes.

Skip lists excel as in-memory structures.

Expected Complexity

With probability:

p = 0.5

expected height becomes:

O(log n)

Expected operations:

Operation	Complexity
Search	O(log n)
Insert	O(log n)
Delete	O(log n)

Space complexity:

O(n)

with a modest constant-factor overhead for additional pointers.

Common Mistakes

A common misconception is that skip lists are balanced trees. They are not trees at all. They are layered linked lists.

Another mistake is expecting strict worst-case guarantees. Skip lists provide expected performance through probability rather than deterministic balancing.

A third mistake is using poor random-number generation. The level distribution directly affects performance.

Finally, some implementations choose a maximum level that is too small. The upper levels must be large enough to support the expected collection size.

Takeaway

Skip lists achieve expected logarithmic performance through probabilistic layering rather than explicit balancing. By augmenting a sorted linked list with progressively sparser shortcut levels, they create a structure that behaves similarly to a balanced search tree while remaining conceptually simple. Their ease of implementation, strong practical performance, and suitability for concurrent environments have made them a popular alternative to traditional balanced trees in modern systems software.