Performance Tradeoffs

A data structure is never just a container.

It is also a set of tradeoffs.

When you choose between ArrayList, HashMap, StringHashMap, linked lists, queues, bit sets, ring buffers, and trees, you are choosing how your program spends memory and CPU time, how often it allocates, how well it uses the cache, and how much complexity you carry.

The right choice depends on what the program does most often.

Start with the Operations

Before choosing a collection, ask what operations matter.

Does the program mostly append?

Does it search by key?

Does it remove from the middle?

Does it need sorted order?

Does it need stable pointers?

Does it run with a fixed maximum size?

Does it allocate often?

Does it run in a tight loop?

A collection that is good for one operation may be poor for another.

For example, an ArrayList is excellent for appending and iteration, but removing from the front can be expensive. A hash map is excellent for key lookup, but it does not preserve order. A ring buffer is excellent for bounded queues, but it cannot grow unless you build resizing into it.

Big-O Is Only the First Approximation

You will often see data structures described with Big-O notation.

ArrayList append: usually O(1)
HashMap lookup: usually O(1)
Linked list insertion after node: O(1)
Binary search tree lookup: O(log n) if balanced

This is useful, but incomplete.

Big-O ignores constant factors, memory layout, allocation cost, branch prediction, cache behavior, and implementation details.

For real Zig programs, those details matter.

A linked list may have O(1) insertion, but each node may require a heap allocation and a pointer chase. An ArrayList may shift items during removal, but it stores memory densely, which is often much faster in practice.

Cache Locality

Modern CPUs are fast when data is close together.

An ArrayList stores items contiguously:

[10][20][30][40][50]

When the CPU reads 10, it often loads nearby values too. Reading 20, 30, and 40 can be cheap.

A linked list stores nodes wherever allocation placed them:

[10] -> somewhere else -> somewhere else -> somewhere else

Each pointer may jump to a different memory location.

This can be slow even if the algorithm looks good on paper.

For many workloads, a simple dense array beats a clever pointer-heavy structure.

Allocation Cost

Heap allocation is not free.

This is cheap:

try list.append(value);

most of the time, because ArrayList reuses capacity.

This may be expensive:

const node = try allocator.create(Node);

if it happens for every inserted item.

A linked list often performs one allocation per node.

An ArrayList performs occasional larger allocations.

A ring buffer with caller-provided storage may perform no allocation after setup.

This matters in parsers, network services, games, embedded systems, and hot loops.
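One common way to tame allocation cost is to reserve capacity up front. Here is a sketch, assuming the managed ArrayList API (allocator stored at init, as in the append call above; std versions differ):

```zig
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var list = std.ArrayList(u32).init(allocator);
    defer list.deinit();

    // Reserve capacity once, so the loop below never reallocates.
    try list.ensureTotalCapacity(1000);

    var i: u32 = 0;
    while (i < 1000) : (i += 1) {
        // appendAssumeCapacity skips the growth check entirely.
        list.appendAssumeCapacity(i);
    }
}
```

This moves the allocation out of the hot loop, which is often the difference that matters.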

Memory Overhead

Each collection has overhead.

An array stores only the elements:

[value][value][value]

A linked list node stores the value plus one or more pointers:

[value][next]

A doubly linked list stores even more:

[prev][value][next]

A hash map stores keys, values, metadata, empty slots, and extra capacity.

A bit set stores one bit per possible value, which is extremely compact when the domain is small.

Memory overhead matters when there are many items.

ArrayList Tradeoffs

ArrayList is often the best default collection.

It is good for:

append-heavy workloads
fast iteration
index access
compact storage
sorting
passing data as slices

It is weaker for:

frequent insertion at the front
frequent removal from the middle
stable pointers across growth
fast lookup by key

Appending is usually fast because capacity grows ahead of length.

But when the list grows, the internal buffer may move. That means pointers into list.items may become invalid after append.

Use ArrayList when you want dense, simple, fast storage.
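A minimal sketch of both sides of the tradeoff, assuming the managed ArrayList API:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var list = std.ArrayList(i32).init(allocator);
    defer list.deinit();

    try list.append(10);
    try list.append(20);

    // Dense storage: items is a plain slice, cheap to iterate or sort.
    for (list.items) |x| std.debug.print("{d} ", .{x});

    // Danger: this pointer may dangle if a later append grows the buffer.
    const first = &list.items[0];
    _ = first;
}
```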

HashMap Tradeoffs

A hash map is good for lookup by key.

It is good for:

id -> object
name -> value
path -> file info
word -> count
token -> token kind

It is weaker for:

ordered iteration
small collections
stable iteration order
low memory overhead
range queries

Hash maps usually use more memory than arrays. They need empty capacity to stay fast.

They also need hashing and equality checks.

For a tiny collection with 5 or 10 items, a simple ArrayList scan may be faster and simpler than a hash map.

Use a hash map when lookup dominates and the collection is large enough to justify it.
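For non-string keys, the id -> object pattern might look like this sketch (AutoHashMap derives hashing for integer keys):

```zig
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // id -> score
    var scores = std.AutoHashMap(u32, u32).init(allocator);
    defer scores.deinit();

    try scores.put(7, 100);
    try scores.put(9, 250);

    if (scores.get(9)) |score| {
        std.debug.print("score: {d}\n", .{score});
    }
}
```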

StringHashMap Tradeoffs

StringHashMap is a specialized hash map for string keys.

It compares string bytes, not pointer addresses.

It is good for:

configuration keys
headers
keyword tables
symbol tables
file paths
word counts

It is weaker when:

keys are temporary buffers
ownership is unclear
sorted output is required
memory use must be minimal

The most important issue is ownership.

If the map stores string literals, cleanup is simple.

If the map stores heap-allocated string copies, you must free those keys yourself.

deinit() frees the map table. It does not automatically free every heap allocation referenced by keys or values.
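The heap-allocated-keys case can be sketched like this (assumes the managed StringHashMap API; the cleanup walks the keys before deinit):

```zig
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var counts = std.StringHashMap(u32).init(allocator);
    defer {
        // deinit frees only the table; duped keys must be freed by hand.
        var it = counts.keyIterator();
        while (it.next()) |key| allocator.free(key.*);
        counts.deinit();
    }

    // Copy the key so the map does not point at a temporary buffer.
    const key = try allocator.dupe(u8, "hello");
    try counts.put(key, 1);
}
```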

Linked List Tradeoffs

Linked lists are useful, but they are often overused.

They are good for:

stable node addresses
splicing nodes
intrusive structures
allocator free lists
scheduler queues
cases where you already have node pointers

They are weak for:

fast iteration
index access
compact memory
simple ownership
CPU cache behavior

A linked list can remove a known node cheaply, but finding that node may require walking the list.

For beginner programs, use ArrayList first unless you have a specific reason to use a linked list.

Queue Tradeoffs

A queue can be implemented several ways.

An ArrayList with orderedRemove(0) is simple, but each removal shifts items.

An ArrayList with a head index avoids shifting, but old slots remain until compaction.

A ring buffer reuses fixed storage, but has fixed capacity.

A linked list queue can grow, but usually allocates per node.

Queue implementation          | Strength         | Weakness
ArrayList + orderedRemove(0)  | Very simple      | Shifts on every dequeue
ArrayList + head index        | No shifting      | Needs cleanup or compaction
Ring buffer                   | Fast and bounded | Fixed capacity
Linked list                   | Stable nodes     | Allocation and poor locality

For most bounded queues, a ring buffer is the cleanest choice.

For unknown growth, an ArrayList with a head index is often enough.
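The head-index approach can be sketched as a small wrapper (a hypothetical Queue type, not a std container; assumes the managed ArrayList API):

```zig
const std = @import("std");

// Items before `head` are logically removed but still occupy memory
// until the list is compacted or freed.
fn Queue(comptime T: type) type {
    return struct {
        list: std.ArrayList(T),
        head: usize = 0,

        const Self = @This();

        fn init(allocator: std.mem.Allocator) Self {
            return .{ .list = std.ArrayList(T).init(allocator) };
        }

        fn deinit(self: *Self) void {
            self.list.deinit();
        }

        fn push(self: *Self, value: T) !void {
            try self.list.append(value);
        }

        fn pop(self: *Self) ?T {
            if (self.head == self.list.items.len) return null;
            const value = self.list.items[self.head];
            self.head += 1; // no shifting; the old slot stays behind
            return value;
        }
    };
}
```

Each dequeue is O(1), at the cost of memory that is only reclaimed when the queue is compacted or destroyed.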

Bit Set Tradeoffs

A bit set is excellent when values are small integer indexes.

It is good for:

visited flags
permissions
feature flags
sets of enum values
graph algorithms
compiler analysis

It is weak for:

string keys
large sparse numeric domains
ordered insertion history
associated values

A bit set tells you whether an index is present.

It does not store a value for that index.

If you need:

key -> value

use a map.

If you only need:

is key present?

and the key is a small integer, a bit set can be much more compact.
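A membership sketch using std.StaticBitSet, which packs one bit per index:

```zig
const std = @import("std");

pub fn main() void {
    // 64 flags in a single machine word; no allocation at all.
    var visited = std.StaticBitSet(64).initEmpty();

    visited.set(3);
    visited.set(17);

    std.debug.print("3 visited? {}\n", .{visited.isSet(3)});
    std.debug.print("count: {d}\n", .{visited.count()});
}
```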

Ring Buffer Tradeoffs

A ring buffer is a queue with fixed circular storage.

It is good for:

bounded queues
audio streams
logging buffers
network buffers
embedded systems
producer-consumer handoff

It is weak when:

capacity is unknown
old items must never be dropped
random insertion is needed
sorted order is needed

Ring buffers are predictable. That is their main strength.

No surprise growth. No per-item allocation. No shifting.

The cost is that you must decide what happens when the buffer is full.

Reject new items, block, grow, or overwrite old items. The right answer depends on the program.
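A hand-rolled sketch of the reject-when-full policy, with no std dependency and no allocation (it uses modulo, so capacity need not be a power of two):

```zig
// A fixed-capacity ring buffer that rejects pushes when full.
fn RingBuffer(comptime T: type, comptime capacity: usize) type {
    return struct {
        items: [capacity]T = undefined,
        head: usize = 0, // next slot to read
        len: usize = 0,

        const Self = @This();

        fn push(self: *Self, value: T) bool {
            if (self.len == capacity) return false; // policy: reject when full
            self.items[(self.head + self.len) % capacity] = value;
            self.len += 1;
            return true;
        }

        fn pop(self: *Self) ?T {
            if (self.len == 0) return null;
            const value = self.items[self.head];
            self.head = (self.head + 1) % capacity;
            self.len -= 1;
            return value;
        }
    };
}
```

Swapping the `return false` for an overwrite of the oldest slot gives the overwrite policy instead.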

Tree Tradeoffs

Trees are useful when order and hierarchy matter.

They are good for:

sorted data
range queries
hierarchical data
syntax trees
file trees
indexes

They are weak when:

only direct lookup matters
tree balance is not maintained
ownership is complicated
iteration order does not matter

A plain binary search tree can become unbalanced.

If values are inserted in sorted order:

1, 2, 3, 4, 5

the tree may become a chain:

1
 \
  2
   \
    3
     \
      4
       \
        5

Then lookup becomes slow.

Production ordered maps usually use balanced trees or other index structures.

Small Data Often Wants Simple Code

For small collections, simple code often wins.

Example:

for (items) |item| {
    if (item.id == target_id) return item;
}

This linear scan may be perfectly fine for 10 or 20 items.

A hash map may add complexity, allocation, and ownership issues without improving real performance.

Do not choose a complex structure just because it has better theoretical lookup.

Measure when it matters.

Stable Pointers

Some collections move their elements.

ArrayList may move items when it grows.

Hash maps may move entries when they resize.

Ring buffers may overwrite slots.

Linked lists and heap-allocated trees usually keep node addresses stable until the node is destroyed.

This matters when other parts of your program store pointers to items.

If you need stable addresses, consider:

heap-allocated nodes
object pools
indices instead of pointers
handles with generation counters

In Zig, invalid pointers are a serious bug. Design around pointer lifetime explicitly.
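The indices-instead-of-pointers option can be sketched like this (assumes the managed ArrayList API):

```zig
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var list = std.ArrayList(u32).init(allocator);
    defer list.deinit();
    try list.append(10);

    // Store the index, not a pointer: it survives reallocation.
    const idx: usize = 0;

    try list.append(20); // may move the buffer; idx is still valid
    std.debug.print("{d}\n", .{list.items[idx]});
}
```

An index stays meaningful as long as the element is not removed or reordered; a raw pointer can dangle after any growth.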

Ownership and Cleanup Cost

Performance is not only speed.

A collection with unclear ownership costs you debugging time.

Ask:

Who owns the elements?
Who frees the keys?
Who frees the values?
Can insertion fail halfway?
Can replacement leak memory?
Can removal leave dangling pointers?

Sometimes a slightly slower collection with clear ownership is better than a faster one with fragile lifetime rules.

Zig rewards explicit ownership.

A Practical Decision Guide

Use ArrayList when you need a growable list, fast iteration, index access, or a slice.

Use HashMap when you need fast lookup by non-string keys.

Use StringHashMap when you need fast lookup by string keys.

Use a bit set when you need compact membership for small integer indexes.

Use a ring buffer when you need a bounded queue.

Use a linked list when you need stable nodes or intrusive structures.

Use a tree when you need hierarchy, sorted order, or range queries.

Common Beginner Mistakes

The first mistake is choosing by name instead of workload.

The second mistake is using a hash map for tiny data where a slice scan would be simpler.

The third mistake is using a linked list for general storage, then getting poor iteration performance.

The fourth mistake is forgetting that fast lookup can cost memory.

The fifth mistake is ignoring allocation behavior.

The sixth mistake is keeping pointers into containers that may resize.

A Useful Mental Model

A collection has a shape.

An ArrayList is a row.

[ ][ ][ ][ ]

A hash map is a lookup table.

key -> value

A linked list is a chain.

[ ] -> [ ] -> [ ]

A ring buffer is a circle.

0 -> 1 -> 2 -> 3 -> 0

A tree is a hierarchy.

      root
     /    \
  child  child

The shape determines the cost.

Summary

Performance tradeoffs are about matching the data structure to the workload.

There is no best collection for every case.

ArrayList is the best default for many beginner programs because it is simple, dense, and fast to iterate.

Hash maps are good for lookup.

Bit sets are good for compact membership.

Ring buffers are good for bounded queues.

Linked lists are useful when stable nodes and splicing matter.

Trees are useful for hierarchy and ordered data.

Choose the simplest structure that matches the operations your program actually performs.