Skip to content

Sparse Array

Store only non-default values from a large logical array.

A sparse array represents a large logical array by storing only positions whose values differ from a default value. Most entries are implicit.

You use it when the index space is large but only a small number of positions contain meaningful data.

Problem

Given a logical array AA of length nn, support reading and writing index ii where:

0i<n 0 \le i < n

Most values are equal to a default value, such as 00.

Structure

A sparse array stores:

  • length
  • default value
  • map from index to stored value

For example, the logical array:

[0,0,8,0,0,3,0] [0, 0, 8, 0, 0, 3, 0]

can be stored as:

indexvalue
28
53

All missing indices return the default value.

Algorithm

Read an index:

get(S, i):
    if i < 0 or i >= S.length:
        error "index out of bounds"

    if i exists in S.values:
        return S.values[i]

    return S.default

Write an index:

set(S, i, x):
    if i < 0 or i >= S.length:
        error "index out of bounds"

    if x == S.default:
        remove i from S.values
    else:
        S.values[i] = x

Example

Let the sparse array have length 77 and default value 00.

Apply:

S[2]=8 S[2] = 8

and

S[5]=3 S[5] = 3

Stored entries:

indexvalue
28
53

Query results:

queryresult
S[2]8
S[4]0
S[5]3

Correctness

The map stores exactly the indices whose values differ from the default. If an index exists in the map, returning the mapped value gives the explicitly assigned value. If it does not exist, the index has no stored override, so returning the default value gives the logical array value.

When setting an index to the default value, removing the index preserves this invariant.

Complexity

operationexpected time
getO(1)O(1)
setO(1)O(1)
remove default valueO(1)O(1)

Space usage is:

O(m) O(m)

where mm is the number of non-default entries.

When to Use

Sparse arrays are appropriate when:

  • the logical length is large
  • most values are default values
  • random access is needed
  • memory usage matters more than contiguous layout

They are less suitable when most indices are populated. In that case, a dense array is simpler and faster.

Implementation

class SparseArray:
    def __init__(self, length, default=0):
        self.length = length
        self.default = default
        self.values = {}

    def get(self, i):
        if i < 0 or i >= self.length:
            raise IndexError("index out of bounds")
        return self.values.get(i, self.default)

    def set(self, i, x):
        if i < 0 or i >= self.length:
            raise IndexError("index out of bounds")

        if x == self.default:
            self.values.pop(i, None)
        else:
            self.values[i] = x
type SparseArray[T comparable] struct {
    length  int
    zero    T
    values  map[int]T
}

func NewSparseArray[T comparable](length int, zero T) *SparseArray[T] {
    return &SparseArray[T]{
        length: length,
        zero:   zero,
        values: make(map[int]T),
    }
}

func (s *SparseArray[T]) Get(i int) (T, bool) {
    if i < 0 || i >= s.length {
        var zero T
        return zero, false
    }

    v, ok := s.values[i]
    if ok {
        return v, true
    }

    return s.zero, true
}

func (s *SparseArray[T]) Set(i int, x T) bool {
    if i < 0 || i >= s.length {
        return false
    }

    if x == s.zero {
        delete(s.values, i)
    } else {
        s.values[i] = x
    }

    return true
}