# DC3 Suffix Array Sort

# DC3 Suffix Array Sort

DC3 suffix array sort is the concrete formulation of the skew algorithm based on a difference cover modulo 3. It builds the suffix array in linear time by sorting a subset of suffixes, assigning them ranks, and merging with the remaining suffixes.

It is widely used in high-performance string processing systems.

## Problem

Given a string $S$ of length $n$, construct the suffix array:

$$
SA = \text{sorted indices of all suffixes of } S
$$

A sentinel character smaller than all others is appended to ensure all suffixes are comparable.

## Idea

Partition suffix indices into:

* $i \bmod 3 = 0$
* $i \bmod 3 = 1$
* $i \bmod 3 = 2$

Then:

1. sort suffixes at positions $i \bmod 3 \ne 0$ using triples
2. assign ranks to these suffixes
3. recursively solve if ranks are not unique
4. sort suffixes at positions $i \bmod 3 = 0$
5. merge both sorted groups

## Triples Sorting

For positions where $i \bmod 3 \ne 0$, compare suffixes using:

$$
(S[i], S[i+1], S[i+2])
$$

These triples are sorted with radix sort in linear time.

## Algorithm

```text id="d3k7xp"
dc3_suffix_array(S):
    append sentinel symbols

    construct list A of indices i where i mod 3 != 0

    sort A by triples (S[i], S[i+1], S[i+2]) using radix sort

    assign ranks to A

    if ranks not unique:
        recursively compute SA for reduced problem

    construct list B of indices i where i mod 3 == 0

    sort B using pairs (S[i], rank[i+1])

    merge A and B to form final SA
```

## Merge Comparison

During merge:

* if comparing $i \bmod 3 = 0$ with $j \bmod 3 = 1$:

$$
(S[i], \text{rank}[i+1])
$$

* if comparing $i \bmod 3 = 0$ with $j \bmod 3 = 2$:

$$
(S[i], S[i+1], \text{rank}[i+2])
$$

These comparisons fully determine suffix order.

## Example

Let:

$$
S = \text{"banana\$"}
$$

Suffix array:

$$
[6, 5, 3, 1, 0, 4, 2]
$$

This is produced after sorting non-multiples of 3, ranking them, sorting multiples of 3, and merging.

## Correctness

Sorting the subset of suffixes provides a correct ordering for those positions. Assigned ranks encode lexicographic order. Sorting the remaining suffixes uses these ranks to compare suffixes beyond the first few characters.

The merge step combines both lists while maintaining global order. Since every suffix is compared using sufficient prefix information and ranks, the final array is correctly sorted.

## Complexity

Let $n$ be the string length.

Time:

$$
O(n)
$$

Each stage uses linear-time radix sorting or linear merging.

Space:

$$
O(n)
$$

## Properties

| property         | value              |
| ---------------- | ------------------ |
| stable           | yes in radix steps |
| in place         | no                 |
| comparison based | no                 |
| time             | linear             |
| method           | divide and conquer |

## When to Use

Use DC3 when you need a theoretically optimal suffix array algorithm and are comfortable with its implementation complexity.

Avoid it in simpler applications where suffix array doubling or SA-IS provides easier implementations with acceptable performance.

## Notes

* Equivalent to skew algorithm
* Requires careful index handling
* Efficient in both theory and practice
* Often used in academic implementations

