# Skew Suffix Array Sort (DC3)

# Skew Suffix Array Sort (DC3)

DC3, also called skew algorithm, builds a suffix array in linear time by splitting suffix positions into residue classes modulo 3 and recursively sorting a subset.

It reduces the full suffix ordering problem into a smaller problem, then merges results using rank comparisons.

## Problem

Given a string $T$ of length $n$, construct its suffix array:

$$
SA[0 \dots n-1]
$$

such that all suffixes are sorted lexicographically.

## Key Idea

Split suffixes into three groups:

| group | positions   |
| ----- | ----------- |
| $S_0$ | i mod 3 = 0 |
| $S_1$ | i mod 3 = 1 |
| $S_2$ | i mod 3 = 2 |

Sort $S_1 \cup S_2$ recursively, then induce order for $S_0$.

## Algorithm

### Step 1: Form triplets

For each position $i$:

$$
(T[i], T[i+1], T[i+2])
$$

### Step 2: Sort triplets

Sort all positions where $i \mod 3 \neq 0$.

### Step 3: Assign ranks

Assign lexicographic ranks to triplets.

### Step 4: Recursive step

If ranks are not unique, recurse on reduced string.

### Step 5: Sort $S_0$

Use sorted $S_1$ and $S_2$ to sort $S_0$ using rank comparisons.

### Step 6: Merge

Merge sorted $S_0$ with sorted $S_1 \cup S_2$.

```text id="dc30k1"
dc3(T):
    split positions into S1 and S2

    sort S1 ∪ S2 by triplets

    assign ranks

    if ranks not unique:
        recurse

    sort S0 using ranks

    merge S0 and S1∪S2

    return SA
```

## Example

Text:

$$
T = "banana"
$$

Positions:

|  i | char |
| -: | ---- |
|  0 | b    |
|  1 | a    |
|  2 | n    |
|  3 | a    |
|  4 | n    |
|  5 | a    |

Groups:

| group | positions |
| ----- | --------- |
| S0    | 0, 3      |
| S1    | 1, 4      |
| S2    | 2, 5      |

After sorting and ranking, DC3 merges all suffixes into:

| SA index | suffix |
| -------: | ------ |
|        0 | a      |
|        1 | ana    |
|        2 | anana  |
|        3 | banana |
|        4 | na     |
|        5 | nana   |

## Correctness

DC3 works by reducing suffix comparison to fixed length tuples and ensuring lexicographic order through recursive ranking.

Suffixes in $S_1 \cup S_2$ are sorted recursively, giving stable ranks. These ranks act as compressed keys for comparing suffixes in $S_0$.

The final merge step preserves ordering because each suffix comparison is reduced to rank comparison or direct character comparison in a bounded window.

Thus the output is a correct suffix array.

## Complexity

Let $n = |T|$.

| phase             | cost   |
| ----------------- | ------ |
| triplet formation | $O(n)$ |
| recursive sorting | $O(n)$ |
| merging           | $O(n)$ |

Total time:

$$
O(n)
$$

Space complexity:

$$
O(n)
$$

## When to Use

DC3 is useful when:

* linear time suffix array construction is required
* SA-IS is not preferred due to implementation complexity
* theoretical guarantees are important
* working with academic or algorithmic research systems

In practice, SA-IS is often faster and simpler, but DC3 remains a classic linear construction method.

## Implementation

```python id="dc31m7"
def dc3(T):
    # Simplified structure of DC3 algorithm

    def sort_triplets(T, positions):
        pass

    def assign_ranks(sorted_positions):
        pass

    def merge(sa0, sa12):
        pass

    S1S2 = [i for i in range(len(T)) if i % 3 != 0]
    SA12 = sort_triplets(T, S1S2)
    ranks = assign_ranks(SA12)

    if len(set(ranks)) != len(ranks):
        SA12 = dc3(ranks)

    SA0 = []

    SA = merge(SA0, SA12)

    return SA
```

```go id="dc32v8"
func DC3(text []int) []int {
	// Simplified skeleton

	sortTriplets := func() []int { return nil }
	assignRanks := func([]int) []int { return nil }
	merge := func([]int, []int) []int { return nil }

	s1s2 := []int{}

	for i := 0; i < len(text); i++ {
		if i%3 != 0 {
			s1s2 = append(s1s2, i)
		}
	}

	sa12 := sortTriplets()
	ranks := assignRanks(sa12)

	_ = ranks

	sa0 := []int{}

	sa := merge(sa0, sa12)

	return sa
}
```

