# DC7 Suffix Array Sort

# DC7 Suffix Array Sort

DC7 is a generalization of DC3 suffix array construction. It splits suffix positions into residue classes modulo 7 and recursively sorts a subset of suffixes, then induces the remaining order.

The goal is to reduce comparison depth by encoding suffixes into fixed length tuples and merging sorted partitions.

## Problem

Given a string $T$ of length $n$, construct its suffix array:

$$
SA[0 \dots n-1]
$$

such that suffixes are sorted lexicographically.

## Key Idea

Partition suffix positions by modulo 7:

| group | positions   |
| ----- | ----------- |
| $S_0$ | i mod 7 = 0 |
| $S_1$ | i mod 7 = 1 |
| $S_2$ | i mod 7 = 2 |
| $S_3$ | i mod 7 = 3 |
| $S_4$ | i mod 7 = 4 |
| $S_5$ | i mod 7 = 5 |
| $S_6$ | i mod 7 = 6 |

DC7 sorts a subset of these positions recursively, typically all non-zero classes, then reconstructs full ordering.

## Algorithm

### Step 1: Build tuples

Each position is represented by a fixed length tuple of characters:

$$
(T[i], T[i+1], \dots, T[i+6])
$$

### Step 2: Sort selected groups

Sort positions in selected residue classes using radix or comparison sort.

### Step 3: Assign ranks

Convert sorted tuples into integer ranks.

### Step 4: Recursive reduction

If ranks are not unique, recursively apply DC7 to the reduced problem.

### Step 5: Induce remaining suffixes

Use sorted subset to infer ordering of remaining suffixes.

### Step 6: Merge

Combine all residue classes into final suffix array.

```text id="dc70k1"
dc7(T):
    split positions by mod 7

    build 7-length tuples

    sort selected residue classes

    assign ranks

    if ranks not unique:
        recurse on reduced string

    induce remaining suffixes

    merge all groups

    return SA
```

## Example

Text:

$$
T = "banana"
$$

Positions:

|  i | char |
| -: | ---- |
|  0 | b    |
|  1 | a    |
|  2 | n    |
|  3 | a    |
|  4 | n    |
|  5 | a    |

Residue groups:

| group | positions |
| ----- | --------- |
| S0    | 0         |
| S1    | 1         |
| S2    | 2         |
| S3    | 3         |
| S4    | 4         |
| S5    | 5         |
| S6    | none      |

After tuple ranking and merging, the final suffix array becomes:

| SA index | suffix |
| -------: | ------ |
|        0 | a      |
|        1 | ana    |
|        2 | anana  |
|        3 | banana |
|        4 | na     |
|        5 | nana   |

## Correctness

DC7 extends the DC3 principle: suffixes are encoded into fixed length tuples that preserve lexicographic order locally.

Recursive ranking ensures uniqueness of comparisons. Induced sorting reconstructs global order by ensuring that every suffix is placed relative to already sorted representatives.

The merge step preserves correctness because all comparisons reduce to rank comparisons or bounded character comparisons over fixed windows.

Thus the final array is a correct suffix array.

## Complexity

Let $n = |T|$.

| phase               | cost   |
| ------------------- | ------ |
| tuple construction  | $O(n)$ |
| recursive sorting   | $O(n)$ |
| induction and merge | $O(n)$ |

Total time:

$$
O(n)
$$

Space complexity:

$$
O(n)
$$

## When to Use

DC7 is primarily theoretical and used for:

* generalized divide and conquer suffix array construction
* research into linear time string algorithms
* extensions of DC3 style methods
* experimenting with partition sizes beyond 3

In practice, SA-IS is preferred for simplicity and performance.

## Implementation

```python id="dc71m7"
def dc7(T):
    # Simplified DC7 skeleton

    def build_tuples(T, k=7):
        pass

    def sort_groups(groups):
        pass

    def assign_ranks(sorted_groups):
        pass

    def induce_all(groups):
        pass

    groups = build_tuples(T)

    sorted_groups = sort_groups(groups)

    ranks = assign_ranks(sorted_groups)

    if len(set(ranks)) != len(ranks):
        ranks = dc7(ranks)

    sa = induce_all(ranks)

    return sa
```

```go id="dc72v9"
func DC7(text []int) []int {
	// Simplified skeleton

	buildTuples := func() []int { return nil }
	sortGroups := func([]int) []int { return nil }
	assignRanks := func([]int) []int { return nil }
	induce := func([]int) []int { return nil }

	groups := buildTuples()
	sorted := sortGroups(groups)
	ranks := assignRanks(sorted)

	_ = ranks

	sa := induce(ranks)

	return sa
}
```

