# 2.16 Run-Length Encoding

Run-Length Encoding (RLE) compresses sequences by replacing consecutive equal values with a pair `(value, count)`. It is effective when the input contains long runs of the same element and serves as a simple pre-processing step for more complex compression or for speeding up certain algorithms.

## Problem

Given a sequence, produce its run-length encoding.

Example:

```text
[1, 1, 1, 2, 2, 3, 1, 1]
```

Encoding:

```text
[(1, 3), (2, 2), (3, 1), (1, 2)]
```

## Encoding

Scan the array once, tracking the current run.

```go
type Run struct {
	Value int
	Count int
}

func RLEncode(a []int) []Run {
	if len(a) == 0 {
		return nil
	}

	var out []Run
	cur := a[0]
	cnt := 1

	for i := 1; i < len(a); i++ {
		if a[i] == cur {
			cnt++
		} else {
			out = append(out, Run{cur, cnt})
			cur = a[i]
			cnt = 1
		}
	}

	out = append(out, Run{cur, cnt})
	return out
}
```

### Invariant

Before each iteration:

```text
cur and cnt describe the run of equal values ending at index i-1
out contains all completed runs before that
```

When a new value appears, the current run is closed and appended.

## Decoding

Reconstruct the original sequence.

```go
func RLDecode(runs []Run) []int {
	var out []int

	for _, r := range runs {
		for i := 0; i < r.Count; i++ {
			out = append(out, r.Value)
		}
	}

	return out
}
```

Invariant:

```text
After processing runs[0:k], out contains exactly the expansion of those runs
```

## String Encoding

For strings, store `(character, count)` pairs.

```go
type CharRun struct {
	Ch    byte
	Count int
}

func RLEncodeString(s string) []CharRun {
	if len(s) == 0 {
		return nil
	}

	var out []CharRun
	cur := s[0]
	cnt := 1

	for i := 1; i < len(s); i++ {
		if s[i] == cur {
			cnt++
		} else {
			out = append(out, CharRun{cur, cnt})
			cur = s[i]
			cnt = 1
		}
	}

	out = append(out, CharRun{cur, cnt})
	return out
}
```

For Unicode-aware encoding, operate on `[]rune` instead of bytes.

## Compressed String Form

Sometimes runs are stored as a string.

Example:

```text
"aaabbc" -> "a3b2c1"
```

```go
func RLECompress(s string) string {
	if len(s) == 0 {
		return ""
	}

	var out []byte
	cur := s[0]
	cnt := 1

	for i := 1; i < len(s); i++ {
		if s[i] == cur {
			cnt++
		} else {
			out = append(out, cur)
			out = append(out, []byte(strconv.Itoa(cnt))...)
			cur = s[i]
			cnt = 1
		}
	}

	out = append(out, cur)
	out = append(out, []byte(strconv.Itoa(cnt))...)

	return string(out)
}
```

Requires:

```go
import "strconv"
```

## Complexity

```text
Time:  O(n)
Space: O(k)
```

where `k` is the number of runs.

In the worst case (all elements different), `k = n`, so no compression is achieved.

## When RLE Is Effective

RLE works well when:

```text
long runs of identical values exist
```

Examples:

```text
binary images
monochrome data
repeated logs
sorted arrays with duplicates
```

It performs poorly on random data:

```text
[1, 2, 3, 4, 5] -> [(1,1),(2,1),(3,1),(4,1),(5,1)]
```

No compression, overhead increases.

## Using RLE in Algorithms

RLE can simplify problems by reducing input size.

### Example: Counting Groups

Instead of scanning all elements, count runs:

```go
func CountRuns(a []int) int {
	if len(a) == 0 {
		return 0
	}

	count := 1
	for i := 1; i < len(a); i++ {
		if a[i] != a[i-1] {
			count++
		}
	}
	return count
}
```

Equivalent to `len(RLEncode(a))`.

### Example: Maximum Run Length

```go
func MaxRun(a []int) int {
	if len(a) == 0 {
		return 0
	}

	best := 1
	cur := 1

	for i := 1; i < len(a); i++ {
		if a[i] == a[i-1] {
			cur++
			if cur > best {
				best = cur
			}
		} else {
			cur = 1
		}
	}

	return best
}
```

## In-Place RLE (Overwrite Input)

Some problems require writing compressed output into the same array.

Example: compress a byte slice.

```go
func CompressInPlace(a []byte) int {
	write := 0
	read := 0

	for read < len(a) {
		start := read

		for read < len(a) && a[read] == a[start] {
			read++
		}

		a[write] = a[start]
		write++

		count := read - start
		if count > 1 {
			for _, c := range strconv.Itoa(count) {
				a[write] = byte(c)
				write++
			}
		}
	}

	return write
}
```

Return value is the new length.

## Common Pitfalls

Forgetting to append the final run after the loop.

Mixing byte-based and rune-based processing for strings.

Using RLE when input has no repetition increases size instead of reducing it.

Incorrectly handling single-element runs when encoding to compact string form.

Overflow of count when runs are extremely long.

## Design Pattern

RLE uses a simple pattern:

1. Track current value
2. Count consecutive occurrences
3. Emit when value changes

The invariant always describes the current run.

## Takeaway

Run-length encoding compresses consecutive duplicates into compact `(value, count)` pairs. It is simple, linear, and effective when data contains repetition. Even when not used for compression, it provides a useful abstraction for reasoning about contiguous groups.

