# Array Padding

# Array Padding

Array padding inserts unused bytes between elements or groups of elements. It adjusts layout to satisfy alignment constraints or to avoid contention such as false sharing.

You use it when memory layout affects performance or correctness in concurrent settings.

## Problem

Given elements of size $s$ and required alignment or spacing $p$, place elements so that consecutive elements are separated by a stride:

$$
stride \ge s
$$

and often:

$$
stride \bmod p = 0
$$

## Structure

A padded array uses a stride larger than the element size.

Logical index $i$ maps to:

$$
address(A[i]) = base(A) + i \cdot stride
$$

Unused bytes in each stride form padding.

## Algorithm

Compute padded addresses and access elements.

```id="array-padding-address"
address(base, stride, i):
    return base + i * stride
```

Access with padding:

```id="array-padding-get"
get(A, base, stride, i):
    if i < 0 or i >= length(A):
        error "index out of bounds"

    p = base + i * stride
    return load(p)
```

## Example

Suppose:

| field           |    value |
| --------------- | -------: |
| element size    |  8 bytes |
| desired spacing | 64 bytes |

Then:

$$
stride = 64
$$

Addresses:

| index |    address |
| ----: | ---------: |
|     0 |       base |
|     1 |  base + 64 |
|     2 | base + 128 |
|     3 | base + 192 |

Each element occupies 8 bytes, with 56 bytes of padding.

## Correctness

Each logical index maps to a unique physical region starting at $base + i \cdot stride$. Since $stride \ge s$, the memory regions for distinct indices do not overlap.

Therefore, reading or writing at index $i$ accesses only that element's storage.

## Complexity

| operation |   time |
| --------- | -----: |
| get       | $O(1)$ |
| set       | $O(1)$ |

Space usage:

$$
O(n \cdot stride)
$$

which may be significantly larger than $O(n \cdot s)$.

## When to Use

Array padding is appropriate when:

* alignment requirements must be enforced
* avoiding false sharing between threads
* improving cache line utilization for independent data
* interfacing with hardware or SIMD requirements

It is less suitable when:

* memory usage must be minimal
* large padding wastes too much space
* data is accessed sequentially and compact layout is preferable

## Implementation

```python id="array-padding-python"
class PaddedArray:
    def __init__(self, n, stride):
        self.n = n
        self.stride = stride
        self.data = bytearray(n * stride)

    def address(self, i):
        if i < 0 or i >= self.n:
            raise IndexError("index out of bounds")
        return i * self.stride

    def get_bytes(self, i, size):
        p = self.address(i)
        return self.data[p:p+size]

    def set_bytes(self, i, value_bytes):
        p = self.address(i)
        self.data[p:p+len(value_bytes)] = value_bytes
```

```go id="array-padding-go"
type PaddedArray struct {
    data   []byte
    stride int
    n      int
}

func NewPaddedArray(n, stride int) *PaddedArray {
    return &PaddedArray{
        data:   make([]byte, n*stride),
        stride: stride,
        n:      n,
    }
}

func (p *PaddedArray) address(i int) (int, bool) {
    if i < 0 || i >= p.n {
        return 0, false
    }
    return i * p.stride, true
}

func (p *PaddedArray) Get(i int, size int) ([]byte, bool) {
    addr, ok := p.address(i)
    if !ok {
        return nil, false
    }
    return p.data[addr : addr+size], true
}

func (p *PaddedArray) Set(i int, value []byte) bool {
    addr, ok := p.address(i)
    if !ok {
        return false
    }
    copy(p.data[addr:], value)
    return true
}
```

