Adaptive stable sorting algorithm that combines merge sort with run detection and insertion sort.
Timsort is a hybrid stable sorting algorithm designed for real world data. It combines ideas from merge sort and insertion sort, with strong emphasis on exploiting existing order.
It is the default sorting algorithm in languages such as Python and Java (for objects). Its design focuses on practical performance rather than purely theoretical bounds.
Problem
Given an array of length , reorder it such that:
while preserving stability.
Core Ideas
Timsort builds on three key ideas:
| idea | purpose |
|---|---|
| run detection | identify already sorted segments |
| insertion sort | efficiently extend short runs |
| merge strategy | merge runs with size invariants |
Run
A run is a monotonic sequence:
- increasing:
- or decreasing (reversed to increasing)
Short runs are extended to a minimum size using insertion sort.
Stack of Runs
Runs are pushed onto a stack. Timsort enforces size constraints between runs to ensure efficient merging.
Typical invariant:
for the top runs on the stack.
Algorithm
timsort(A):
runs = []
while not at end of A:
run = detect_run(A)
if length(run) < min_run:
extend run using insertion sort
push run onto stack
maintain_merge_invariants(runs)
merge all runs on stackMerge invariants ensure that runs are merged in a balanced way, avoiding worst case behavior.
Example
Let:
Detected runs:
- [1, 2, 3, 7]
- [6, 5] → reversed → [5, 6]
- [8, 9]
Stack:
| run | size |
|---|---|
| [1,2,3,7] | 4 |
| [5,6] | 2 |
| [8,9] | 2 |
Merging proceeds based on invariants:
Final result:
Correctness
Each run is sorted by construction. The merge step preserves order and stability. The stack invariants guarantee that merges occur in a structured way, preventing unbalanced recursion patterns. By induction over the merge process, the final array is sorted.
Complexity
| case | time |
|---|---|
| best (already sorted) | |
| worst | |
| average |
Space:
Properties
| property | value |
|---|---|
| stable | yes |
| adaptive | yes |
| in-place | no |
| practical performance | excellent |
Notes
Timsort is highly optimized for partially sorted data. It minimizes comparisons and memory movement in typical workloads.
A key optimization is galloping mode, where merging switches to exponential search when one run dominates comparisons.
This algorithm is widely used in standard libraries due to its strong real world performance.
Implementation
A full Timsort implementation is large. The following simplified version shows the structure without all optimizations.
def timsort(a):
min_run = 32
n = len(a)
def insertion_sort(left, right):
for i in range(left + 1, right + 1):
key = a[i]
j = i - 1
while j >= left and a[j] > key:
a[j + 1] = a[j]
j -= 1
a[j + 1] = key
size = min_run
for start in range(0, n, size):
end = min(start + size - 1, n - 1)
insertion_sort(start, end)
size = min_run
while size < n:
for left in range(0, n, 2 * size):
mid = min(left + size - 1, n - 1)
right = min(left + 2 * size - 1, n - 1)
if mid < right:
merge(a, left, mid, right)
size *= 2
return a
def merge(a, l, mid, r):
left = a[l:mid+1]
right = a[mid+1:r+1]
i = j = 0
k = l
while i < len(left) and j < len(right):
if left[i] <= right[j]:
a[k] = left[i]
i += 1
else:
a[k] = right[j]
j += 1
k += 1
while i < len(left):
a[k] = left[i]
i += 1
k += 1
while j < len(right):
a[k] = right[j]
j += 1
k += 1