13.9 Longest Increasing Subsequence
The Longest Increasing Subsequence (LIS) problem is one of the most important sequence optimization problems in algorithm design.
13.9 Longest Increasing Subsequence
The Longest Increasing Subsequence (LIS) problem is one of the most important sequence optimization problems in algorithm design. At first glance, the problem appears deceptively simple: find the longest subsequence whose elements appear in strictly increasing order. Yet beneath this simplicity lies a rich collection of dynamic programming techniques, optimization strategies, and algorithmic insights.
LIS is particularly valuable because it demonstrates how dynamic programming can be used to reason about relationships between positions in a sequence rather than prefixes of a sequence. Unlike the Longest Common Subsequence problem, which compares two strings, LIS operates entirely within a single sequence. The resulting state space is smaller, but the transition structure is often more subtle.
The problem also serves as an excellent example of algorithmic refinement. A straightforward dynamic programming solution requires quadratic time. With additional observations and data structures, the same problem can be solved in (O(n \log n)) time. Understanding both approaches provides insight into how dynamic programming and data structures frequently complement one another.
Problem
Given an array:
10 9 2 5 3 7 101 18
find the length of the longest strictly increasing subsequence.
One valid solution is:
2 3 7 18
The answer length is:
4
Elements do not need to be adjacent.
Only relative order must be preserved.
Understanding the Search Space
A brute-force approach examines every subsequence.
For an array of length:
$$ n $$
there are:
$$ 2^n $$
possible subsequences.
Most are irrelevant.
Many violate the increasing-order requirement.
Even for moderate values of (n), exhaustive search becomes impractical.
The challenge is to exploit overlapping structure among partial solutions.
Designing the State
The key observation is that an increasing subsequence ends at a specific position.
Define:
$$ dp[i] $$
as:
The length of the longest increasing subsequence ending at position (i).
For the array:
10 9 2 5 3 7
the state:
dp[3]
asks:
Longest increasing subsequence
that must end at value 5
This definition transforms a global optimization problem into a collection of local optimization problems.
Understanding the Transition
Consider position:
$$ i $$
with value:
$$ a[i] $$
Which earlier positions can precede it?
Any position:
$$ j < i $$
such that:
$$ a[j] < a[i] $$
For each valid predecessor:
$$ dp[j] $$
can be extended by the current element.
Therefore:
$$ dp[i] = 1 + \max(dp[j]) $$
for all valid predecessors.
If no predecessor exists:
$$ dp[i] = 1 $$
because the element itself forms an increasing subsequence of length one.
Working Through an Example
Consider:
10 9 2 5 3 7
Initially:
| Position | Value | DP |
|---|---|---|
| 0 | 10 | 1 |
| 1 | 9 | 1 |
| 2 | 2 | 1 |
Now examine:
5
Valid predecessors:
2
Therefore:
$$ dp = 2 $$
For:
3
Valid predecessors:
2
Again:
$$ dp = 2 $$
For:
7
Valid predecessors:
2
5
3
Best predecessor length:
$$ 2 $$
Thus:
$$ dp = 3 $$
The completed table becomes:
| Value | DP |
|---|---|
| 10 | 1 |
| 9 | 1 |
| 2 | 1 |
| 5 | 2 |
| 3 | 2 |
| 7 | 3 |
The maximum value in the table is the answer.
Implementation
func LIS(nums []int) int {
n := len(nums)
dp := make([]int, n)
answer := 0
for i := 0; i < n; i++ {
dp[i] = 1
for j := 0; j < i; j++ {
if nums[j] < nums[i] {
dp[i] = max(
dp[i],
dp[j]+1,
)
}
}
answer = max(answer, dp[i])
}
return answer
}
The implementation follows the recurrence directly.
Visualizing Dependencies
Unlike LCS or grid problems, LIS does not use a rectangular table.
Dependencies form a directed graph.
Consider:
2 5 3 7
Valid transitions:
2 -> 5
2 -> 3
2 -> 7
5 -> 7
3 -> 7
Graphically:
2
| \\n| \\nv v
5 3
\ /
v
7
The dynamic programming solution computes the longest path ending at each node.
This interpretation is useful because many sequence problems can be viewed as longest-path problems in directed acyclic graphs.
Reconstructing the Sequence
The length alone is often insufficient.
To reconstruct the actual subsequence, store the predecessor responsible for each update.
parent[i] = j
whenever:
dp[j] + 1 > dp[i]
After processing all positions:
- Find the position with maximum DP value.
- Follow parent links backward.
- Reverse the collected sequence.
This produces one valid longest increasing subsequence.
Why the Quadratic Solution Works
Every increasing subsequence ending at position:
$$ i $$
must end with some predecessor:
$$ j $$
where:
$$ a[j] < a[i] $$
The recurrence examines every such predecessor.
No valid solution is ignored.
No invalid solution is introduced.
This exhaustive but structured exploration guarantees correctness.
Space Complexity
State count:
$$ n $$
Memory usage:
$$ O(n) $$
The predecessor array also requires:
$$ O(n) $$
space.
Unlike many two-dimensional dynamic programming problems, memory is rarely the bottleneck.
The challenge is running time.
Time Complexity
For every position:
$$ i $$
the algorithm examines all earlier positions:
$$ 0 \ldots i-1 $$
Total work:
$$ 1 + 2 + 3 + \cdots + (n-1) $$
which equals:
$$ O(n^2) $$
For arrays containing a few thousand elements, this is usually acceptable.
For millions of elements, it becomes impractical.
Toward an (O(n \log n)) Solution
The quadratic algorithm repeatedly asks:
What is the best increasing subsequence that can be extended by the current value?
Instead of examining every previous position, we can maintain summary information about the best subsequences discovered so far.
Consider:
2 5 3 7
After processing:
2
store:
length 1 -> smallest ending value 2
After processing:
5
store:
length 2 -> smallest ending value 5
After processing:
3
replace:
length 2 -> smallest ending value 3
because:
3 < 5
and a smaller ending value creates more future extension opportunities.
This observation leads to the classic patience-sorting-based solution.
The Tails Array
Maintain:
tails[k]
Meaning:
The smallest possible ending value of an increasing subsequence of length (k+1).
For example:
tails = [2, 3, 7]
indicates:
length 1 ends with 2
length 2 ends with 3
length 3 ends with 7
The array remains sorted.
Binary search can locate update positions efficiently.
Optimized Implementation
func LIS(nums []int) int {
tails := []int{}
for _, x := range nums {
left := 0
right := len(tails)
for left < right {
mid := (left + right) / 2
if tails[mid] < x {
left = mid + 1
} else {
right = mid
}
}
if left == len(tails) {
tails = append(tails, x)
} else {
tails[left] = x
}
}
return len(tails)
}
Time complexity:
$$ O(n \log n) $$
Memory complexity:
$$ O(n) $$
Common Mistakes
Confusing Subsequences and Subarrays
Subsequences may skip elements.
Subarrays must remain contiguous.
The algorithms are entirely different.
Using Non-Strict Comparisons
The classical LIS problem requires:
$$ a[j] < a[i] $$
Using:
$$ a[j] \le a[i] $$
changes the problem definition.
Returning the Last State
The answer is:
$$ \max(dp[i]) $$
not:
$$ dp[n-1] $$
The longest subsequence may end anywhere.
Incorrect Binary Search Updates
The optimized algorithm depends on maintaining the smallest possible ending value for each length.
Replacing the wrong position breaks the invariant.
Applications
Longest increasing subsequence techniques appear in many domains:
- version comparison
- stock trend analysis
- scheduling optimization
- genome sequence analysis
- event stream processing
- ranking systems
- computational geometry
- partial-order optimization
Many seemingly unrelated problems can be transformed into LIS formulations.
Takeaway
The Longest Increasing Subsequence problem demonstrates how dynamic programming can transform a global sequence optimization problem into a collection of local decisions. The classical (O(n^2)) solution introduces the important pattern of states that represent solutions ending at specific positions. The optimized (O(n \log n)) solution illustrates a broader lesson: dynamic programming often provides the structural insight needed to discover even faster algorithms. Understanding both approaches is essential because the same progression from straightforward dynamic programming to data-structure-assisted optimization appears throughout advanced algorithm design.