19.19 Online Algorithms
Most algorithms assume complete knowledge of the input before computation begins.
19.19 Online Algorithms
Most algorithms assume complete knowledge of the input before computation begins. Sorting algorithms see the entire array. Graph algorithms receive the complete graph. Dynamic programming algorithms assume all problem data is available from the start.
Many real systems do not operate this way.
A web server receives requests one at a time. A scheduler receives jobs continuously. A network router sees packets as they arrive. A cloud platform receives resource requests from users who have not yet appeared.
These systems must make decisions without knowing the future.
Algorithms designed for this environment are called online algorithms.
Problem
Suppose requests arrive sequentially:
Request 1
Request 2
Request 3
...
When processing:
Request 2
the algorithm has not yet seen:
Request 3
Request 4
Request 5
...
A decision must be made immediately.
Future information is unavailable.
How should performance be evaluated?
Online vs Offline Algorithms
An offline algorithm sees the entire input:
Input
→
Analyze everything
→
Compute solution
An online algorithm sees:
Input item
→
Decision
Input item
→
Decision
Input item
→
Decision
Each decision is typically irreversible.
The inability to see future events creates the central challenge.
Example: Scheduling
Suppose jobs arrive over time.
Job:
Duration = 2
arrives.
The scheduler assigns a machine.
Later:
Duration = 100
arrives.
A different earlier decision might have been better.
Unfortunately:
The future was unknown.
Online algorithms must operate under this uncertainty.
Competitive Analysis
Approximation algorithms compare themselves against:
OPT
the optimal solution.
Online algorithms use a related concept.
The comparison is against an optimal offline algorithm that knows the future.
This benchmark is intentionally powerful.
It represents the best possible decision-maker.
Competitive Ratio
For minimization problems:
Algorithm Cost
----------------
Offline Optimal Cost
defines the competitive ratio.
An algorithm is:
c-competitive
if:
ALG ≤ c · OPT
for every input sequence.
Example:
ALG = 150
OPT = 100
Ratio:
1.5
The online algorithm spends at most 50% more than a clairvoyant optimal scheduler.
Why Competitive Analysis Matters
Suppose two online algorithms produce:
| Algorithm | Cost |
|---|---|
| A | 150 |
| B | 200 |
Without context, the numbers mean little.
If:
OPT = 100
then:
| Algorithm | Competitive Ratio |
|---|---|
| A | 1.5 |
| B | 2.0 |
Now the quality becomes clear.
Competitive analysis provides a principled way to evaluate online decision-making.
The Ski Rental Problem
The ski rental problem is one of the most famous examples in online algorithms.
You may:
Rent skis:
$1 per day
or:
Buy skis:
$100
Unknown information:
How many days
will you ski?
The future determines the best choice.
Offline Optimum
If skiing lasts:
20 days
optimal decision:
Rent
Cost:
20
If skiing lasts:
200 days
optimal decision:
Buy immediately
Cost:
100
The offline algorithm knows the future and chooses perfectly.
Deterministic Strategy
Simple online strategy:
Rent until
rental cost reaches purchase cost.
Then buy.
For ski cost:
Purchase = 100
rent for:
100 days
then buy.
Analysis
Worst case:
You rent:
100 days
Cost:
100
Then buy:
100
Total:
200
Suppose skiing stops immediately after purchase.
Offline optimum:
100
Competitive ratio:
200 / 100 = 2
Therefore:
2-competitive
Lower Bound
Remarkably:
No deterministic online strategy
can achieve
better than 2.
The simple strategy is optimal among deterministic algorithms.
This result illustrates a recurring theme:
Simple strategy
+
Strong analysis
often produces important online algorithms.
Randomized Ski Rental
Randomization improves performance.
Instead of buying exactly at day:
100
choose a random purchase day according to a carefully designed distribution.
Expected competitive ratio becomes:
e / (e − 1)
≈ 1.582
This is substantially better than:
2
Randomization again helps overcome uncertainty.
Caching
Caching provides another classic online problem.
Cache capacity:
k pages
Requests:
A
B
C
A
D
...
When the cache is full, a page must be evicted.
The challenge:
Which page?
Future requests are unknown.
Optimal Offline Caching
An optimal offline algorithm exists.
Rule:
Evict the page
whose next use
is farthest in the future.
This algorithm is called Belady's algorithm.
It is optimal.
Unfortunately:
Requires future knowledge.
Therefore it cannot be implemented online.
Least Recently Used (LRU)
A practical online strategy:
Evict the page
least recently accessed.
Implementation:
from collections import OrderedDict
class LRUCache:
def __init__(self, capacity):
self.capacity = capacity
self.cache = OrderedDict()
def access(self, key):
if key in self.cache:
self.cache.move_to_end(key)
return True
if len(self.cache) >= self.capacity:
self.cache.popitem(last=False)
self.cache[key] = True
return False
This idea appears in operating systems, databases, browsers, and storage engines.
Paging Competitiveness
For cache size:
k
LRU achieves:
k-competitive
performance.
This means:
Faults(LRU)
≤
k × Faults(OPT)
for all request sequences.
The bound is pessimistic but important.
It provides a rigorous guarantee.
Server Allocation
Suppose:
k servers
must serve incoming requests at locations.
Requests arrive online.
Moving a server incurs cost.
The algorithm must decide:
Which server moves?
without seeing future requests.
This problem leads to the famous:
k-server problem
one of the central topics in online algorithms.
Load Balancing
Jobs arrive continuously.
Available machines:
M1
M2
M3
...
Online rule:
Assign job
to least-loaded machine.
This simple strategy often achieves strong competitive guarantees.
Many cloud scheduling systems rely on variants of this idea.
Online Bin Packing
Items arrive sequentially.
Bin capacity:
10
Items:
4
3
5
2
...
When an item arrives, it must immediately be assigned to a bin.
Future items are unknown.
Common strategies:
- First Fit
- Best Fit
- Next Fit
All have well-studied competitive ratios.
Randomized Online Algorithms
Randomization often improves online performance.
Benefits:
Adversaries cannot
predict decisions.
Examples:
- Randomized caching
- Randomized routing
- Randomized load balancing
- Randomized ski rental
Many lower bounds for deterministic algorithms disappear when randomness is allowed.
Adversarial Inputs
Online analysis usually assumes an adversary chooses the input sequence.
This creates strong guarantees.
If an algorithm performs well against an adversary:
It typically performs
very well in practice.
The adversarial model explains why competitive analysis is often conservative.
Resource Augmentation
Sometimes an online algorithm receives extra resources.
Example:
Online cache:
capacity k + 1
Offline cache:
capacity k
This model is called resource augmentation.
The extra resource can dramatically improve competitive ratios.
Modern scheduling theory frequently uses this framework.
Real-World Applications
Online algorithms appear everywhere.
Operating Systems
- Page replacement
- CPU scheduling
- Memory allocation
Networks
- Routing
- Congestion control
- Packet scheduling
Cloud Platforms
- VM placement
- Autoscaling
- Load balancing
Databases
- Buffer management
- Query scheduling
- Cache replacement
Finance
- Trading decisions
- Portfolio rebalancing
- Risk management
Most production systems are fundamentally online systems.
Comparison with Approximation Algorithms
| Property | Approximation | Online |
|---|---|---|
| Future known | Yes | No |
| Benchmark | Optimal solution | Optimal offline solution |
| Guarantee | Approximation ratio | Competitive ratio |
| Randomization | Optional | Often valuable |
| Input model | Static | Dynamic |
Both frameworks compare against optimal solutions.
The source of difficulty differs.
Common Mistakes
Comparing Against Other Online Algorithms
Competitive analysis compares against:
Optimal offline
not another heuristic.
Ignoring Adversarial Sequences
Worst-case inputs determine guarantees.
Assuming Random Inputs
Competitive analysis does not require random arrivals.
Forgetting Irreversibility
Online decisions are often permanent.
This constraint fundamentally changes algorithm design.
Discussion
Online algorithms formalize decision-making under uncertainty. Unlike traditional algorithms, they must act before all information is available. Competitive analysis provides a rigorous framework for measuring performance by comparing against an optimal offline algorithm with complete knowledge of the future.
The ski rental problem, caching, paging, scheduling, routing, and load balancing all demonstrate the same central challenge: balancing current costs against unknown future opportunities. Randomization frequently improves performance, and many of the most important online algorithms combine probabilistic techniques with competitive analysis.
The next section explores streaming algorithms, where data arrives online at massive scale and memory constraints become just as important as uncertainty about future inputs.