19.19 Online Algorithms

Most algorithms assume complete knowledge of the input before computation begins.

19.19 Online Algorithms

Most algorithms assume complete knowledge of the input before computation begins. Sorting algorithms see the entire array. Graph algorithms receive the complete graph. Dynamic programming algorithms assume all problem data is available from the start.

Many real systems do not operate this way.

A web server receives requests one at a time. A scheduler receives jobs continuously. A network router sees packets as they arrive. A cloud platform receives resource requests from users who have not yet appeared.

These systems must make decisions without knowing the future.

Algorithms designed for this environment are called online algorithms.

Problem

Suppose requests arrive sequentially:

Request 1
Request 2
Request 3
...

When processing:

Request 2

the algorithm has not yet seen:

Request 3
Request 4
Request 5
...

A decision must be made immediately.

Future information is unavailable.

How should performance be evaluated?

Online vs Offline Algorithms

An offline algorithm sees the entire input:

Input
→
Analyze everything
→
Compute solution

An online algorithm sees:

Input item
→
Decision

Input item
→
Decision

Input item
→
Decision

Each decision is typically irreversible.

The inability to see future events creates the central challenge.

Example: Scheduling

Suppose jobs arrive over time.

Job:

Duration = 2

arrives.

The scheduler assigns a machine.

Later:

Duration = 100

arrives.

A different earlier decision might have been better.

Unfortunately:

The future was unknown.

Online algorithms must operate under this uncertainty.

Competitive Analysis

Approximation algorithms compare themselves against:

OPT

the optimal solution.

Online algorithms use a related concept.

The comparison is against an optimal offline algorithm that knows the future.

This benchmark is intentionally powerful.

It represents the best possible decision-maker.

Competitive Ratio

For minimization problems:

Algorithm Cost
----------------
Offline Optimal Cost

defines the competitive ratio.

An algorithm is:

c-competitive

if:

ALG ≤ c · OPT

for every input sequence.

Example:

ALG = 150
OPT = 100

Ratio:

1.5

The online algorithm spends at most 50% more than a clairvoyant optimal scheduler.

Why Competitive Analysis Matters

Suppose two online algorithms produce:

Algorithm Cost
A 150
B 200

Without context, the numbers mean little.

If:

OPT = 100

then:

Algorithm Competitive Ratio
A 1.5
B 2.0

Now the quality becomes clear.

Competitive analysis provides a principled way to evaluate online decision-making.

The Ski Rental Problem

The ski rental problem is one of the most famous examples in online algorithms.

You may:

Rent skis:
$1 per day

or:

Buy skis:
$100

Unknown information:

How many days
will you ski?

The future determines the best choice.

Offline Optimum

If skiing lasts:

20 days

optimal decision:

Rent

Cost:

20

If skiing lasts:

200 days

optimal decision:

Buy immediately

Cost:

100

The offline algorithm knows the future and chooses perfectly.

Deterministic Strategy

Simple online strategy:

Rent until
rental cost reaches purchase cost.

Then buy.

For ski cost:

Purchase = 100

rent for:

100 days

then buy.

Analysis

Worst case:

You rent:

100 days

Cost:

100

Then buy:

100

Total:

200

Suppose skiing stops immediately after purchase.

Offline optimum:

100

Competitive ratio:

200 / 100 = 2

Therefore:

2-competitive

Lower Bound

Remarkably:

No deterministic online strategy
can achieve
better than 2.

The simple strategy is optimal among deterministic algorithms.

This result illustrates a recurring theme:

Simple strategy
+
Strong analysis

often produces important online algorithms.

Randomized Ski Rental

Randomization improves performance.

Instead of buying exactly at day:

100

choose a random purchase day according to a carefully designed distribution.

Expected competitive ratio becomes:

e / (e − 1)
≈ 1.582

This is substantially better than:

2

Randomization again helps overcome uncertainty.

Caching

Caching provides another classic online problem.

Cache capacity:

k pages

Requests:

A
B
C
A
D
...

When the cache is full, a page must be evicted.

The challenge:

Which page?

Future requests are unknown.

Optimal Offline Caching

An optimal offline algorithm exists.

Rule:

Evict the page
whose next use
is farthest in the future.

This algorithm is called Belady's algorithm.

It is optimal.

Unfortunately:

Requires future knowledge.

Therefore it cannot be implemented online.

Least Recently Used (LRU)

A practical online strategy:

Evict the page
least recently accessed.

Implementation:

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()

    def access(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return True

        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)

        self.cache[key] = True
        return False

This idea appears in operating systems, databases, browsers, and storage engines.

Paging Competitiveness

For cache size:

k

LRU achieves:

k-competitive

performance.

This means:

Faults(LRU)
≤
k × Faults(OPT)

for all request sequences.

The bound is pessimistic but important.

It provides a rigorous guarantee.

Server Allocation

Suppose:

k servers

must serve incoming requests at locations.

Requests arrive online.

Moving a server incurs cost.

The algorithm must decide:

Which server moves?

without seeing future requests.

This problem leads to the famous:

k-server problem

one of the central topics in online algorithms.

Load Balancing

Jobs arrive continuously.

Available machines:

M1
M2
M3
...

Online rule:

Assign job
to least-loaded machine.

This simple strategy often achieves strong competitive guarantees.

Many cloud scheduling systems rely on variants of this idea.

Online Bin Packing

Items arrive sequentially.

Bin capacity:

10

Items:

4
3
5
2
...

When an item arrives, it must immediately be assigned to a bin.

Future items are unknown.

Common strategies:

  • First Fit
  • Best Fit
  • Next Fit

All have well-studied competitive ratios.

Randomized Online Algorithms

Randomization often improves online performance.

Benefits:

Adversaries cannot
predict decisions.

Examples:

  • Randomized caching
  • Randomized routing
  • Randomized load balancing
  • Randomized ski rental

Many lower bounds for deterministic algorithms disappear when randomness is allowed.

Adversarial Inputs

Online analysis usually assumes an adversary chooses the input sequence.

This creates strong guarantees.

If an algorithm performs well against an adversary:

It typically performs
very well in practice.

The adversarial model explains why competitive analysis is often conservative.

Resource Augmentation

Sometimes an online algorithm receives extra resources.

Example:

Online cache:
capacity k + 1

Offline cache:
capacity k

This model is called resource augmentation.

The extra resource can dramatically improve competitive ratios.

Modern scheduling theory frequently uses this framework.

Real-World Applications

Online algorithms appear everywhere.

Operating Systems

  • Page replacement
  • CPU scheduling
  • Memory allocation

Networks

  • Routing
  • Congestion control
  • Packet scheduling

Cloud Platforms

  • VM placement
  • Autoscaling
  • Load balancing

Databases

  • Buffer management
  • Query scheduling
  • Cache replacement

Finance

  • Trading decisions
  • Portfolio rebalancing
  • Risk management

Most production systems are fundamentally online systems.

Comparison with Approximation Algorithms

Property Approximation Online
Future known Yes No
Benchmark Optimal solution Optimal offline solution
Guarantee Approximation ratio Competitive ratio
Randomization Optional Often valuable
Input model Static Dynamic

Both frameworks compare against optimal solutions.

The source of difficulty differs.

Common Mistakes

Comparing Against Other Online Algorithms

Competitive analysis compares against:

Optimal offline

not another heuristic.

Ignoring Adversarial Sequences

Worst-case inputs determine guarantees.

Assuming Random Inputs

Competitive analysis does not require random arrivals.

Forgetting Irreversibility

Online decisions are often permanent.

This constraint fundamentally changes algorithm design.

Discussion

Online algorithms formalize decision-making under uncertainty. Unlike traditional algorithms, they must act before all information is available. Competitive analysis provides a rigorous framework for measuring performance by comparing against an optimal offline algorithm with complete knowledge of the future.

The ski rental problem, caching, paging, scheduling, routing, and load balancing all demonstrate the same central challenge: balancing current costs against unknown future opportunities. Randomization frequently improves performance, and many of the most important online algorithms combine probabilistic techniques with competitive analysis.

The next section explores streaming algorithms, where data arrives online at massive scale and memory constraints become just as important as uncertainty about future inputs.