Randomized algorithms use random choices during execution. The output or running time may depend on those choices. Randomization is useful when it simplifies design, avoids adversarial patterns, or gives good expected performance with a small amount of code.
A randomized algorithm still needs a correctness contract. The contract must say whether the algorithm is always correct, probably correct, or correct in expectation.
Problem
You want to use randomness to make an algorithm simpler or faster, but you need to describe what guarantee the algorithm provides.
Solution
Classify the algorithm by its guarantee.
Las Vegas algorithm:
Always returns a correct answer.
Running time is random.
Monte Carlo algorithm:
Running time is bounded or controlled.
Answer may be wrong with small probability.Then state the probability model clearly.
Random choices:
How random values are generated.
Success probability:
Probability that the algorithm returns a correct answer.
Expected time:
Average running time over the random choices.Example: Randomized Quicksort
Deterministic quicksort can degrade to quadratic time when pivot choices are consistently poor.
Randomized quicksort chooses a random pivot.
randomized_quicksort(A):
if length(A) <= 1:
return A
pivot = random element of A
L = elements less than pivot
E = elements equal to pivot
R = elements greater than pivot
return randomized_quicksort(L) + E + randomized_quicksort(R)The algorithm always returns a sorted array. The randomness affects the partition balance and therefore the running time.
Expected time:
O(n log n)Worst-case time:
O(n^2)This is a Las Vegas algorithm: correctness is certain, but running time is random.
Example: Random Sampling
Suppose you want to estimate the average value of a very large array.
estimate_mean(A, k):
total = 0
repeat k times:
i = random index from 0 to n - 1
total = total + A[i]
return total / kThis does not always return the exact mean. It returns an estimate whose accuracy depends on the number of samples and the distribution of values.
This is a Monte Carlo algorithm. Its running time is fixed at O(k), but the answer has sampling error.
Randomization as Symmetry Breaking
Randomness is often useful when deterministic choices may be exploited.
Examples:
| Problem | Randomized Choice |
|---|---|
| Quicksort | Random pivot |
| Hash table | Randomized hash seed |
| Load balancing | Random server choice |
| Sampling | Random subset of data |
| Graph algorithms | Random contraction or walk |
The purpose is not to be mysterious. Randomness prevents the algorithm from always making the same bad choice on structured input.
Expected Running Time
Expected time averages over the algorithm’s random choices, not necessarily over random input.
For randomized quicksort, the input can be fixed. The expectation is over the random pivot sequence.
Write this explicitly:
For every fixed input of length n,
the expected running time over the algorithm's random choices is O(n log n).This is stronger than saying the algorithm is fast only on average input.
Error Probability
For Monte Carlo algorithms, the error probability must be stated.
Example:
The algorithm returns a correct answer with probability at least 1 - δ.Here δ is the failure probability.
Many randomized algorithms can reduce error by repetition.
repeat the algorithm r times
combine the results by majority vote or best candidateIf each independent run fails with probability at most p, repeated independent trials can make the total failure probability much smaller.
Random Seeds
In implementation, randomness usually comes from a pseudo-random generator. A seed determines the sequence.
For testing, fixed seeds are useful:
seed = 12345This makes failures reproducible. For production, randomized seeds may reduce exposure to adversarial inputs.
Common Pitfalls
Do not say “randomized” without saying what is random.
Do not confuse expected time with worst-case time.
Do not use Monte Carlo algorithms where exact correctness is required unless the failure probability is acceptable and stated.
Do not rely on weak randomness for adversarial settings such as hashing exposed to untrusted input.
Do not make random tests non-reproducible. Record the seed when a test fails.
Takeaway
Randomization is a tool for controlling uncertainty. In Las Vegas algorithms, the answer is always correct and the running time is random. In Monte Carlo algorithms, the running time is controlled and the answer may have bounded error. State the random choices, the guarantee, and the probability model explicitly.