1. A simple problem
Suppose that we are looking for a particular value in an unsorted array (and that we know that this value does appear somewhere in the array), and we are charged 1 unit for each array element we examine. The deterministic worst-case cost is easily seen to be exactly n: the adversary can simulate our deterministic algorithm to find the array location it looks at last, and put the target there. But for a randomized algorithm, we can reduce the average cost to (n+1)/2 using the following simple algorithm:
    if fairCoin() = heads:
        for i = 1 to n:
            if A[i] = x: return i
    else:
        for i = n downto 1:
            if A[i] = x: return i
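A runnable sketch of this pseudocode in Python (the names `find` and the use of `random.random` in place of `fairCoin` are illustrative choices, not from the notes):

```python
import random

def find(A, x):
    """Search the list A for x (assumed present), scanning left-to-right
    or right-to-left with equal probability. Returns a 0-based index."""
    n = len(A)
    if random.random() < 0.5:           # fairCoin() = heads
        indices = range(n)              # scan A[0], A[1], ..., A[n-1]
    else:
        indices = range(n - 1, -1, -1)  # scan A[n-1], ..., A[0]
    for i in indices:
        if A[i] == x:
            return i
    raise ValueError("x not found")     # unreachable if x really is in A
```

Note that the scan direction is chosen once, before any probes, so the adversary cannot adapt to it.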
The proof is that if the target is in A[k], then half the time we look at k elements, and the other half the time we look at n-k+1 elements; the expected cost is thus (1/2)(k+(n-k+1)) = (n+1)/2.
Can we do better? No (unless we can bring in magical quantum fairy dust—see Grover's algorithm). But how do we prove this?
2. Yao's lemma
Suppose we can find a distribution over inputs of size n such that every deterministic algorithm has average cost at least T on this distribution (slightly more formally, for every deterministic algorithm M we have E_X[cost(M(X))] ≥ T). Any randomized algorithm M(x,r) can be thought of as a random choice of a deterministic algorithm M_r(x). Since each M_r satisfies E_X[cost(M_r(X))] ≥ T, we get E_r[E_X[cost(M_r(X))]] ≥ T, which we can rewrite as E_X[E_r[cost(M_r(X))]] ≥ T. But then there is some particular input x such that E_r[cost(M_r(x))] ≥ T.
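As a concrete sanity check (a toy computation, not part of the notes), we can tabulate the cost matrix for the search problem with small n: rows are deterministic algorithms (fixed scan orders), columns are target positions, and the argument above says that any mixture of rows leaves some column with expected cost at least (n+1)/2.

```python
import random
from itertools import permutations
from fractions import Fraction

n = 4
# Deterministic algorithms = fixed scan orders; cost(order, k) = number of
# probes until position k is examined (counting from 1).
orders = list(permutations(range(n)))
cost = {order: [order.index(k) + 1 for k in range(n)] for order in orders}

# Under a uniform input distribution, every deterministic algorithm has
# average cost exactly (n+1)/2.
T = Fraction(n + 1, 2)
assert all(Fraction(sum(c), n) == T for c in cost.values())

# A randomized algorithm is a distribution q over the orders.  Averaged over
# a uniform input, its cost is still (n+1)/2, so some single input must have
# expected cost >= (n+1)/2.
weights = [random.random() for _ in orders]
total = sum(weights)
q = [w / total for w in weights]
expected = [sum(qi * cost[o][k] for qi, o in zip(q, orders)) for k in range(n)]
assert max(expected) >= (n + 1) / 2
```

No matter how the weights q are chosen, the final assertion holds: averaging cannot push every column below the mean.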
In other words, if there is a single input distribution that makes all deterministic algorithms bad, for each randomized algorithm there is a single input that makes it bad. This is Yao's lemma.
3. Lower bound on the simple problem
Going back to our simple problem, suppose the adversary places the target in a uniformly random location. Any deterministic algorithm examines some fixed sequence of locations until it finds the target—fixed because, as long as the target hasn't been found, every location it probes contains the same empty value, so its behavior cannot depend on the input. We may assume the sequence contains no duplicates (re-examining a location never helps), so it covers all n positions, and the target is equally likely to sit at each of the n positions in the sequence. The average cost is thus (1+2+⋯+n)/n = (n+1)/2. Yao's lemma now implies that for any randomized algorithm there is a specific location we can put the target in so that the expected cost of the randomized algorithm is at least (n+1)/2.
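The bound is tight for the coin-flip algorithm above: for every target position k, its expected number of probes is exactly (n+1)/2, so no single placement of the target hurts it more than any other. A quick check (illustrative code, not from the notes):

```python
n = 7
for k in range(1, n + 1):       # 1-based target position
    forward = k                 # probes when scanning left-to-right
    backward = n - k + 1        # probes when scanning right-to-left
    # Each direction is chosen with probability 1/2.
    assert (forward + backward) / 2 == (n + 1) / 2
```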
4. Game tree evaluation
See the AND/OR tree example from MotwaniRaghavan, Chapter 2.