1. A simple problem
Suppose that we are looking for a particular value in an unsorted array (and that we know that this value does appear somewhere in the array), and we are charged 1 unit for each array element we examine. The deterministic worst-case cost is easily seen to be exactly n: the adversary can simulate our deterministic algorithm to find the array location it looks at last, and put the target there. But for a randomized algorithm, we can reduce the average cost to (n+1)/2 using the following simple algorithm:
    if fairCoin() = heads:
        for i = 1 to n:
            if A[i] = x: return i
    else:
        for i = n downto 1:
            if A[i] = x: return i
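A runnable sketch of this pseudocode in Python (the names `find` and the use of `random.random` in place of `fairCoin` are illustrative choices, not from the notes):

```python
import random

def find(A, x):
    """Search the list A for x (assumed present), scanning left-to-right
    or right-to-left with equal probability. Returns a 0-based index."""
    n = len(A)
    if random.random() < 0.5:           # fairCoin() = heads
        indices = range(n)              # scan A[0], A[1], ..., A[n-1]
    else:
        indices = range(n - 1, -1, -1)  # scan A[n-1], ..., A[0]
    for i in indices:
        if A[i] == x:
            return i
    raise ValueError("x not found")     # unreachable if x really is in A
```

Note that the scan direction is chosen once, before any probes, so the adversary cannot adapt to it.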
The proof is that if the target is in A[k], then half the time we look at k elements, and the other half the time we look at n-k+1 elements; the expected cost is thus (1/2)(k+(n-k+1)) = (n+1)/2.
Can we do better? No (unless we can bring in magical quantum fairy dust—see Grover's algorithm). But how do we prove this?
2. Yao's lemma
Suppose we can find a distribution over inputs of size n such that every deterministic algorithm has average cost at least T on this distribution (slightly more formally, for every deterministic algorithm M we have E_X[cost(M(X))] ≥ T). Any randomized algorithm M(x,r) can be thought of as a random choice of a deterministic algorithm M_r(x). Since each M_r satisfies E_X[cost(M_r(X))] ≥ T, we get E_r[E_X[cost(M_r(X))]] ≥ T, which we can rewrite as E_X[E_r[cost(M_r(X))]] ≥ T. But then there is some particular input x such that E_r[cost(M_r(x))] ≥ T.
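As a concrete sanity check (a toy computation, not part of the notes), we can tabulate the cost matrix for the search problem with small n: rows are deterministic algorithms (fixed scan orders), columns are target positions, and the argument above says that any mixture of rows leaves some column with expected cost at least (n+1)/2.

```python
import random
from itertools import permutations
from fractions import Fraction

n = 4
# Deterministic algorithms = fixed scan orders; cost(order, k) = number of
# probes until position k is examined (counting from 1).
orders = list(permutations(range(n)))
cost = {order: [order.index(k) + 1 for k in range(n)] for order in orders}

# Under a uniform input distribution, every deterministic algorithm has
# average cost exactly (n+1)/2.
T = Fraction(n + 1, 2)
assert all(Fraction(sum(c), n) == T for c in cost.values())

# A randomized algorithm is a distribution q over the orders.  Averaged over
# a uniform input, its cost is still (n+1)/2, so some single input must have
# expected cost >= (n+1)/2.
weights = [random.random() for _ in orders]
total = sum(weights)
q = [w / total for w in weights]
expected = [sum(qi * cost[o][k] for qi, o in zip(q, orders)) for k in range(n)]
assert max(expected) >= (n + 1) / 2
```

No matter how the weights q are chosen, the final assertion holds: averaging cannot push every column below the mean.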
In other words, if there is a single input distribution that makes all deterministic algorithms bad, for each randomized algorithm there is a single input that makes it bad. This is Yao's lemma.
3. Lower bound on the simple problem
Going back to our simple problem, suppose the adversary places the target in a uniformly random location. Any deterministic algorithm examines some fixed sequence of locations until it finds the target—fixed because, as long as the target hasn't been found, every location it probes contains the same empty value, so its behavior cannot depend on the input. We may assume the sequence contains no duplicates (re-examining a location never helps), so it covers all n positions, and the target is equally likely to sit at each of the n positions in the sequence. The average cost is thus (1+2+⋯+n)/n = (n+1)/2. Yao's lemma now implies that for any randomized algorithm there is a specific location we can put the target in so that the expected cost of the randomized algorithm is at least (n+1)/2.
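The bound is tight for the coin-flip algorithm above: for every target position k, its expected number of probes is exactly (n+1)/2, so no single placement of the target hurts it more than any other. A quick check (illustrative code, not from the notes):

```python
n = 7
for k in range(1, n + 1):       # 1-based target position
    forward = k                 # probes when scanning left-to-right
    backward = n - k + 1        # probes when scanning right-to-left
    # Each direction is chosen with probability 1/2.
    assert (forward + backward) / 2 == (n + 1) / 2
```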
4. Game tree evaluation
See the AND/OR tree example from MotwaniRaghavan, Chapter 2.