
Note: You are looking at a static copy of the former PineWiki site, used for class notes by James Aspnes from 2003 to 2012. Many mathematical formulas are broken, and there are likely to be other bugs as well. These will most likely not be fixed. You may be able to find more up-to-date versions of some of these notes at http://www.cs.yale.edu/homes/aspnes/#classes.

The power of two choices is the observation that if we put n balls into n bins, the maximum load is much smaller if each ball can choose the lighter of two randomly-selected bins (without replacement) instead of just picking one bin at random. We'll do the upper bound proof from MitzenmacherUpfal §14.1 that with d≥2 choices the maximum load is ln ln n / ln d + O(1) with probability 1 - o(1/n); this is much better than the Θ(log n / log log n) we previously saw for the d=1 case.
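
Here is a minimal simulation sketch of the two processes (the helper max_load and the particular value of n are just for illustration):

    import random

    def max_load(n, d, rng):
        """Throw n balls into n bins, sending each ball to the least
        loaded of d bins sampled uniformly without replacement."""
        bins = [0] * n
        for _ in range(n):
            choices = rng.sample(range(n), d)
            best = min(choices, key=lambda b: bins[b])
            bins[best] += 1
        return max(bins)

    rng = random.Random(1)
    n = 10**5
    print("d=1:", max_load(n, 1, rng))  # grows like log n / log log n
    print("d=2:", max_load(n, 2, rng))  # grows like ln ln n / ln 2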

1. Intuition

For each ball $j = 1 \dots n$, define its height $h(j)$ as the number of balls in its bin after it is placed. To get a ball of height $i+1$ or higher, we have to pick $d$ bins with $i$ balls or more. Suppose that we have a bound $\beta_i$ on the number of bins that ever have $i$ or more balls. Then the probability that we pick $d$ such bins is at most $(\beta_i/n)^d$. We will then use Chernoff bounds to show that the total number of balls of height $i+1$ or higher is at most $2n(\beta_i/n)^d$ with high probability; since every bin with at least $i+1$ balls contains a ball of height $i+1$ or higher, we can take $\beta_{i+1}/n = 2(\beta_i/n)^d$. This gives a recurrence for the bounds $\beta_i$.
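
In symbols: as long as at most $\beta_i$ bins hold $i$ or more balls, each ball $j$ satisfies

$\Pr[h(j) \ge i+1] \le (\beta_i/n)^d$,

so the expected number of balls of height $i+1$ or higher is at most $n(\beta_i/n)^d$; the extra factor of 2 in $\beta_{i+1}$ is the slack that lets the Chernoff bound keep the actual count close to this expectation.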

2. Probabilistic implications

One messy part of the argument is that in arguing that there are few height-$(i+1)$ balls, we are depending on there being few height-$i$ balls. This sounds like a conditional probability, but if we condition on having few height-$i$ balls the process gets very messy (for example, it's not clear whether this increases or decreases the probability of having a lot of height-$(i+1)$ balls; maybe the reason we have so few height-$i$ balls is that we have an unusually large number of very full bins, leaving not so many balls to fill up other bins to height $i$). So instead we will look at events of the form [number of height-$i$ balls $\le \beta_i$ ⇒ number of height-$(i+1)$ balls $\le \beta_{i+1}$]. The negation of this event is [number of height-$i$ balls $\le \beta_i$ ∧ number of height-$(i+1)$ balls $> \beta_{i+1}$]; by showing that each of these bad events has low probability, we can use the union bound to show that all the implications hold except with low probability, which shows that the entire chain of bounds holds.
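
Written out, with $A_i$ standing for the event that the level-$i$ count is at most $\beta_i$ (this will be the event $\nu_i \le \beta_i$ below; the name $A_i$ is introduced here just for this sketch), the plan is to use

$\Pr[\neg A_k] \le \Pr[\neg A_4] + \sum_{i=4}^{k-1} \Pr[A_i \wedge \neg A_{i+1}]$,

which holds because if $A_k$ fails, then either $A_4$ fails or there is a first level at which an implication breaks. The base event $A_4$ will hold with certainty, so it is enough to make each term of the sum small.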

3. Main induction step

Formally, let $\nu_i$ be the number of bins that hold $i$ or more balls at the end of the process. Let $\beta_i$ be given by the recurrence $\beta_4 = n/4$, $\beta_{i+1}/n = 2(\beta_i/n)^d$. We want to show that $\nu_i \le \beta_i$ for all $i \ge 4$ with high probability, at least until $\beta_i$ gets too small for this argument to work; the last few levels are handled separately in the final step.
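
To see how fast this recurrence collapses, here it is unrolled for $d = 2$:

$\beta_4/n = 2^{-2}$, $\beta_5/n = 2 \cdot 2^{-4} = 2^{-3}$, $\beta_6/n = 2 \cdot 2^{-6} = 2^{-5}$, $\beta_7/n = 2^{-9}$, $\beta_8/n = 2^{-17}$, $\beta_9/n = 2^{-33}$, ...

so the exponent roughly doubles at each level, which is where the $\ln \ln n$ in the final bound comes from.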

For $i = 4$, we have $\Pr[\nu_4 \le \beta_4] = 1$: a bin with at least 4 balls accounts for at least 4 of the $n$ balls, so there can never be more than $n/4$ such bins.

Now let $i \ge 4$ and look at $\nu_{i+1}$. Let $X_j$ be the indicator variable for the event [$\nu_i \le \beta_i$ ∧ ball $j$ picks $d$ bins with at least $i$ balls each]. Then $E[X_j \mid X_1, \dots, X_{j-1}] \le (\beta_i/n)^d$, because either (a) at time $j$ there are already more than $\beta_i$ bins with at least $i$ balls, in which case $\nu_i > \beta_i$ at the end of the process (bins never lose balls) and $X_j = 0$; or (b) at time $j$ there are at most $\beta_i$ such bins, the probability that ball $j$ picks $d$ of them is at most $(\beta_i/n)^d$, and the probability that $X_j = 1$ is no more than this (it may be less, either because our estimate of the number of tall bins is too high or because $\nu_i$ later exceeds $\beta_i$). This gives us a sequence of indicator variables where the expectation of each, conditioned on the previous ones, is bounded.

Unfortunately these variables are not independent, so we can't apply Chernoff bounds directly. We could probably apply Azuma-Hoeffding with a bit of work, but there is a sneaky trick that lets us apply Chernoff bounds anyway. The trick is that we can build a coupled collection of independent random variables $Y_1, \dots, Y_n$ where $X_j \le Y_j$ always and each $Y_j = 1$ with probability exactly $(\beta_i/n)^d$. Observe also that when $\nu_i \le \beta_i$, every ball of height $i+1$ or greater picked $d$ bins with at least $i$ balls each, and every bin counted by $\nu_{i+1}$ contains such a ball; so on this event $\nu_{i+1} \le \sum X_j \le \sum Y_j$. We therefore have $\Pr[\nu_i \le \beta_i \wedge \nu_{i+1} > \beta_{i+1}] \le \Pr[\sum X_j > \beta_{i+1}] \le \Pr[\sum Y_j > \beta_{i+1}] = \Pr[\sum Y_j > 2 E[\sum Y_j]] \le (e/2^2)^{np} \le e^{-np/3}$, where $p = (\beta_i/n)^d$ (the last inequality holds because $\ln(4/e) > 1/3$). This bound will be less than $n^{-2}$ as long as $np \ge 6 \ln n$, or $p \ge 6 \ln n / n$. When $np$ gets smaller, we will have to switch to a different method.
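
The coupling is easiest to see as a little program. Here is a generic sketch (the function coupled_indicators and the toy suppression rule are made up for illustration; this is not the balls-and-bins process itself): drive each dependent indicator and an independent Bernoulli($p$) variable off the same uniform random number, so that $X_j \le Y_j$ always while the $Y_j$ remain independent.

    import random

    def coupled_indicators(cond_prob, p, n, seed=1):
        """Generic coupling sketch: cond_prob(history) returns the
        conditional probability that the next indicator is 1 given the
        outcomes so far, and must never exceed p.  Each step draws one
        uniform u and sets X_j = [u < cond_prob(history)] and
        Y_j = [u < p], so the Y_j are i.i.d. Bernoulli(p) and
        X_j <= Y_j always."""
        rng = random.Random(seed)
        xs, ys = [], []
        for _ in range(n):
            pj = cond_prob(xs)
            assert pj <= p
            u = rng.random()
            xs.append(1 if u < pj else 0)
            ys.append(1 if u < p else 0)
        return xs, ys

    # Toy example: each indicator fires with probability 0.1 unless one
    # of the previous two fired, in which case it is suppressed.
    xs, ys = coupled_indicators(lambda h: 0.0 if any(h[-2:]) else 0.1, 0.1, 20)
    print(sum(xs), sum(ys), all(x <= y for x, y in zip(xs, ys)))

Since each $Y_j$ depends only on its own uniform, the $Y_j$ are independent even though the $X_j$ are not, so Chernoff bounds apply to $\sum Y_j$, which dominates $\sum X_j$.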

4. Solving the recurrence

MitzenmacherUpfal give the solution to the recurrence $\beta_{i+1}/n = 2(\beta_i/n)^d$ as $\beta_{i+4} = \frac{n}{2^{2d^i-\sum_{j=0}^{i-1} d^j}}$. The proof is by the usual plug-and-chug method, so we will omit it here. Since the sum in the exponent is annoying, we can simplify the bound to $\beta_{i+4}/n = 2^{-\Theta(d^i)}$ by observing that the sum is $(d^i-1)/(d-1) = \Theta(d^i)$ for constant $d$. So the break-even point where $p = (\beta_i/n)^d$ drops below $6 \ln n / n$ is at some $i^* = \ln \lg n / \ln d + \Theta(1) = \ln \ln n / \ln d + \Theta(1)$.
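
Here is a quick numeric sketch of where the break-even point lands (the helper i_star and the sample values of n are just for illustration):

    import math

    def i_star(n, d=2):
        """Iterate beta_{i+1}/n = 2*(beta_i/n)**d starting from
        beta_4/n = 1/4 and return the first level i at which
        p = (beta_i/n)**d drops below 6*ln(n)/n."""
        x, i = 0.25, 4
        threshold = 6 * math.log(n) / n
        while x ** d >= threshold:
            x = 2 * x ** d
            i += 1
        return i

    for n in (10**4, 10**6, 10**9):
        print(n, i_star(n), round(math.log(math.log(n)) / math.log(2), 2))

The last two columns track each other up to the additive constant in the bound.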

5. Final step

So here we are at $i^*$, and we have $\nu_{i^*} \le \beta_{i^*}$ with high probability. We also have that $p = (\beta_{i^*}/n)^d \le 6 \ln n / n$ (otherwise we could keep using the Chernoff bounds). By the coupling argument from above, we get $\Pr[\nu_{i^*} \le \beta_{i^*} \wedge \nu_{i^*+1} > 18 \ln n] \le \Pr[B(n, 6 \ln n / n) \ge 18 \ln n]$, where $B(n,p)$ is the usual binomial random variable. From Chernoff bounds, the probability of this event is at most $(e^2/3^3)^{6 \ln n} \le 1/n^2$.
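
For the record, the exponent works out comfortably:

$(e^2/3^3)^{6 \ln n} = n^{-6(\ln 27 - 2)} \approx n^{-7.8} \le 1/n^2$.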

Now let's look at $i^*+2$. Here we argue $\Pr[\nu_{i^*+1} \le 18 \ln n \wedge \nu_{i^*+2} \ge 2] \le \Pr[B(n, (18 \ln n / n)^d) \ge 2] \le \binom{n}{2} (18 \ln n / n)^{2d} = O(\ln^4 n / n^2)$, using $d \ge 2$ in the last step. The $\binom{n}{2}$ counts all the different ways of choosing 2 possible height-$(i^*+2)$ balls, and the $(18 \ln n / n)^{2d}$ bounds the probability that both balls are too tall.
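
For $d = 2$, which is the worst case for this bound, this is

$\binom{n}{2} (18 \ln n / n)^4 \le \frac{n^2}{2} \cdot \frac{18^4 \ln^4 n}{n^4} = O(\ln^4 n / n^2)$,

and larger $d$ only makes it smaller.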

Finally, at $i^*+3$, we have $\Pr[\nu_{i^*+2} \le 1 \wedge \nu_{i^*+3} \ge 1] = 0$, since a ball can reach height $i^*+3$ only by picking $d \ge 2$ distinct bins that each already hold at least $i^*+2$ balls, which is impossible if $\nu_{i^*+2} \le 1$.

Summing up all the probabilities of failure gives $(i^*+2)/n^2 + O(\ln^4 n / n^2) = o(1/n)$.

6. Lower bound

The bound given here is tight up to an additive constant. With n balls in n bins, we also get some bin with at least ln ln n / ln d - O(1) balls, with probability at least 1 - o(1/n). The proof of the lower bound is approximately as horrible as the proof of the upper bound. See MitzenmacherUpfal §14.2 for details.

CategoryRandomizedAlgorithmsNotes

