The expected value or expectation of a quantity X (written E[X], or sometimes just EX) is the average of all possible values of X, weighted by their probabilities. Formally, if X takes on values in some set S, then
E[X] = Σ_{x in S} x·Pr[X=x].
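For concreteness, here is a minimal sketch that evaluates this definition directly; the fair-die distribution and the helper name `expectation` are illustrative choices, not part of the text above:

```python
from fractions import Fraction

def expectation(dist):
    """E[X] = sum over x in S of x * Pr[X=x], straight from the definition."""
    return sum(x * p for x, p in dist.items())

# A fair six-sided die: each value 1..6 has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expectation(die))  # 7/2
```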
From the definition it is not hard to show several useful properties of expectations:
- E[aX] = a E[X] when a is a constant.
- E[X+Y] = E[X]+E[Y].
These properties together are known as linearity of expectation.
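Both properties can be checked by brute-force enumeration over a small joint distribution, as in the sketch below; the particular distribution, in which Y is a deterministic function of X, is invented purely for illustration:

```python
from fractions import Fraction

# A small joint distribution of (X, Y): X is uniform on {1, 2, 3} and Y = X**2,
# so the two variables are certainly not independent.
joint = {(x, x * x): Fraction(1, 3) for x in (1, 2, 3)}
a = 5

E_X   = sum(x * p for (x, y), p in joint.items())
E_Y   = sum(y * p for (x, y), p in joint.items())
E_aX  = sum(a * x * p for (x, y), p in joint.items())
E_sum = sum((x + y) * p for (x, y), p in joint.items())

assert E_aX == a * E_X      # E[aX] = a E[X]
assert E_sum == E_X + E_Y   # E[X+Y] = E[X] + E[Y], with no independence needed
print(E_X, E_Y, E_sum)      # 2, 14/3, 20/3
```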
The additive property applies to any random variables X and Y, even if they are not independent. For example, consider an algorithm that splits an input array of n elements into two arrays, where one contains anywhere from 0 to n elements (with all n+1 possibilities equally likely) and the other contains the rest; it then runs a Θ(n^2) subroutine on both subarrays. Let T1 be the running time of the subroutine on the first array, let T2 be the running time on the second array, and let T be the total running time. Then E[T] = E[T1+T2] = E[T1]+E[T2] = 2E[T1] (by symmetry). The value of E[T1] can be calculated from the definition (it's Θ(n^2)). In this analysis, we can consider each of T1 and T2 separately, even though their actual values are closely linked.
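To see roughly how this plays out numerically, here is a hedged simulation sketch; the function name `estimate_split_costs` and the choice to charge exactly k^2 time for a k-element subarray are assumptions made for illustration, standing in for the Θ(n^2) subroutine:

```python
import random

def estimate_split_costs(n, trials=100_000, seed=1):
    """Estimate E[T1], E[T2], and E[T] when an n-element array is split at a
    uniformly random point k in 0..n and a quadratic-time subroutine
    (modeled here as costing exactly k**2 on k elements) runs on both parts."""
    rng = random.Random(seed)
    t1 = t2 = 0
    for _ in range(trials):
        k = rng.randint(0, n)   # size of the first subarray
        t1 += k ** 2            # stand-in cost for the subroutine on k elements
        t2 += (n - k) ** 2      # same subroutine on the remaining n-k elements
    return t1 / trials, t2 / trials, (t1 + t2) / trials

e1, e2, e = estimate_split_costs(1000)
print(e1, e2, e)   # all three are Theta(n^2); E[T] is about 2 * E[T1]
```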
For more complicated calculations it can sometimes be useful to split the possibilities into several cases. The conditional expectation of a variable X given an event A (written E[X|A]) is the average value of X over all cases where A occurs. Formally, it is defined for a variable X that takes on values in S by
E[X|A] = Σ_{x in S} x·Pr[X=x|A],
where Pr[X=x|A] is the probability that X equals x conditioned on A, which is in turn defined formally by
- Pr[X=x|A] = Pr[X=x and A]/Pr[A].
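As a small illustrative sketch (the fair-die distribution and the event "the roll is even" are invented for the example, not taken from the text), E[X|A] can be computed directly from these two definitions:

```python
from fractions import Fraction

# Fair die again; A is the event that the roll is even.
die = {x: Fraction(1, 6) for x in range(1, 7)}
A = {2, 4, 6}

Pr_A = sum(p for x, p in die.items() if x in A)

# Pr[X=x|A] = Pr[X=x and A] / Pr[A], then E[X|A] = sum of x * Pr[X=x|A].
E_X_given_A = sum(x * (p / Pr_A) for x, p in die.items() if x in A)
print(E_X_given_A)   # 4
```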
The usefulness of conditional expectations is that if we have a set of mutually exclusive events A1, A2, ..., An whose probabilities sum to 1 (representing some set of n cases that together exhaust all possibilities), we can compute the expectation of X by taking a weighted average of the conditional expectations in each case:
E[X] = Σ_{i=1}^n Pr[Ai]·E[X|Ai].
The formal definition of E[X] that we gave above is actually a special case of this formula, obtained by taking each Ai to be the event that X equals a particular value x in S.
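A quick sketch of this case analysis, again using the illustrative fair die with the two cases "even" and "odd" (an assumed example, not from the text), recovers the unconditional expectation:

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}
cases = [{2, 4, 6}, {1, 3, 5}]   # mutually exclusive events whose probabilities sum to 1

def cond_expectation(dist, A):
    """E[X|A] computed from the definition of conditional expectation."""
    pr_A = sum(p for x, p in dist.items() if x in A)
    return sum(x * p for x, p in dist.items() if x in A) / pr_A

# E[X] = sum over cases of Pr[Ai] * E[X|Ai]
total = sum(
    sum(p for x, p in die.items() if x in A) * cond_expectation(die, A)
    for A in cases
)
print(total)   # 7/2, matching the direct definition of E[X]
```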
1. Examples
Suppose X takes on the values 1 through n with equal probabilities. What is E[X^2]? Use the conditional expectation formula with Ai being the event that X=i: E[X^2] = Σ_{i=1}^n i^2·(1/n) = (1/n)·Σ_{i=1}^n i^2 = (1/n)·Θ(n^3) = Θ(n^2).
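A short check of this calculation (the helper name `expected_square` is an illustrative assumption) computes E[X^2] exactly for a few values of n and is consistent with the Θ(n^2) growth:

```python
from fractions import Fraction

def expected_square(n):
    """E[X^2] for X uniform on 1..n, straight from the conditional-expectation sum."""
    return sum(Fraction(i * i, n) for i in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, expected_square(n), float(expected_square(n)) / n ** 2)
# The last column approaches 1/3, matching E[X^2] = (n+1)(2n+1)/6 = Theta(n^2).
```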
- A startup offers you a job that involves working 120 hours per week for a year, at the end of which you will be vested with stock options that will have a value somewhere between $0 and $10 million, with all values in this range equally likely. However, there is a 40% chance that you will die of overwork before you can collect, which standard formulas used in cost-benefit analysis count as a $4 million loss. Your expected return if you don't die is $5 million; your total expected return is thus (-4000000)*0.40+(5000000)*0.60 = $1,400,000, an excellent salary for a year's work. (This example may show why looking only at expectations may not always give a complete picture of random outcomes.)
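The arithmetic in this example can be reproduced in one line; the dollar figures and probabilities are taken directly from the example above:

```python
# Expected return: a 40% chance of the $4 million loss (death from overwork)
# versus a 60% chance of the $5 million expected value of the options.
expected_return = 0.40 * (-4_000_000) + 0.60 * 5_000_000
print(expected_return)   # 1400000.0
```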