The lottery question

If you play the lottery every week for a year, how much can you expect to lose?

What do you think?
A game costs $2 to play and pays $1000 with probability 1/1000. What's the expected value of playing 52 weeks?

The answer is simple: compute the expected value of one play, then multiply by 52. That's linearity of expectation.
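One play nets $1000 with probability 1/1000 against a $2 ticket, so its expected value is 1000 · (1/1000) − 2 = −1 dollar, and 52 plays have expected value −52. A quick sketch (assuming the $2 is paid whether or not you win):

```python
# One lottery play: pay $2, win $1000 with probability 1/1000.
cost = 2
prize = 1000
p_win = 1 / 1000

# Expected value of a single play: expected winnings minus cost.
ev_per_play = prize * p_win - cost   # about -1 dollar per play

# Linearity of expectation: 52 plays lose 52 times as much on average.
ev_year = 52 * ev_per_play           # about -52 dollars

print(ev_per_play, ev_year)
```

So playing every week for a year, you should expect to be down about $52.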

Expected value: a quick review

Expected Value

For a discrete random variable $X$ with PMF $p(x)$:

$$E[X] = \sum_x x \cdot P(X = x)$$

It's the probability-weighted average — the "center of mass" of the distribution.

For a fair die: $E[X] = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + \cdots + 6 \cdot \frac{1}{6} = 3.5$

You can never actually roll a 3.5, but over many rolls, your average will converge to it.
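The convergence claim is easy to check numerically; a minimal sketch (the exact sample average depends on the seed):

```python
import random

random.seed(0)

# Exact expected value: probability-weighted average of the faces.
exact = sum(face * (1 / 6) for face in range(1, 7))   # 3.5

# Simulate many rolls; the running average approaches 3.5
# even though no single roll is ever 3.5.
n = 100_000
average = sum(random.randint(1, 6) for _ in range(n)) / n

print(exact, average)
```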

The linearity property

Linearity of Expectation

For any random variables $X$ and $Y$ (independent or not!) and constants $a$, $b$:

$$E[aX + bY] = a \cdot E[X] + b \cdot E[Y]$$

No independence required. This works even when $X$ and $Y$ are heavily dependent.
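To see that dependence really doesn't matter, take an extreme case: let $X$ be a fair die roll and $Y = 7 - X$, so $Y$ is completely determined by $X$. A sketch using exact rational arithmetic:

```python
from fractions import Fraction

# X = fair die roll; Y = 7 - X is fully determined by X (maximal dependence).
a, b = 2, 3
outcomes = range(1, 7)
p = Fraction(1, 6)

ex = sum(x * p for x in outcomes)            # E[X] = 7/2
ey = sum((7 - x) * p for x in outcomes)      # E[Y] = 7/2

# Compute E[aX + bY] directly over the outcomes...
lhs = sum((a * x + b * (7 - x)) * p for x in outcomes)
# ...and compare with a*E[X] + b*E[Y]. They agree despite the dependence.
rhs = a * ex + b * ey

print(lhs, rhs)  # 35/2 35/2
```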

Try different distributions and scaling constants:

[Interactive: Linearity Explorer — for example, with $a = b = 1$, a fair die ($E[X] = 3.50$) and a fair coin ($E[Y] = 0.50$) give $E[aX + bY] = 1 \cdot (3.50) + 1 \cdot (0.50) = 4.000$.]

Why is this surprising?

For most operations, dependence matters. The variance of a sum depends on whether $X$ and $Y$ are correlated. But expectation doesn't care.

Consider: $X$ = temperature in New York tomorrow, $Y$ = number of umbrellas sold tomorrow. They're clearly dependent! Yet:

$$E[X + Y] = E[X] + E[Y]$$

The formula holds without conditions.

What do you think?
100 people each flip a fair coin. What's the expected total number of Heads?
Enter a whole number

See it in action

Run a simulation: X = coin flip (0/1), Y = fair die roll (1–6). Theory predicts E[X] = 0.5, E[Y] = 3.5, E[X+Y] = 4.0. Watch the averages converge:

[Interactive: Linearity Simulation — running averages avg(X), avg(Y), and avg(X+Y) approach 0.5, 3.5, and 4.0 as samples accumulate.]
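The same simulation is easy to reproduce in a few lines; a sketch (the averages vary slightly with the seed):

```python
import random

random.seed(42)

# X = coin flip (0 or 1), Y = fair die roll (1-6).
n = 200_000
sum_x = sum_y = sum_xy = 0
for _ in range(n):
    x = random.randint(0, 1)
    y = random.randint(1, 6)
    sum_x += x
    sum_y += y
    sum_xy += x + y

print(sum_x / n)    # approaches E[X] = 0.5
print(sum_y / n)    # approaches E[Y] = 3.5
print(sum_xy / n)   # approaches E[X + Y] = 4.0
```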

Scaling and shifting

Linearity includes two sub-rules:

| Rule | Formula | Example |
| --- | --- | --- |
| Scaling | $E[aX] = a \cdot E[X]$ | Double all outcomes → double the mean |
| Shifting | $E[X + c] = E[X] + c$ | Add 5 to every outcome → add 5 to the mean |
| Combined | $E[aX + b] = aE[X] + b$ | Converting Celsius to Fahrenheit |
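The combined rule is exactly how unit conversions behave: with $F = 1.8C + 32$, we get $E[F] = 1.8\,E[C] + 32$. A sketch using a made-up temperature distribution:

```python
# Combined rule E[aX + b] = a*E[X] + b, with a = 1.8, b = 32
# (Celsius to Fahrenheit). The distribution below is invented for illustration.
temps_c = {18: 0.2, 20: 0.5, 22: 0.3}   # Celsius value -> probability

e_c = sum(c * p for c, p in temps_c.items())   # E[C]

# Convert every outcome first, then take the expectation...
e_f_direct = sum((1.8 * c + 32) * p for c, p in temps_c.items())
# ...or convert the expectation directly. Same answer, by linearity.
e_f_linear = 1.8 * e_c + 32

print(e_c, e_f_direct, e_f_linear)
```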

A restaurant tip calculation

Your bill is a random variable $B$ with $E[B] = 40$. You tip 20%, so your total is $T = 1.20B$.

What is E[T] = E[1.20B]? (whole number)
Your friend's bill has E[B₂] = 35. What's the expected combined total (with 20% tip each)? (whole number)

A preview: this extends to N variables

Linearity generalizes to any number of random variables:

$$E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$$

We'll use this for indicator variable problems in the next lesson.
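As a taste of that technique: the expected number of Heads in 100 fair coin flips drops out immediately, since each flip contributes $E[X_i] = 0.5$ to the sum.

```python
# Linearity over n variables: E[X_1 + ... + X_n] = E[X_1] + ... + E[X_n].
# Example: total Heads in 100 fair coin flips (Heads = 1, Tails = 0).
n = 100
e_single = 0.5            # E[X_i] for one fair coin flip
e_total = n * e_single    # no binomial distribution needed

print(e_total)  # 50.0
```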

Summary

| Concept | Key Formula |
| --- | --- |
| Expected value | $E[X] = \sum_x x \cdot P(X = x)$ |
| Linearity (sum) | $E[X + Y] = E[X] + E[Y]$ |
| Linearity (scaled) | $E[aX + b] = aE[X] + b$ |
| Independence? | Not needed for linearity! |

When you see "expected value of a sum," immediately think linearity. Don't try to find the distribution of the sum — just add the expectations.

Test your understanding

Roll 3 fair dice. What's E[sum]? (decimal, e.g. 0.42)
A company has 200 employees. Each has a 2% chance of calling in sick. Expected sick calls? (whole number)
X and Y are dependent with E[X]=7, E[Y]=3. What is E[X+Y]? (whole number)

What's next

We'll use linearity with indicator variables, a technique that turns hard counting problems into simple sums.