The risk puzzle

Two mutual funds both averaged 8% annual returns over 10 years. Yet one investor made steady gains while the other had a rollercoaster ride. What's different?

What do you think?
Fund A returned exactly 8% every year. Fund B returned alternating +28% and -12%. Both average 8%. Which is riskier?

The mean tells you the center. Variance tells you how far outcomes typically stray from that center.
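To make the comparison concrete, here is a minimal sketch that computes the mean and the average squared deviation for both funds. The ten-year return lists are built from the description above, and the helper names `mean` and `variance` are illustrative rather than taken from any library.

```python
# Annual returns in percent, built from the puzzle's description.
fund_a = [8.0] * 10            # exactly 8% every year
fund_b = [28.0, -12.0] * 5     # alternating +28% and -12%

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def variance(values):
    """Average squared deviation from the mean (population variance)."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

for name, returns in [("Fund A", fund_a), ("Fund B", fund_b)]:
    print(name, "mean:", mean(returns), "variance:", variance(returns))
```

Both means come out to 8, but Fund A's variance is 0 while Fund B's is 400 (in squared percentage points), which is the difference the mean alone cannot show.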

Variance: measuring spread

Variance

The variance of a random variable $X$ is: $$\text{Var}(X) = E[(X - \mu)^2] = E[X^2] - (E[X])^2$$ where $\mu = E[X]$. It measures the average squared distance from the mean.
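The two forms of the definition agree because of linearity of expectation. A short derivation, treating $\mu = E[X]$ as a constant:

```latex
\begin{aligned}
E[(X - \mu)^2] &= E[X^2 - 2\mu X + \mu^2] \\
               &= E[X^2] - 2\mu E[X] + \mu^2 \\
               &= E[X^2] - 2\mu^2 + \mu^2 \\
               &= E[X^2] - (E[X])^2
\end{aligned}
```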

Why square the deviations? Because positive and negative deviations cancel exactly: $E[X - \mu] = E[X] - \mu = 0$ for every distribution. Squaring makes every deviation count as a positive amount.

The second formula, $E[X^2] - (E[X])^2$, is almost always easier to compute.
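As a quick check that both formulas give the same number, the sketch below evaluates each one exactly with fractions. The distribution used (number of heads in two fair coin flips) and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Hypothetical example distribution: number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expectation(pmf, f=lambda x: x):
    """E[f(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * f(x) for x, p in pmf.items())

def variance_by_definition(pmf):
    """Var(X) = E[(X - mu)^2]."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

def variance_by_shortcut(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    return expectation(pmf, lambda x: x ** 2) - expectation(pmf) ** 2

print(variance_by_definition(pmf))  # 1/2
print(variance_by_shortcut(pmf))    # 1/2
```

The shortcut needs only two plain expectations, $E[X]$ and $E[X^2]$, whereas the definition requires the mean first and then a second pass over the deviations.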

What do you think?
If X is constant (always equals 5), what is Var(X)? (whole number)

Standard deviation

Standard Deviation

The standard deviation is the square root of variance: $$\text{SD}(X) = \sigma = \sqrt{\text{Var}(X)}$$ It has the same units as $X$, making it more interpretable than variance.

If $X$ is measured in dollars, $\text{Var}(X)$ is in "dollars squared" (meaningless!), but $\text{SD}(X)$ is back in dollars.

Drag and explore

Drag points away from the mean to see how variance responds. Notice that outliers have a disproportionate effect because of the squaring:

The Spread Stretch (interactive figure: 12 draggable points on a number line from 0 to 14)
Drag points left/right to change their values. Watch variance respond.

Initial points: 2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 8, 10
Mean (μ): 6.17
Variance (σ²): 3.97
Std Dev (σ): 1.99
Points: 12

Squared deviations from mean:
(-4.2)² = 17.4, (-2.2)² = 4.7, (-1.2)² = 1.4, (-1.2)² = 1.4, (-0.2)² = 0.0, (-0.2)² = 0.0, (-0.2)² = 0.0, (+0.8)² = 0.7, (+0.8)² = 0.7, (+1.8)² = 3.4, (+1.8)² = 3.4, (+3.8)² = 14.7
Sum = 47.7 ÷ 12 = 3.97
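The figure's starting numbers can be reproduced with Python's standard `statistics` module. This is a sketch that assumes the twelve point values are 2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 8, 10, as read off the number line, and that the figure divides by n (population variance), which matches the displayed sum ÷ 12.

```python
import statistics

# Point values read off the figure's number line (12 points, an assumption).
points = [2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 8, 10]

mu = statistics.fmean(points)
var = statistics.pvariance(points)   # population variance: divide by n
sd = statistics.pstdev(points)       # square root of the population variance

print(round(mu, 2), round(var, 2), round(sd, 2))  # 6.17 3.97 1.99

# Share of the total squared deviation contributed by the two extreme points.
squared = [(x - mu) ** 2 for x in points]
extreme_share = (squared[0] + squared[-1]) / sum(squared)
print(round(extreme_share, 2))  # 0.67
```

The two outermost points (2 and 10) account for about two thirds of the total squared deviation, which is the disproportionate effect of squaring.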

Key properties of variance

Variance Rules

For constants $a$ and $b$: $$\text{Var}(aX + b) = a^2 \cdot \text{Var}(X)$$ Shifting by $b$ doesn't change the spread. Scaling by $a$ multiplies variance by $a^2$.

Shifting does not affect variance. Adding a constant to every outcome moves the center but doesn't change how spread out the data is.
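The rule follows directly from the definition of variance. A short derivation, using $E[aX + b] = a\mu + b$ with $\mu = E[X]$:

```latex
\begin{aligned}
\text{Var}(aX + b) &= E\big[(aX + b - (a\mu + b))^2\big] \\
                   &= E\big[a^2 (X - \mu)^2\big] \\
                   &= a^2 \, E[(X - \mu)^2] = a^2 \, \text{Var}(X)
\end{aligned}
```

The constant $b$ cancels in the first line, which is why it never reaches the result.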

| Operation | Effect on Mean | Effect on Variance |
| --- | --- | --- |
| $X + c$ | $\mu + c$ | No change |
| $aX$ | $a\mu$ | $a^2 \sigma^2$ |
| $aX + b$ | $a\mu + b$ | $a^2 \sigma^2$ |

Experiment with scaling and shifting to build intuition for why only $a$ (not $b$) affects variance:

Variance Scaling Lab (interactive figure with controls for a and b)
Transform Y = aX + b where X is a fair die roll

Default setting: Y = 1.0X + 0 (no transformation)
E[X]: 3.50
E[Y] = aE[X] + b: 3.50
Var(X): 2.92
Var(Y) = a²Var(X): 2.92
Try changing a or b!
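The same experiment can be run in code. The sketch below transforms a fair die exactly with fractions; the values `a = 3` and `b = 10` are illustrative choices, and the helper names are hypothetical.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def mean(pmf):
    """E[X] for a distribution given as {value: probability}."""
    return sum(p * x for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - mu)^2]."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def transform(pmf, a, b):
    """Distribution of Y = a*X + b (assumes a != 0 so values stay distinct)."""
    return {a * x + b: p for x, p in pmf.items()}

a, b = 3, 10   # illustrative choices
y = transform(die, a, b)

print(mean(die), variance(die))   # 7/2 35/12
print(mean(y), variance(y))       # 41/2 105/4
print(variance(y) == a ** 2 * variance(die))              # True
print(variance(transform(die, 1, 100)) == variance(die))  # True
```

Here $35/12 \approx 2.92$ matches the Var(X) shown for the die, and shifting by 100 leaves the variance untouched.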

Variance of a sum

For independent random variables:

$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) \quad \text{(if } X \perp Y\text{)}$$

Unlike linearity of expectation, this requires independence (more precisely, zero covariance). In general, for possibly dependent variables: $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\,\text{Cov}(X, Y)$.
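To see the difference between the independent and dependent cases, the sketch below enumerates outcomes exactly. It compares two separate dice with the extreme dependent case where $Y$ is the same roll as $X$; both setups are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def variance(outcomes):
    """Population variance of equally likely outcomes, computed exactly."""
    values = [Fraction(v) for v in outcomes]
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

var_x = variance(faces)  # a single fair die

# Independent case: X and Y are two separate dice, 36 equally likely pairs.
var_sum_independent = variance(x + y for x, y in product(faces, faces))

# Fully dependent case: Y is the same roll as X, so X + Y = 2X.
var_sum_dependent = variance(x + x for x in faces)

print(var_x)                 # 35/12
print(var_sum_independent)   # 35/6
print(var_sum_dependent)     # 35/3
```

In the independent case the variances simply add. In the dependent case $\text{Cov}(X, X) = \text{Var}(X)$, so the general formula gives $\text{Var}(X) + \text{Var}(X) + 2\,\text{Var}(X)$, four times the single-die variance.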

Worked example: dice

For a single fair die roll $X$:

Variance of a fair die
Step 1 of 4 (mean): $E[X] = \frac{1+2+3+4+5+6}{6} = 3.5$
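The calculation can be finished with the shortcut formula. This is one way to complete it, not necessarily the original four-step breakdown:

```latex
\begin{aligned}
E[X^2] &= \frac{1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2}{6} = \frac{91}{6} \approx 15.17 \\
\text{Var}(X) &= E[X^2] - (E[X])^2 = \frac{91}{6} - \frac{49}{4} = \frac{35}{12} \approx 2.92 \\
\text{SD}(X) &= \sqrt{35/12} \approx 1.71
\end{aligned}
```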

Practice problems

X has E[X] = 10 and E[X²] = 120. What is Var(X)? (whole number)
If Var(X) = 9, what is Var(3X)? (whole number)
If Var(X) = 9, what is Var(X + 100)? (whole number)

Summary

| Concept | Formula |
| --- | --- |
| Variance | $\text{Var}(X) = E[X^2] - (E[X])^2$ |
| Standard deviation | $\sigma = \sqrt{\text{Var}(X)}$ |
| Scaling rule | $\text{Var}(aX + b) = a^2 \text{Var}(X)$ |
| Sum (independent) | $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$ |

Variance captures risk, volatility, and uncertainty. When two distributions have the same mean, variance is what separates them.

Test your understanding

A coin flip: X = 1 (heads) or 0 (tails), each with probability 1/2. What is Var(X)? (decimal, e.g. 0.42)
You flip 100 independent fair coins. What is the variance of the total number of heads? (whole number)
Temperature is converted via F = 1.8C + 32. If Var(C) = 4, what is Var(F)? (decimal, e.g. 0.42)

What's next

Variance tells us how spread out a distribution is, but not whether that spread is lopsided or concentrated in the tails. Next we'll explore skewness (asymmetry) and kurtosis (tail heaviness).