The chaos becomes order

Roll a single die: the outcomes are flat, not bell-shaped at all. But sum 30 dice rolls? The result looks approximately Normal. The same thing happens for any starting distribution with finite variance.
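The dice claim can be checked with a short simulation. This is a minimal sketch, not part of the lesson's own tooling: the function name, trial count, and seed are illustrative choices. It sums 30 fair dice many times and compares the results with the theoretical mean and SD of the sum (one die has mean 3.5 and variance 35/12).

```python
import random
import statistics

def simulate_dice_sums(num_dice=30, num_trials=20_000, seed=0):
    """Return a list of sums, each the total of `num_dice` fair six-sided dice."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    return [sum(rng.randint(1, 6) for _ in range(num_dice))
            for _ in range(num_trials)]

sums = simulate_dice_sums()

# Theory: one fair die has mean 3.5 and variance 35/12, and both add across dice.
theory_mean = 30 * 3.5
theory_sd = (30 * 35 / 12) ** 0.5

observed_mean = statistics.fmean(sums)
observed_sd = statistics.pstdev(sums)

# A Normal distribution puts roughly 68% of its mass within one SD of the mean.
within_one_sd = sum(abs(s - theory_mean) <= theory_sd for s in sums) / len(sums)

print(f"mean: observed {observed_mean:.2f}, theory {theory_mean:.2f}")
print(f"SD:   observed {observed_sd:.2f}, theory {theory_sd:.2f}")
print(f"share within one SD: {within_one_sd:.3f}")
```

Because the sums are integers, the share within one SD will not match the Normal figure exactly, but it should land close to it.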

What do you think?
You sum 50 independent samples from a wildly skewed distribution. What does the sum's distribution look like?

This is the Central Limit Theorem.

Statement of the CLT

Central Limit Theorem

Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. Then as $n \to \infty$:

$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$$

Equivalently, the sum $S_n = X_1 + \cdots + X_n$ is approximately $N(n\mu, n\sigma^2)$ for large $n$.

If you standardize the sample mean, it becomes approximately standard Normal for large $n$, regardless of the shape of the original distribution.
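To see the standardization at work, the sketch below draws samples from Expo(1), which has mean 1 and standard deviation 1, and standardizes each sample mean by subtracting the mean and dividing by the standard deviation over the square root of n. The sample size, trial count, seed, and helper name are assumptions made for illustration.

```python
import math
import random
import statistics

def standardized_means(n, num_trials=10_000, seed=1):
    """Standardized sample means of n Expo(1) draws (mu = 1, sigma = 1)."""
    rng = random.Random(seed)  # arbitrary fixed seed for repeatability
    mu, sigma = 1.0, 1.0
    z_values = []
    for _ in range(num_trials):
        sample_mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        # Standardize: subtract the mean, divide by the standard error.
        z_values.append((sample_mean - mu) / (sigma / math.sqrt(n)))
    return z_values

z = standardized_means(n=50)
z_mean = statistics.fmean(z)
z_sd = statistics.pstdev(z)

# Under N(0, 1), P(Z <= 1.96) is about 0.975.
share_below_196 = sum(v <= 1.96 for v in z) / len(z)

print(f"mean of Z: {z_mean:.3f} (N(0,1) has 0)")
print(f"SD of Z:   {z_sd:.3f} (N(0,1) has 1)")
print(f"P(Z <= 1.96): {share_below_196:.3f} (N(0,1) has about 0.975)")
```

The mean and SD should sit close to 0 and 1. The tail share will be near, but not exactly, the Normal value, because at n = 50 some of the exponential's skew is still visible.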

Requirements: The $X_i$ must be independent and identically distributed, with finite mean and finite variance. If the variance is not finite (e.g., the Cauchy distribution, whose mean and variance are both undefined), the CLT does not apply.
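The Cauchy exception can be made concrete with a simulation. The sketch below (helper names, trial counts, and seed are illustrative assumptions) measures the interquartile range of sample means as n grows. For Normal draws the spread shrinks as n grows; for Cauchy draws it should stay near 2, the interquartile range of a single standard Cauchy draw, because the mean of n standard Cauchy variables is itself standard Cauchy.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    """One standard Cauchy draw via the inverse-CDF transform tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def normal_draw(rng):
    """One standard Normal draw, used as the well-behaved comparison."""
    return rng.gauss(0.0, 1.0)

def iqr_of_sample_means(draw, n, num_trials=2_000, seed=2):
    """Interquartile range of `num_trials` sample means, each taken over n draws."""
    rng = random.Random(seed)  # arbitrary fixed seed for repeatability
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(num_trials)]
    q1, _, q3 = statistics.quantiles(means, n=4)  # three quartile cut points
    return q3 - q1

sample_sizes = (1, 100, 400)
cauchy_iqr = {n: iqr_of_sample_means(cauchy_draw, n) for n in sample_sizes}
normal_iqr = {n: iqr_of_sample_means(normal_draw, n) for n in sample_sizes}

for n in sample_sizes:
    print(f"n={n:>3}: Cauchy IQR {cauchy_iqr[n]:.3f}, Normal IQR {normal_iqr[n]:.3f}")
```

The interquartile range is used instead of the standard deviation because the sample SD of Cauchy data is itself unstable.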

See it yourself

Draw from a crazy distribution (exponential, uniform, or even bimodal). Watch the distribution of the sum converge to a bell curve as you increase $n$:

[Interactive demo: Central Limit Theorem. Pick a source distribution to draw from and generate sums; the panel reports the number of sums generated, the theoretical E[S] and SD(S), and the observed mean.]

Why does it work? Intuition

[Interactive walkthrough: CLT Intuition, step 1 of 5. Start with any distribution: each $X_i$ has some arbitrary shape (skewed, flat, bimodal, ...).]

The mathematical proof uses moment-generating functions (when the MGF exists): the MGF of the standardized sum converges to $e^{t^2/2}$, which is the MGF of $N(0,1)$.
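A sketch of that argument, under the assumption that the MGF of the $X_i$ exists in a neighborhood of 0 (the fully general proof uses characteristic functions instead). The symbols $Y_i$, $Z_n$, and $M_Y$ are introduced here for illustration. The key steps are a second-order expansion of the MGF of one standardized variable and the fact that the MGF of a sum of independent variables is the product of their MGFs:

```latex
\begin{align*}
Y_i &= \frac{X_i - \mu}{\sigma}, \qquad E[Y_i] = 0, \quad \operatorname{Var}(Y_i) = 1, \\
Z_n &= \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Y_i, \\
M_Y(s) &= E\left[e^{sY_i}\right] = 1 + \frac{s^2}{2} + o(s^2) \quad (s \to 0), \\
M_{Z_n}(t) &= \left[ M_Y\!\left(\frac{t}{\sqrt{n}}\right) \right]^n
            = \left[ 1 + \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right) \right]^n
            \longrightarrow e^{t^2/2} \quad (n \to \infty).
\end{align*}
```

The linear term vanishes because $E[Y_i] = 0$, and the coefficient of $s^2$ is $1/2$ because $E[Y_i^2] = 1$. Nothing about the shape of the $X_i$ beyond the mean and variance survives in the limit.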

The CLT in practice

How large must $n$ be?

| Starting distribution | Minimum $n$ for a good approximation |
| --- | --- |
| Symmetric (e.g., Uniform) | $n \geq 10$ |
| Mildly skewed | $n \geq 20$ |
| Heavily skewed (e.g., Exponential) | $n \geq 30$ |
| Very heavy tails | $n \geq 50$ or more |

These thresholds are rules of thumb, not guarantees. The more "non-Normal" the original distribution, the more samples you need.
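One way to quantify "non-Normal" is skewness, which is 0 for a Normal distribution. For a sum of n i.i.d. variables, the skewness equals the skewness of a single variable divided by the square root of n, so for Expo(1) (skewness 2) it is 2 divided by the square root of n. The sketch below estimates it by simulation; the helper names, trial count, and seed are illustrative assumptions.

```python
import math
import random
import statistics

def sample_skewness(values):
    """Sample skewness: the average cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

def skewness_of_expo_sums(n, num_trials=20_000, seed=3):
    """Estimated skewness of the sum of n independent Expo(1) draws."""
    rng = random.Random(seed)  # arbitrary fixed seed for repeatability
    sums = [sum(rng.expovariate(1.0) for _ in range(n))
            for _ in range(num_trials)]
    return sample_skewness(sums)

results = {n: skewness_of_expo_sums(n) for n in (5, 30, 100)}

for n, observed in results.items():
    theory = 2 / math.sqrt(n)  # skewness of one Expo(1) draw is 2
    print(f"n={n:>3}: observed skewness {observed:.3f}, theory {theory:.3f}")
```

The slow decay explains why heavily skewed sources need larger n: going from n = 30 to n = 100 more than triples the sample size but only roughly halves the skewness.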

Polling and elections

What do you think?
A pollster surveys n = 1000 voters. Each responds Yes (1) or No (0) with unknown probability p. By the CLT, the sample proportion p̂ is approximately Normal. If the true p = 0.52, what's the standard error? (decimal to 4 places, e.g. 0.1234)
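The calculation behind this kind of question: each response is a Bernoulli(p) variable with variance p(1 - p), so the standard error of the sample proportion is the square root of p(1 - p)/n. A minimal sketch follows, using a hypothetical poll with different numbers from the question above; the function name and the example values are assumptions for illustration.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)

# Hypothetical poll (not the one in the question): p = 0.3, n = 2100.
p, n = 0.3, 2100
se = proportion_standard_error(p, n)

# Under the Normal approximation, about 95% of sample proportions
# fall within 1.96 standard errors of p.
low, high = p - 1.96 * se, p + 1.96 * se

print(f"SE = {se:.4f}")
print(f"approximate 95% range for p-hat: ({low:.4f}, {high:.4f})")
```

For these values the standard error works out to 0.01, so quadrupling the sample size would only halve it.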

The CLT connects everything

The CLT helps explain why the Normal distribution appears so often: many quantities are sums of many small, roughly independent effects. Test scores combine many small factors (preparation, sleep, luck). Measurement errors accumulate from many tiny perturbations. Stock returns aggregate many trades. Heights are shaped by many genetic and environmental factors.

The Normal distribution is the attractor: the shape everything converges to when you add enough independent things together.

Continuity correction

When approximating a discrete distribution (like Binomial) with the Normal:

$$P(X \leq k) \approx \Phi\left(\frac{k + 0.5 - np}{\sqrt{np(1-p)}}\right)$$

The $+0.5$ is the continuity correction: it accounts for the fact that the Normal is continuous but the Binomial is discrete.
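The effect of the correction is easy to check numerically. The sketch below compares the exact Binomial probability with the Normal approximation, with and without the 0.5 shift. The function names and the values n = 40, p = 0.5, k = 22 are illustrative assumptions.

```python
import math

def normal_cdf(z):
    """Standard Normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(k + 1))

def normal_approx(k, n, p, correction=True):
    """Normal approximation to P(X <= k), optionally with the +0.5 correction."""
    shift = 0.5 if correction else 0.0
    return normal_cdf((k + shift - n * p) / math.sqrt(n * p * (1 - p)))

n, p, k = 40, 0.5, 22  # illustrative values
exact = binomial_cdf(k, n, p)
with_cc = normal_approx(k, n, p, correction=True)
without_cc = normal_approx(k, n, p, correction=False)

print(f"exact Binomial:        {exact:.4f}")
print(f"Normal with +0.5:      {with_cc:.4f}")
print(f"Normal without +0.5:   {without_cc:.4f}")
```

For these values the corrected approximation lands much closer to the exact answer than the uncorrected one.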

Practice problems

You sum 100 independent Expo(1) random variables. What is the approximate mean of the sum? (whole number)
Same 100 Expo(1) variables. What is the approximate standard deviation of the sum? (whole number)
You average 400 fair coin flips. The sample mean p̂ is approximately N(0.5, σ²). What is σ (the standard deviation of p̂)? (decimal to 3 places, e.g. 0.456)

CLT vs LLN

| | Law of Large Numbers | Central Limit Theorem |
| --- | --- | --- |
| Says | $\bar{X}_n \to \mu$ | $\bar{X}_n$ is approximately Normal |
| Type | Convergence in probability | Convergence in distribution |
| Tells you | Where the average goes | How it fluctuates around there |
| Key formula | $P(\lvert \bar{X}_n - \mu \rvert > \epsilon) \to 0$ | $\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \approx N(0,1)$ |

The LLN says the average converges. The CLT says how fast and in what shape it converges. The LLN is qualitative; the CLT is quantitative.
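Both statements show up in one simulation. The sketch below averages n Uniform(0, 1) draws (mean 0.5, variance 1/12) for several n; the helper name, trial count, and seed are illustrative assumptions. The center of the sample means staying near 0.5 is the LLN at work, and the spread tracking the standard deviation divided by the square root of n is the scale the CLT predicts.

```python
import math
import random
import statistics

def sample_means(n, num_trials=5_000, seed=4):
    """Sample means of n Uniform(0, 1) draws (mu = 0.5, sigma^2 = 1/12)."""
    rng = random.Random(seed)  # arbitrary fixed seed for repeatability
    return [sum(rng.random() for _ in range(n)) / n for _ in range(num_trials)]

sigma = math.sqrt(1 / 12)
rows = []
for n in (25, 100, 400):
    means = sample_means(n)
    center = statistics.fmean(means)   # LLN: should sit near 0.5
    spread = statistics.pstdev(means)  # CLT scale: should track sigma / sqrt(n)
    rows.append((n, center, spread, sigma / math.sqrt(n)))

for n, center, spread, predicted in rows:
    print(f"n={n:>3}: center {center:.4f}, spread {spread:.4f}, "
          f"sigma/sqrt(n) {predicted:.4f}")
```

Each fourfold increase in n should roughly halve the spread.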

Summary

| Concept | Key formula |
| --- | --- |
| CLT (standardized) | $\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0,1)$ |
| Sum version | $S_n \approx N(n\mu, n\sigma^2)$ |
| Standard error | $\text{SE} = \sigma / \sqrt{n}$ |
| Continuity correction | Replace $k$ with $k \pm 0.5$ for discrete→continuous |

Test your understanding

The CLT requires independence and finite ___. (one word)
You sum n = 36 i.i.d. variables with mean 10 and variance 9. The sum is approximately Normal with mean ___ and variance ___. What is the mean? (whole number)
Same setup: what is the standard deviation of the sum? (whole number)

What's next

With the CLT in hand, we're ready to tackle Markov chains, where the future depends only on the present.