The bell curve

Heights, test scores, measurement errors, stock returns — an astonishing range of real-world quantities pile up in the middle and thin out at the extremes. This bell-shaped pattern isn't a coincidence. It's a consequence of adding up many small random effects.

What do you think?
If you sum 100 fair coin flips (each ±1), what shape does the histogram of possible totals form?

The PDF

Normal Distribution

A random variable $X$ has a Normal (Gaussian) distribution with mean $\mu$ and variance $\sigma^2$ if its PDF is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

We write $X \sim N(\mu, \sigma^2)$.

The curve is:

  • Symmetric about $\mu$
  • Bell-shaped: highest at $\mu$, decaying exponentially in the tails
  • Determined entirely by two parameters: $\mu$ (center) and $\sigma$ (spread)
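
As a rough illustration of the density formula, here is a minimal Python sketch. The function name `normal_pdf` and the grid used for the area check are choices made for this example, not part of the lesson. It evaluates the PDF directly and checks the peak height, the symmetry about the mean, and that the total area is close to 1.

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x; sigma must be positive."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The peak of the standard Normal sits at the mean, with height 1 / sqrt(2*pi).
peak = normal_pdf(0.0)

# Symmetry: points equally far from the mean have the same density.
left, right = normal_pdf(-1.5), normal_pdf(1.5)

# Crude Riemann sum over [-8, 8] (an assumed range wide enough that the
# tails outside it are negligible); the total area should be close to 1.
h = 0.001
area = sum(normal_pdf(-8.0 + i * h) for i in range(16000)) * h

print(peak, left == right, area)
```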

[Interactive figure: Normal Distribution. Sliders set the mean μ and standard deviation σ; the shaded regions at ±1σ, ±2σ, and ±3σ cover 68.27%, 95.45%, and 99.73% of the area.]

Move $\mu$ to shift the bell. Change $\sigma$ to widen or narrow it. Toggle the $\sigma$-regions to see the famous 68-95-99.7 rule in action.

The 68-95-99.7 rule

Empirical Rule

For any Normal distribution $N(\mu, \sigma^2)$:

  • 68.27% of values fall within $\pm 1\sigma$ of the mean
  • 95.45% of values fall within $\pm 2\sigma$ of the mean
  • 99.73% of values fall within $\pm 3\sigma$ of the mean

This rule gives you fast mental estimates without any calculation.
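
The three percentages can be reproduced from the error function, since $P(|X - \mu| \le k\sigma) = \operatorname{erf}(k/\sqrt{2})$ for any Normal distribution. The sketch below uses Python's standard `math.erf`; the helper name `prob_within` is made up for this example.

```python
import math

def prob_within(k: float) -> float:
    """P(|X - mu| <= k * sigma) for any Normal distribution."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within ±{k}σ: {100 * prob_within(k):.2f}%")
# within ±1σ: 68.27%
# within ±2σ: 95.45%
# within ±3σ: 99.73%
```

Because the result depends only on $k$, the same three numbers apply to every choice of $\mu$ and $\sigma$.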

SAT scores follow N(1060, 200²). About what percentage score between 860 and 1260? (whole number)
For N(100, 15²), approximately what percentage of values are below 70? (decimal, e.g. 0.42)

The standard Normal

Working with $N(\mu, \sigma^2)$ directly would require a different table for every $\mu$ and $\sigma$. Instead, we standardize.

Standard Normal & Z-Score

The standard Normal is $Z \sim N(0, 1)$. Any Normal can be standardized:

$$Z = \frac{X - \mu}{\sigma}$$

This $Z$ tells you how many standard deviations $X$ is from the mean.

Converting to z-scores reduces every Normal problem to the same distribution.
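
To make the reduction concrete, the following sketch standardizes a value and then looks up its probability under the standard Normal. The numbers ($\mu = 50$, $\sigma = 10$, $x = 65$) are hypothetical, and the helper names are illustrative only; the CDF is written in terms of `math.erf` from the standard library.

```python
import math

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def standard_normal_cdf(z: float) -> float:
    """P(Z <= z) for Z ~ N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: X ~ N(50, 10^2). How likely is X <= 65?
z = z_score(65.0, mu=50.0, sigma=10.0)  # 1.5 standard deviations above the mean
p = standard_normal_cdf(z)              # roughly 0.933
print(z, round(p, 3))
```

Only `standard_normal_cdf` is needed no matter which $\mu$ and $\sigma$ the problem starts from, which is the point of standardizing.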

[Interactive figure: Z-Score Scale. Shows the tail probability beyond a chosen z-score, for example about 2.3% beyond 2σ.]

What do you think?
Heights of adult men follow N(178, 7²) cm. A man is 192 cm tall. What is his z-score? (whole number)

Properties

Normal Properties

If $X \sim N(\mu, \sigma^2)$:

  • $E[X] = \mu$
  • $\text{Var}(X) = \sigma^2$
  • $aX + b \sim N(a\mu + b,\; a^2\sigma^2)$
  • If $X_1, \ldots, X_n$ are independent Normal, then $\sum X_i$ is also Normal

The Normal family is closed under addition. Sums of independent Normals stay Normal.
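
The linear-transformation property can be seen in one line by writing $X$ in terms of a standard Normal. This is a short sketch of the argument, assuming $a \neq 0$ and $Z \sim N(0, 1)$:

$$X = \mu + \sigma Z \quad\Longrightarrow\quad aX + b = (a\mu + b) + (a\sigma)\,Z$$

So $aX + b$ is again a shifted and scaled standard Normal, with mean $a\mu + b$ and variance $(a\sigma)^2 = a^2\sigma^2$. The shift $b$ moves the center but does not affect the spread.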

Explore what happens when you add two Normal distributions — adjust each curve's mean and spread, and watch the sum:

[Interactive figure: Sum of Normals. Sliders set the mean and spread of X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²); the plot overlays the densities of X, Y, and X + Y and reports μ₁ + μ₂, σ₁² + σ₂², and σ(X+Y).]

If X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²) are independent, then X + Y ~ N(μ₁+μ₂, σ₁²+σ₂²).

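Means add. For example, with X ~ N(0.0, 1.00) and Y ~ N(2.0, 2.25): μ₁ + μ₂ = 0.0 + 2.0 = 2.0. Variances add (not standard deviations): σ₁² + σ₂² = 1.00 + 2.25 = 3.25. The sum is wider than either distribution alone, but its standard deviation √3.25 ≈ 1.80 is smaller than σ₁ + σ₂ = 1.00 + 1.50 = 2.50.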
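
A quick simulation can serve as a sanity check on these numbers. The sketch below draws independent X ~ N(0.0, 1.00) and Y ~ N(2.0, 2.25) with Python's standard `random` module and compares the sample mean and variance of X + Y with the predicted 2.0 and 3.25. The sample size and seed are arbitrary choices for this example.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 100_000              # number of simulated (X, Y) pairs

mu1, sigma1 = 0.0, 1.0   # X ~ N(0.0, 1.00)
mu2, sigma2 = 2.0, 1.5   # Y ~ N(2.0, 2.25), so sigma2 = sqrt(2.25)

# X and Y are drawn separately, which makes them independent.
sums = [rng.gauss(mu1, sigma1) + rng.gauss(mu2, sigma2) for _ in range(n)]

sample_mean = statistics.fmean(sums)
sample_var = statistics.fmean((s - sample_mean) ** 2 for s in sums)

print(f"mean of X + Y: {sample_mean:.3f} (theory {mu1 + mu2})")
print(f"var  of X + Y: {sample_var:.3f} (theory {sigma1**2 + sigma2**2})")
print(f"SD   of X + Y: {math.sqrt(sample_var):.3f} (theory {math.sqrt(3.25):.3f})")
```

The sample values should land close to the theoretical ones, with small differences from run to run if the seed is changed.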
If X ~ N(10, 4) and Y ~ N(5, 9) are independent, what is Var(X + Y)? (whole number)
If X ~ N(100, 25), what distribution does 2X + 3 follow? Give the mean. (whole number)

Why the Normal?

The Central Limit Theorem (CLT) explains the ubiquity of the bell curve:

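The sum (or average) of many independent, identically distributed random variables with finite variance, once centered and scaled by its standard deviation, converges to a standard Normal distribution as the count grows, whatever the shape of the common source distribution. Extensions of the theorem relax the identical-distribution requirement. That's why the bell curve appears whenever many small random effects add up.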

Heights = genetics + nutrition + many small factors. Measurement error = many tiny instrument wobbles. Stock returns = many traders' decisions.
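
The convergence can be checked numerically. The sketch below is one possible setup, not the lesson's own widget: it assumes an Exponential(1) source (clearly skewed, with mean 1 and variance 1), sums 50 draws at a time, standardizes each sum, and measures how often the result falls within 1 and 2 standard deviations. If the CLT is doing its job, those fractions should be near 68% and 95%.

```python
import math
import random

rng = random.Random(1)   # fixed seed so the run is repeatable
n_terms = 50             # how many draws go into each sum
n_sums = 20_000          # how many sums we generate

# Exponential(1) has mean 1 and variance 1, so a sum S of n_terms draws
# has E[S] = n_terms and SD(S) = sqrt(n_terms).
mean_s = float(n_terms)
sd_s = math.sqrt(n_terms)

z_values = []
for _ in range(n_sums):
    s = sum(rng.expovariate(1.0) for _ in range(n_terms))
    z_values.append((s - mean_s) / sd_s)   # standardized sum

within_1 = sum(abs(z) <= 1 for z in z_values) / n_sums
within_2 = sum(abs(z) <= 2 for z in z_values) / n_sums

print(f"within ±1 SD: {100 * within_1:.1f}%")
print(f"within ±2 SD: {100 * within_2:.1f}%")
```

Lowering `n_terms` to 2 or 3 makes the skew of the source visible again, which shows that the bell shape comes from the adding, not from the source.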

See the CLT in action — pick any source distribution and watch the sample mean converge to a bell curve:

[Interactive figure: Central Limit Theorem. Choose a source distribution and generate sums S; the panel compares the theoretical E[S] and SD(S) with the observed mean of the generated sums.]

Summary

| Property | Formula |
| --- | --- |
| PDF | $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$ |
| Mean | $E[X] = \mu$ |
| Variance | $\text{Var}(X) = \sigma^2$ |
| Z-score | $Z = (X - \mu) / \sigma$ |
| 68-95-99.7 | ±1σ: 68%, ±2σ: 95%, ±3σ: 99.7% |
| Closure | Sum of independent Normals is Normal |

The CLT explains why the Normal distribution appears when many small independent effects with finite variance add up: it is the limiting shape of standardized sums and averages.

What's next

From the bell curve to the "waiting time" curve — the Exponential distribution, the continuous counterpart of the Geometric, with its own memoryless property.