The urn problem

Imagine an urn with 50 balls: 20 red and 30 blue. You draw 10 balls.

What do you think?
Does it matter whether you put each ball back before drawing the next?

Two sampling strategies

With Replacement (Binomial):

  • Each draw is independent
  • Probability stays constant: $p = 20/50 = 0.4$
  • Number of reds $\sim \text{Bin}(10, 0.4)$

Without Replacement (Hypergeometric):

  • Each draw depends on previous draws
  • After drawing a red, the probability of red changes
  • This is the Hypergeometric distribution
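
The two strategies can be tried out with a short simulation. This is a minimal sketch using only Python's standard library; the function name `count_reds`, the seed, and the number of trials are illustrative choices, not part of the lesson.

```python
import random

def count_reds(n_draws, with_replacement, rng):
    """Draw n_draws balls from the 20-red / 30-blue urn and count the reds."""
    urn = ["red"] * 20 + ["blue"] * 30
    if with_replacement:
        # Each ball goes back, so every draw sees the same 50 balls.
        drawn = rng.choices(urn, k=n_draws)
    else:
        # Balls stay out, so each draw changes what is left in the urn.
        drawn = rng.sample(urn, k=n_draws)
    return drawn.count("red")

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
trials = 20_000         # arbitrary number of repetitions
with_repl = [count_reds(10, True, rng) for _ in range(trials)]
without_repl = [count_reds(10, False, rng) for _ in range(trials)]

print("mean with replacement:   ", sum(with_repl) / trials)
print("mean without replacement:", sum(without_repl) / trials)
```

Drawing all 50 balls without replacement always yields exactly 20 reds, whereas 50 draws with replacement can yield anything from 0 to 50. That is the dependence between draws at its most extreme.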

The hypergeometric distribution

Hypergeometric Distribution

Drawing $n$ items without replacement from a population of $N$ items containing $K$ successes:

$$P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$$

We write $X \sim \text{HGeom}(N, K, n)$.

The formula counts:

  • $\binom{K}{k}$: Ways to choose $k$ successes from the $K$ available (using combinations)
  • $\binom{N-K}{n-k}$: Ways to choose the remaining $n-k$ items from the $N-K$ failures
  • $\binom{N}{n}$: Total ways to choose $n$ items
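
The three counts translate directly into code. The sketch below evaluates the formula for the urn (N = 50, K = 20, n = 10) with `math.comb` from the standard library; the helper name `hypergeom_pmf` is an illustrative choice.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when drawing n items without replacement from N items, K of them successes."""
    if k < 0 or k > n:
        return 0.0
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes (for example k > K) get probability 0.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Distribution of the number of reds in 10 draws from the 20-red / 30-blue urn.
pmf = [hypergeom_pmf(k, 50, 20, 10) for k in range(11)]
for k, prob in enumerate(pmf):
    print(f"P(X = {k:2d}) = {prob:.4f}")
print("total:", sum(pmf))
```

The probabilities sum to 1 (up to floating-point rounding), and the most likely outcome is 4 reds.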

See how the three combinatorial pieces build the probability:

[Interactive widget: Formula Breakdown. Example population of 5 defective and 95 good items, showing P(X=k) = C(K,k) × C(N-K,n-k) / C(N,n).]

Explore the difference

[Interactive widget: The Urn Simulator. Urn of 20 red + 30 blue = 50 balls. Two histograms of the number of reds in 10 draws (0 to 10): With Replacement (Binomial) and Without Replacement (Hypergeometric).]

Notice: the centers (means) are the same, but the without-replacement distribution has less spread.

Once you've drawn many reds, fewer reds are left, so the remaining draws are less likely to be red. The draws are therefore negatively correlated, which pulls the total count toward the mean.
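
A quick worked check with the 20-red, 30-blue urn makes the negative correlation concrete. Without replacement, the chance that the second ball is red depends on the first:

$$P(\text{2nd red} \mid \text{1st red}) = \frac{19}{49} \approx 0.388, \qquad P(\text{2nd red} \mid \text{1st blue}) = \frac{20}{49} \approx 0.408$$

With replacement, both conditional probabilities stay at $20/50 = 0.4$. A red first draw lowers the chance of a red second draw, and a blue first draw raises it.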

Mean: the same!

Expected Value

For both Binomial and Hypergeometric: $E[X] = n \cdot \frac{K}{N} = np$, where $p = K/N$ is the proportion of successes.
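
One way to see why the mean does not depend on replacement is a short indicator-variable argument. The indicators $I_j$ are notation introduced here for illustration: $I_j = 1$ if the $j$-th draw is a success and $0$ otherwise.

$$X = I_1 + I_2 + \cdots + I_n, \qquad P(I_j = 1) = \frac{K}{N} \text{ for every } j$$

$$E[X] = \sum_{j=1}^{n} E[I_j] = \sum_{j=1}^{n} \frac{K}{N} = n \cdot \frac{K}{N} = np$$

By symmetry, every position in the sample is equally likely to hold any given item, so each draw has success probability $K/N$ on its own, even without replacement. Linearity of expectation does not require independence, so the same sum applies in both cases.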

Urn: 50 balls, 20 red. Draw 10 WITH replacement. E[reds] = ? (whole number)
Same urn, draw 10 WITHOUT replacement. E[reds] = ? (whole number)

Variance: here's the difference

Variance Comparison

Binomial: $\text{Var}(X) = np(1-p)$

Hypergeometric: $\text{Var}(X) = np(1-p) \cdot \frac{N-n}{N-1}$

The factor $\frac{N-n}{N-1}$ is called the Finite Population Correction (FPC).
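
The hypergeometric variance formula can be checked numerically by summing over the pmf. This is a minimal sketch; the helper name `hypergeom_variance_from_pmf` and the parameter values N = 100, K = 40, n = 10 are illustrative assumptions.

```python
from math import comb

def hypergeom_variance_from_pmf(N, K, n):
    """Variance of HGeom(N, K, n), summed directly from the pmf."""
    pmf = [comb(K, k) * comb(N - K, n - k) / comb(N, n) for k in range(n + 1)]
    mean = sum(k * prob for k, prob in enumerate(pmf))
    return sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))

N, K, n = 100, 40, 10  # illustrative values, so p = K/N = 0.4
p = K / N
binomial_var = n * p * (1 - p)
fpc = (N - n) / (N - 1)

print(f"binomial variance:     {binomial_var:.4f}")
print(f"FPC:                   {fpc:.4f}")
print(f"np(1-p) * FPC:         {binomial_var * fpc:.4f}")
print(f"variance from the pmf: {hypergeom_variance_from_pmf(N, K, n):.4f}")
```

The last two printed values agree, which is what the closed-form expression predicts.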

[Interactive widget: Variance Comparison, Binomial vs. Hypergeometric. Shown for N = 100, n = 10: Mean (both) = 4.0, Var (Bin) = 2.40, Var (Hyp) = 2.18, FPC = (100-10)/(100-1) = 0.909.]

Binomial Var for n=10, p=0.4: np(1-p) = ? (decimal to 1 place, e.g. 3.7)
Hypergeometric Var with N=50? Multiply by (50-10)/(50-1). (decimal to 2 places, e.g. 0.53)

The key takeaway

| Distribution | Mean | Variance |
|---|---|---|
| Binomial | $np$ | $np(1-p)$ |
| Hypergeometric | $np$ | $np(1-p) \times \text{FPC}$ |

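The FPC is at most 1, and strictly less than 1 whenever $n > 1$, so the hypergeometric variance is never larger than the binomial variance. The two coincide only in edge cases such as $n = 1$.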

Rule of thumb: If your sample is less than 5% of the population ($n < 0.05N$), the Binomial is a good approximation.
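
To get a feel for the rule of thumb, the sketch below holds n = 10 and p = 0.4 fixed and grows the population, printing the largest gap between the two pmfs. The helper names, the gap measure, and the population sizes are illustrative choices, not a formal error bound.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k, N, K, n):
    """P(X = k) for X ~ HGeom(N, K, n)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_pmf_gap(N, n=10, p=0.4):
    """Largest absolute difference between the two pmfs over k = 0..n."""
    K = round(p * N)  # number of successes in the population
    return max(
        abs(binom_pmf(k, n, K / N) - hypergeom_pmf(k, N, K, n))
        for k in range(n + 1)
    )

for N in (50, 100, 200, 1000, 10000):
    print(f"N = {N:6d}  n/N = {10 / N:.3f}  max gap = {max_pmf_gap(N):.5f}")
```

The gap shrinks as the sampling fraction n/N falls; N = 200 is where the sample is exactly 5% of the population.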

What's next

In the next lesson, we'll explore when to use each distribution, work through practical examples like card games and quality control, and understand the FPC in more depth.