The urn problem
Imagine an urn with 50 balls: 20 red and 30 blue. You draw 10 balls.
Two sampling strategies
With Replacement (Binomial):
- Each draw is independent
- Probability stays constant: $p = 20/50 = 0.4$
- Number of reds: $X \sim \text{Binomial}(10, 0.4)$
Without Replacement (Hypergeometric):
- Each draw depends on previous draws
- After drawing a red, the probability of red on the next draw changes (from 20/50 to 19/49)
- This is the Hypergeometric distribution
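The two strategies can be simulated directly. The following is a minimal sketch using only Python's standard library; the seed, the number of trials, and the helper names are illustrative choices, not part of the lesson.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so repeated runs give the same numbers

urn = ["red"] * 20 + ["blue"] * 30   # the 50-ball urn from the text
n_draws = 10
n_trials = 20_000                    # illustrative number of repetitions

def reds_with_replacement():
    # random.choices samples WITH replacement: every draw sees the full urn
    return random.choices(urn, k=n_draws).count("red")

def reds_without_replacement():
    # random.sample samples WITHOUT replacement: a drawn ball is gone
    return random.sample(urn, k=n_draws).count("red")

with_repl = [reds_with_replacement() for _ in range(n_trials)]
without_repl = [reds_without_replacement() for _ in range(n_trials)]

print("average reds, with replacement:   ", mean(with_repl))
print("average reds, without replacement:", mean(without_repl))
```

Both averages should land close to 4, since 40% of the balls are red in either scheme.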
The hypergeometric distribution
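Drawing $n$ items without replacement from a population of $N$ items containing $K$ successes, the probability of getting exactly $k$ successes is:

$$P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$$

We write $X \sim \text{Hypergeometric}(N, K, n)$.
The formula counts:
- $\binom{K}{k}$: Ways to choose $k$ successes from the $K$ available (using combinations)
- $\binom{N-K}{n-k}$: Ways to choose the remaining $n-k$ items from the $N-K$ failures
- $\binom{N}{n}$: Total ways to choose $n$ items from $N$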
Together, the three combinatorial pieces form a ratio: the number of favorable selections divided by the total number of selections.
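As a concrete check, the sketch below evaluates the three combinatorial pieces for the urn from the start of the lesson. The symbol names are assumptions made for this example: `N` is the population size (50), `K` the number of successes (20 reds), `n` the number of draws (10), and `k` the number of reds observed. The function name `hypergeom_pmf` is likewise just a label chosen here.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement,
    from N items of which K are successes."""
    ways_successes = comb(K, k)          # choose k of the K successes
    ways_failures = comb(N - K, n - k)   # choose the other n-k from the N-K failures
    ways_total = comb(N, n)              # choose any n of the N items
    return ways_successes * ways_failures / ways_total

# Urn from the text: 50 balls, 20 red, 10 draws
N, K, n = 50, 20, 10
pmf = [hypergeom_pmf(k, N, K, n) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P(X = {k:2d}) = {prob:.4f}")
print("total probability:", sum(pmf))
```

The probabilities over $k = 0, \dots, 10$ sum to 1 (up to floating-point rounding), which is a quick sanity check that the three counts fit together.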
Explore the difference
[Interactive plot with two panels: "With Replacement (Binomial)" and "Without Replacement (Hypergeometric)"]
Notice: the centers (means) are the same, but sampling without replacement gives less spread.
Why? Each red you draw leaves fewer reds in the urn, so later draws are less likely to be red. The draws are negatively correlated, which pulls the count of reds toward the mean and reduces the spread.
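The negative correlation can be verified with exact fractions for the first two draws from the urn. This is a small sketch under the lesson's numbers (50 balls, 20 red); the variable names are illustrative.

```python
from fractions import Fraction

N, K = 50, 20                                           # 50 balls, 20 red

p_first_red = Fraction(K, N)                            # 20/50
p_second_red_given_first_red = Fraction(K - 1, N - 1)   # 19/49: one red is gone
p_both_red = p_first_red * p_second_red_given_first_red

# By symmetry the second draw is red with probability K/N overall, so for the
# red-indicators I1 and I2: Cov(I1, I2) = P(both red) - (K/N)^2.
cov = p_both_red - p_first_red ** 2

print("P(red on draw 1)            =", p_first_red)
print("P(red on draw 2 | red on 1) =", p_second_red_given_first_red)
print("Cov(draw 1 red, draw 2 red) =", cov)
```

The covariance comes out negative, so a red on one draw makes a red on another draw slightly less likely. With replacement, the same calculation gives exactly zero.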
Mean: the same!
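For both Binomial and Hypergeometric: $E[X] = np$, where $n$ is the number of draws and $p = K/N$ is the proportion of successes in the population. For the urn, that is $10 \times 0.4 = 4$ reds on average.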
Variance: here's the difference
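Binomial: $\text{Var}(X) = np(1-p)$
Hypergeometric: $\text{Var}(X) = np(1-p) \cdot \frac{N-n}{N-1}$
The factor $\frac{N-n}{N-1}$, where $N$ is the population size and $n$ the sample size, is called the Finite Population Correction (FPC).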
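To make the formulas concrete, here is a short sketch that plugs in the urn numbers and cross-checks the result against the hypergeometric probabilities computed directly. The names `N`, `K`, `n` (population size, number of reds, number of draws) are notation assumed for this example.

```python
from math import comb

N, K, n = 50, 20, 10          # urn from the text
p = K / N                     # proportion of reds

binom_var = n * p * (1 - p)   # variance with replacement
fpc = (N - n) / (N - 1)       # finite population correction
hyper_var = binom_var * fpc   # variance without replacement

# Cross-check: mean and variance computed from the hypergeometric PMF itself
pmf = [comb(K, k) * comb(N - K, n - k) / comb(N, n) for k in range(n + 1)]
mean_from_pmf = sum(k * q for k, q in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * q for k, q in enumerate(pmf))

print(f"mean (formula np):          {n * p:.4f}")
print(f"mean (from PMF):            {mean_from_pmf:.4f}")
print(f"binomial variance:          {binom_var:.4f}")
print(f"FPC:                        {fpc:.4f}")
print(f"hypergeometric variance:    {hyper_var:.4f}")
print(f"hypergeometric var (PMF):   {var_from_pmf:.4f}")
```

For this urn the FPC is $40/49 \approx 0.816$, so drawing without replacement shrinks the variance from 2.4 to roughly 1.96.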
The key takeaway
| Distribution | Mean | Variance |
|---|---|---|
| Binomial | $np$ | $np(1-p)$ |
| Hypergeometric | $np$ | $np(1-p) \cdot \frac{N-n}{N-1}$ |
The FPC is always ≤ 1, and strictly less than 1 whenever you draw more than one item, so the hypergeometric variance is never larger than the binomial variance.
Rule of thumb: If your sample is less than 5% of the population ($n/N < 0.05$), the Binomial is a good approximation.
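One way to see the rule of thumb at work is to hold the sample size and the proportion of reds fixed and grow the population. The sketch below is illustrative only: the population sizes and the "largest gap between the two PMFs" measure are choices made for this example, not a standard criterion.

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) for n independent draws with success probability p
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k, N, K, n):
    # P(X = k) for n draws without replacement from N items, K of them successes
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_pmf_gap(N, K, n):
    # Largest pointwise difference between the two PMFs over k = 0..n
    p = K / N
    return max(
        abs(binom_pmf(k, n, p) - hypergeom_pmf(k, N, K, n))
        for k in range(n + 1)
    )

n = 10                                   # sample size stays fixed
for N in (50, 200, 1000):                # sample is 20%, 5%, 1% of the population
    K = N * 2 // 5                       # keep 40% reds, as in the urn
    print(f"N = {N:4d}  n/N = {n / N:.2%}  max PMF gap = {max_pmf_gap(N, K, n):.5f}")
```

The gap shrinks as the sampling fraction $n/N$ drops, which is why the Binomial becomes a reasonable stand-in for small samples from large populations.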
What's next
In the next lesson, we'll explore when to use each distribution, work through practical examples like card games and quality control, and understand the FPC in more depth.