The independence question
You're at a casino. The roulette wheel has landed on red five times in a row. Is black "due"? No: the wheel has no memory, and each spin is unaffected by the last.
Our brains are wired to see patterns, even where none exist. Independence is the mathematical way to say "knowing one thing tells you nothing about another."
Defining independence
We've seen independence for events: $A$ and $B$ are independent if $P(A \cap B) = P(A)\,P(B)$.
For random variables, we extend this idea:
Random variables $X$ and $Y$ are independent if, for all values $x$ and $y$:

$$P(X = x,\, Y = y) = P(X = x)\,P(Y = y)$$

Equivalently, knowing the value of $X$ doesn't change the distribution of $Y$.
In words: the joint probability factors into the product of individual probabilities.
What independence means
When $X$ and $Y$ are independent:
- No information transfer: Observing $X$ tells you nothing about $Y$
- Conditional equals unconditional: $P(Y = y \mid X = x) = P(Y = y)$
- The joint PMF factors: $p_{X,Y}(x, y) = p_X(x)\,p_Y(y)$
Independence means you can analyze $X$ and $Y$ separately. Their stories don't interact.
Example: two dice
Roll two fair dice. Let $X$ = first die, $Y$ = second die.
The physical separation of the dice makes them independent. Verifying mathematically:

$$P(X = x,\, Y = y) = \frac{1}{36} = \frac{1}{6} \cdot \frac{1}{6} = P(X = x)\,P(Y = y)$$

This factorization works for every pair of values $(x, y)$. That's what makes them independent.
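One way to verify the factorization exhaustively is a short enumeration in Python. This is a sketch, not anything from the text: the dictionary-based PMFs are an illustrative representation, and exact fractions avoid floating-point comparison issues.

```python
from fractions import Fraction

# Joint PMF of two fair dice: each of the 36 ordered pairs has probability 1/36.
joint = {(x, y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

# Marginal PMFs: sum the joint over the other variable.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(1, 7)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(1, 7)}

# Independence check: the joint equals the product of marginals
# for every pair (x, y).
assert all(joint[(x, y)] == p_x[x] * p_y[y]
           for x in range(1, 7) for y in range(1, 7))
```

If any single pair failed this check, the dice would not be independent; the definition demands the factorization for every pair.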
Calculation shortcuts
Independence gives us shortcuts that simplify calculations.
Rule 1: expectations multiply
If $X$ and $Y$ are independent:

$$E[XY] = E[X]\,E[Y]$$

For dependent variables, computing $E[XY]$ would require knowing the joint distribution. For independent variables, we just multiply the means.
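The two-dice example makes the rule concrete. A quick exact calculation (a sketch, again using `Fraction` for exact arithmetic):

```python
from fractions import Fraction

# Two independent fair dice: exact expectations by enumeration.
faces = range(1, 7)
p = Fraction(1, 6)

e_x = sum(x * p for x in faces)   # E[X] = 7/2
e_y = sum(y * p for y in faces)   # E[Y] = 7/2

# E[XY] over the joint distribution: each pair (x, y) has probability 1/36.
e_xy = sum(x * y * p * p for x in faces for y in faces)

assert e_xy == e_x * e_y   # both sides equal 49/4
```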
Rule 2: variances add
If $X$ and $Y$ are independent:

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
This is why errors from independent sources add in quadrature. Uncertainties don't compound as badly as you might fear.
For dependent variables, we'd need a covariance term: $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$.
Independence means $\mathrm{Cov}(X, Y) = 0$, so the extra term vanishes.
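We can confirm the variance rule exactly for two dice, and contrast it with the most dependent case of all, $Y = X$. The helper functions below are illustrative, not part of any library:

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)

def mean(pmf):
    return sum(v * q for v, q in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * q for v, q in pmf.items())

die = {v: p for v in faces}

# PMF of the sum of two independent dice, built from the joint distribution.
sum_pmf = {}
for x in faces:
    for y in faces:
        sum_pmf[x + y] = sum_pmf.get(x + y, Fraction(0)) + p * p

# Independent case: variances add.
assert var(sum_pmf) == 2 * var(die)

# Dependent extreme (Y = X, so X + Y = 2X): Var(2X) = 4 Var(X),
# matching Var(X) + Var(X) + 2 Cov(X, X), since Cov(X, X) = Var(X).
double_pmf = {2 * v: p for v in faces}
assert var(double_pmf) == 4 * var(die)
```

The dependent case doubles the variance you'd get from independence, which is exactly what the covariance term predicts.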
IID: the gold standard
In statistics, we often assume data points are "IID":
Random variables $X_1, X_2, \ldots, X_n$ are IID (independent and identically distributed) if:
- They are mutually independent
- They all have the same distribution
Examples:
- Coin flips are IID Bernoulli
- Repeated measurements (done carefully) are IID
- Random samples from a large population are approximately IID
Most statistical procedures assume IID data. When this assumption fails, results can be misleading.
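A minimal simulation shows why the IID assumption is so useful: with independent draws from the same distribution, the sample mean is a sensible estimate of the underlying parameter. The Bernoulli parameter and sample size here are arbitrary choices for illustration:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Simulate n IID Bernoulli(0.3) flips: independent draws, same distribution.
n = 100_000
p = 0.3
flips = [1 if random.random() < p else 0 for _ in range(n)]

# Under the IID assumption the sample mean estimates p.
sample_mean = sum(flips) / n
assert abs(sample_mean - p) < 0.01
```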
Independence vs. correlation
$X$ and $Y$ are uncorrelated if $\mathrm{Cov}(X, Y) = 0$, or equivalently $E[XY] = E[X]\,E[Y]$.
Key relationships:
- Independent ⟹ Uncorrelated (always true)
- Uncorrelated ⟹ Independent (NOT always true!)
Counterexample: Let $X$ be uniform on $\{-1, 0, 1\}$ and $Y = X^2$.
They're uncorrelated ($E[XY] = E[X^3] = 0 = E[X]\,E[Y]$), but knowing $X$ completely determines $Y$!
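The counterexample is small enough to check by exact enumeration. A sketch, assuming the uniform-on-$\{-1, 0, 1\}$ setup above:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}; Y = X^2. Each value of X has probability 1/3.
p = Fraction(1, 3)
support = [-1, 0, 1]

e_x = sum(x * p for x in support)             # E[X]  = 0
e_y = sum(x * x * p for x in support)         # E[Y]  = E[X^2] = 2/3
e_xy = sum(x * (x * x) * p for x in support)  # E[XY] = E[X^3] = 0

# Uncorrelated: E[XY] = E[X] E[Y] = 0.
assert e_xy == e_x * e_y == 0

# But not independent: P(Y = 1 | X = 1) = 1, while P(Y = 1) = 2/3.
p_y_is_1 = sum(p for x in support if x * x == 1)
assert p_y_is_1 == Fraction(2, 3)
```

The conditional probability differing from the unconditional one is precisely the failure of independence, even though the covariance is exactly zero.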
Why independence matters
Simplifies calculations
- Joint distributions factor
- Expectations multiply
- Variances add
Enables statistical inference
- The Law of Large Numbers needs independence
- The Central Limit Theorem needs independence
- Most confidence intervals assume independent samples
Real-world consequences
Assuming independence when it's false leads to underestimating risk. The 2008 financial crisis was partly due to assuming mortgage defaults were independent (they weren't).
Summary
| For Independent RVs | Formula |
|---|---|
| Joint PMF | $p_{X,Y}(x, y) = p_X(x)\,p_Y(y)$ |
| Expected product | $E[XY] = E[X]\,E[Y]$ |
| Variance of sum | $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ |
| Covariance | $\mathrm{Cov}(X, Y) = 0$ |
Independence is about information. $X$ and $Y$ are independent when learning the value of $X$ leaves your beliefs about $Y$ unchanged.
What's next
You now have the core concepts of random variables: PMFs, CDFs, distributions, and independence. These form the foundation for continuous distributions, the Law of Large Numbers, and the Central Limit Theorem.