The independence question

You're at a casino. The roulette wheel has landed on red five times in a row.

What do you think?
Is black more likely on the next spin?

Our brains are wired to see patterns, even where none exist. Independence is the mathematical way to say "knowing one thing tells you nothing about another."

Defining independence

We've seen independence for events: A and B are independent if P(A ∩ B) = P(A)P(B).

For random variables, we extend this idea:

Independent Random Variables

Random variables X and Y are independent if, for all values x and y: P(X = x, Y = y) = P(X = x) · P(Y = y)

Equivalently, knowing X doesn't change the distribution of Y.

In words: the joint probability factors into the product of individual probabilities.

What independence means

When X and Y are independent:

  1. No information transfer: Observing X tells you nothing about Y
  2. Conditional equals unconditional: P(Y = y | X = x) = P(Y = y)
  3. The joint PMF factors: p_{X,Y}(x, y) = p_X(x) · p_Y(y)

Independence means you can analyze X and Y separately. Their stories don't interact.

Example: two dice

Roll two fair dice. Let X = first die, Y = second die.

The physical separation of the dice makes them independent. Verifying mathematically:

What do you think?
Check: P(X=3, Y=5) should equal P(X=3) × P(Y=5). Does it?

Indeed, P(X=3, Y=5) = 1/36 = (1/6)(1/6) = P(X=3) · P(Y=5). This factorization works for every pair of values. That's what makes them independent.
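To make "every pair of values" concrete, here's a quick sketch in plain Python using exact fractions: build the joint PMF of two fair dice, derive the marginals, and check the factorization for all 36 pairs.

```python
from fractions import Fraction

# Joint PMF of two fair dice: each of the 36 outcomes has probability 1/36.
joint = {(x, y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

# Marginal PMFs: sum the joint over the other variable.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(1, 7)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(1, 7)}

# Independence requires the joint to factor for EVERY pair of values.
assert all(joint[(x, y)] == p_x[x] * p_y[y]
           for x in range(1, 7) for y in range(1, 7))

print(joint[(3, 5)], "=", p_x[3], "*", p_y[5])  # prints: 1/36 = 1/6 * 1/6
```

Using `Fraction` keeps the check exact, so a passing assertion really does verify the factorization rather than floating-point near-equality.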

Calculation shortcuts

Independence gives us shortcuts that simplify calculations.

Rule 1: expectations multiply

Product of Independent RVs

If X and Y are independent: E[XY] = E[X] · E[Y]

For dependent variables, E[XY]E[XY] would require knowing the joint distribution. For independent variables, we just multiply the means.

If X and Y are independent with E[X]=3 and E[Y]=4, what is E[XY]? (whole number)
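The rule can be checked directly on the two-dice example, where the joint distribution is small enough to enumerate: E[X] = E[Y] = 3.5, and E[XY] computed from the joint matches 3.5 × 3.5.

```python
from itertools import product

# Two independent fair dice: compute E[XY] straight from the joint
# distribution and compare with the product of the means.
values = range(1, 7)
e_x = sum(values) / 6                                    # 3.5
e_xy = sum(x * y for x, y in product(values, values)) / 36

assert e_xy == e_x * e_x
print(e_xy)  # prints: 12.25
```

For dependent variables this shortcut fails, and the full sum over the joint PMF is unavoidable.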

Rule 2: variances add

Variance of Sum

If X and Y are independent: Var(X + Y) = Var(X) + Var(Y)

This is why independent errors add in quadrature: variances add, so standard deviations combine as √(σ_X² + σ_Y²) rather than σ_X + σ_Y. Uncertainties don't compound as badly as you might fear.

[Interactive: Variance of Sums — with Var(X) = 4, Var(Y) = 9, and ρ = 0, the covariance term 2·Cov(X, Y) is 0, so Var(X + Y) = 4 + 9 = 13: variances simply add for independent variables]
If Var(X)=9 and Var(Y)=16, and they're independent, what is Var(X+Y)? (whole number)

For dependent variables, we'd need a covariance term: Var(X + Y) = Var(X) + Var(Y) + 2·Cov(X, Y).

Independence means Cov(X, Y) = 0, so the extra term vanishes.
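The same dice make a clean test case here too: each die has variance 35/12, so the sum should have variance 35/6. A short exact computation confirms it.

```python
from fractions import Fraction
from itertools import product

values = range(1, 7)
mean = Fraction(sum(values), 6)                       # 7/2
var_x = sum((Fraction(v) - mean) ** 2 for v in values) / 6   # 35/12

# Distribution of the sum of two independent dice.
mean_sum = 2 * mean                                   # 7
var_sum = sum((Fraction(x + y) - mean_sum) ** 2
              for x, y in product(values, values)) / 36

assert var_sum == 2 * var_x   # Var(X+Y) = Var(X) + Var(Y) = 35/6
print(var_sum)                # prints: 35/6
```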

IID: the gold standard

In statistics, we often assume data points are "IID":

IID Random Variables

Random variables are IID (independent and identically distributed) if:

  1. They are mutually independent
  2. They all have the same distribution

Examples:

  • Coin flips are IID Bernoulli
  • Repeated measurements (done carefully) are IID
  • Random samples from a large population are approximately IID
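As a minimal sketch of the first example, here is how one might simulate IID Bernoulli(0.5) coin flips with the standard library: each flip is drawn independently from the same distribution, and the sample mean settles near 0.5.

```python
import random

random.seed(42)  # arbitrary seed, chosen for reproducibility

# IID Bernoulli(0.5): every flip is independent and identically distributed.
flips = [random.random() < 0.5 for _ in range(10_000)]

print(sum(flips) / len(flips))  # sample mean, close to 0.5
```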

Most statistical procedures assume IID data. When this assumption fails, results can be misleading.

Independence vs. correlation

Uncorrelated

X and Y are uncorrelated if Cov(X, Y) = 0.

Key relationships:

  • Independent ⟹ Uncorrelated (always true)
  • Uncorrelated ⟹ Independent (NOT always true!)
[Scatterplot: Independence vs Dependence — no pattern in the X–Y cloud when X and Y are independent]

Counterexample: Let X ~ Uniform(−1, 1) and Y = X².

They're uncorrelated: Cov(X, Y) = E[XY] − E[X]E[Y] = E[X³] − 0 = 0 by symmetry. Yet knowing X completely determines Y!
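A quick simulation sketch makes the counterexample tangible: the sample covariance hovers near zero even though Y is a deterministic function of X.

```python
import random

random.seed(0)  # arbitrary seed, chosen for reproducibility

# X ~ Uniform(-1, 1), Y = X^2: perfectly dependent, yet uncorrelated.
xs = [random.uniform(-1, 1) for _ in range(200_000)]
ys = [x * x for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
cov = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(xs, ys)) / len(xs)

print(cov)  # close to zero, despite Y being determined by X
```

Covariance only detects *linear* association; the parabola Y = X² is invisible to it.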

Why independence matters

Simplifies calculations

  • Joint distributions factor
  • Expectations multiply
  • Variances add

Enables statistical inference

Sampling distributions, confidence intervals, and hypothesis tests are all derived under independence assumptions.

Real-world consequences

Assuming independence when it's false leads to underestimating risk. The 2008 financial crisis was partly due to assuming mortgage defaults were independent (they weren't).

Summary

For Independent RVs | Formula
Joint PMF           | p(x, y) = p_X(x) · p_Y(y)
Expected product    | E[XY] = E[X] · E[Y]
Variance of sum     | Var(X + Y) = Var(X) + Var(Y)
Covariance          | Cov(X, Y) = 0

Independence is about information. X and Y are independent when learning X leaves your beliefs about Y unchanged.

Test your understanding

X and Y independent, E[X]=2, E[Y]=5. What is E[XY]? (whole number)
Same X,Y. Var(X)=3, Var(Y)=4. What is Var(X+Y)? (whole number)
True/False: Cov(X,Y)=0 implies X,Y independent.

What's next

You now have the core concepts of random variables: PMFs, CDFs, distributions, and independence. These form the foundation for continuous distributions, the Law of Large Numbers, and the Central Limit Theorem.