The Conditional Lens

What do you think?
A test for a rare disease is 99% accurate. You test positive. What's the probability you actually have the disease?

The setup

Here are the numbers:

  • The disease affects 1 in 1,000 people (0.1% prevalence)
  • The test is 99% accurate:
    • If you have the disease, it correctly says "positive" 99% of the time
    • If you're healthy, it correctly says "negative" 99% of the time

You take the test. It comes back positive.

What do you think?
Out of 100,000 people tested, about how many have the disease?
Enter a whole number

See it for yourself

Medical Test Simulator
[Interactive: sliders set the prevalence (0.1%–10%) and the test accuracy (50%–99.9%)]
True Positives: 99
False Positives: 999
P(Disease | Positive Test) = 9.0% (99 / 1,098 positive tests)
What do you think?
The test catches 99 of the 100 sick people. But it also flags 1% of the 99,900 healthy people as positive. How many healthy people test positive?
Enter a whole number

You test positive. You're either one of the 99 true positives or one of the 999 false positives.

Your Actual Risk
Step 1: \text{True positives} = 99. 99% of the 100 sick people test positive.
Step 2: \text{False positives} = 999. 1% of the 99,900 healthy people test positive.
Step 3: \text{Total positives} = 99 + 999 = 1098.
Step 4: P(\text{Disease} \mid \text{Positive}) = \frac{99}{1098} \approx 9\%.

This is the base rate fallacy. We ignore how rare something is and focus only on the test accuracy.
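The arithmetic above is easy to script. A minimal sketch of the count-based calculation (the function name and parameterization are mine, not from the lesson):

```python
def p_disease_given_positive(prevalence, sensitivity, specificity, n=100_000):
    """Count-based Bayes: of n people tested, how many positives are real?"""
    sick = n * prevalence                    # 100 people
    healthy = n - sick                       # 99,900 people
    true_pos = sick * sensitivity            # 99 correctly flagged
    false_pos = healthy * (1 - specificity)  # 999 incorrectly flagged
    return true_pos / (true_pos + false_pos)

print(p_disease_given_positive(0.001, 0.99, 0.99))  # ≈ 0.09, about 9%
```

Raising the prevalence or the accuracy in the call shows the same effect as the simulator's sliders.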

What is conditional probability?

When you learned the test was positive, you didn't change the world; you changed what you know about it. The sample space shrinks.

Conditional Probability

The conditional probability of A given B is: P(A|B) = \frac{P(A \cap B)}{P(B)}, provided P(B) > 0.

P(A|B) Explorer
[Interactive: Venn diagram of S, B, and A ∩ B, with sliders for P(B) and P(A ∩ B)]
With P(B) = 0.40 and P(A ∩ B) = 0.12:
P(A|B) = 0.12 / 0.40 = 0.30 (30.0%)
If P(A ∩ B) = 0.12 and P(B) = 0.4, what is P(A|B)? (decimal, e.g. 0.42)
If P(A|B) = 0.6 and P(B) = 0.5, what is P(A ∩ B)? (decimal, e.g. 0.42)
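The definition translates directly into code. A minimal sketch using the explorer's numbers (the `cond` helper is a name I've made up):

```python
def cond(p_a_and_b, p_b):
    """P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_a_and_b / p_b

# The explorer's example: P(A ∩ B) = 0.12, P(B) = 0.40.
print(round(cond(0.12, 0.40), 2))  # 0.3
```

The guard clause mirrors the "provided P(B) > 0" condition in the definition.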

The shrinking universe

Conditioning is like applying a filter. When you learn B happened, outcomes where B didn't happen become impossible.

The Filter

|A| = 54, |B| = 41, |A ∩ B| = 24

Notice: the denominator changes from "all outcomes" to "just outcomes in B." You're zooming in.
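The counting view can be checked in a few lines. A sketch, assuming a sample space of 100 equally likely outcomes (the filter above gives |A|, |B|, and |A ∩ B| but not |S|, so 100 is my assumption):

```python
# Conditioning with equally likely outcomes: shrink the universe to B.
# |A|, |B|, |A ∩ B| come from the filter; |S| = 100 is assumed for illustration.
size_S, size_A, size_B, size_A_and_B = 100, 54, 41, 24

p_A = size_A / size_S                 # before conditioning: 54 of 100 outcomes
p_A_given_B = size_A_and_B / size_B   # after conditioning: 24 of the 41 outcomes in B

print(p_A, round(p_A_given_B, 3))  # 0.54 0.585
```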

The multiplication rule

Deriving the Multiplication Rule
Step 1: P(A|B) = \frac{P(A \cap B)}{P(B)}. Start with the definition of conditional probability.
Step 2: Multiply both sides by P(B).
Step 3: P(A \cap B) = P(B) \cdot P(A|B). The joint probability, written as a product.

Read it as: "The probability that both A and B happen equals the probability B happens, times the probability A happens given B happened."
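One way to see the rule in action is to simulate and count. A sketch with an example of my own (a single die, where B = "roll is even" and A = "roll is greater than 3"):

```python
import random

random.seed(0)

# Check P(A ∩ B) = P(B) · P(A|B) by counting die rolls:
# B = "roll is even", A = "roll is greater than 3".
N = 100_000
rolls = [random.randint(1, 6) for _ in range(N)]

n_B = sum(r % 2 == 0 for r in rolls)
n_A_and_B = sum(r % 2 == 0 and r > 3 for r in rolls)

p_B = n_B / N                    # ≈ 1/2
p_A_given_B = n_A_and_B / n_B    # ≈ 2/3: among evens, only 4 and 6 exceed 3
p_A_and_B = n_A_and_B / N        # ≈ 1/3

# The product of the two factors equals the joint frequency by construction.
print(round(p_B * p_A_given_B, 4), round(p_A_and_B, 4))
```

Note that (n_B/N) · (n_A∩B/n_B) cancels to n_A∩B/N, so the identity holds exactly in the counts, not just in the limit.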

P(Rain) = 0.3 and P(Umbrella | Rain) = 0.9. What is P(Rain and Umbrella)? (decimal, e.g. 0.42)

Example: drawing cards

What's the probability of drawing two aces in a row from a deck (without replacement)?

What do you think?
You drew an ace. What's the probability the next card is also an ace?
Probability Tree
[Interactive: start with 52 cards, 4 aces; each branch shows the probability of the next draw]
Two Aces in a Row
Step 1: P(\text{first ace}) = \frac{4}{52}. 4 aces out of 52 cards.
Step 2: P(\text{second ace} \mid \text{first ace}) = \frac{3}{51}. One ace is gone, so 3 aces remain among 51 cards.
Step 3: P(\text{both aces}) = \frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221} \approx 0.45\%.

The multiplication rule extends to chains: P(A_1 \cap A_2 \cap A_3) = P(A_1) \cdot P(A_2 \mid A_1) \cdot P(A_3 \mid A_1 \cap A_2)

Chain Rule Builder
Draw aces one at a time, without replacement. Watch each conditional probability shrink.
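The chain can also be checked by brute force. A Monte Carlo sketch of my own (the lesson's builder does this interactively):

```python
import random

random.seed(1)

# Estimate P(two aces in a row) from a shuffled 52-card deck.
# Cards 0-3 stand in for the four aces. Exact answer: (4/52) * (3/51) = 1/221.
deck = list(range(52))
trials = 200_000
hits = 0
for _ in range(trials):
    random.shuffle(deck)
    if deck[0] < 4 and deck[1] < 4:
        hits += 1

print(hits / trials, 4 / 52 * 3 / 51)  # estimate vs exact (≈ 0.0045)
```

Shuffling and reading off the top two cards is "without replacement" for free; no conditional bookkeeping is needed in the simulation itself.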

Common mistakes

What do you think?
A spam filter knows that 80% of spam emails contain the word 'FREE'. An email contains 'FREE'. What probability does the filter need to make a good decision?
What do you think?
A prosecutor says: 'The probability of this DNA match if the defendant is innocent is 1 in a million.' What does the jury actually need to know?
P(A|B) vs P(B|A)
[Interactive: the same Venn diagram zoomed two ways]
Zoom into B: P(A|B) = 0.24
Zoom into A: P(B|A) = 0.30
They differ by 0.06: same numerator, different denominators.
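To make the asymmetry concrete in code, here is a sketch with numbers chosen to match the display above; the joint probability P(A ∩ B) = 0.072 is my assumption, back-solved from the two conditionals:

```python
# Same numerator, different denominators: one joint probability, two conditionals.
# P(A ∩ B) = 0.072 is an assumed value consistent with the display above.
p_A, p_B, p_A_and_B = 0.24, 0.30, 0.072

p_A_given_B = p_A_and_B / p_B  # zoom into B
p_B_given_A = p_A_and_B / p_A  # zoom into A

print(round(p_A_given_B, 2), round(p_B_given_A, 2))  # 0.24 0.3
```

Swapping which marginal sits in the denominator is the whole difference, which is exactly what the spam-filter and prosecutor questions turn on.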

Test your understanding

P(A) = 0.4, P(B) = 0.5, P(A ∩ B) = 0.2. What is P(A|B)? (decimal, e.g. 0.42)
P(A) = 0.4, P(B) = 0.5, P(A ∩ B) = 0.2. What is P(B|A)? (decimal, e.g. 0.42)
A bag has 3 red and 2 blue balls. You draw two without replacement. What's P(both red)? (decimal, e.g. 0.42)
True or False: P(A|B) is always equal to P(B|A) (true or false)

What's next

Rearranging the conditional probability definition gives you P(A ∩ B) = P(B) · P(A|B). This is the Multiplication Rule — the workhorse for computing joint probabilities, and the first step toward Bayes' Rule.