You roll a fair die. A = {1, 2, 3, 4} and B = {3, 4, 5, 6}. What is P(A or B)?
Naive probability told us what probability is: P(A)=∣A∣/∣S∣ when outcomes are equally likely. But it didn't tell us how probabilities combine. What's P(not A)? What's P(A or B)?
Three axioms answer every such question.
The three axioms
In 1933, Andrey Kolmogorov wrote down three rules. Every fact about probability follows from these.
Kolmogorov's axioms
For a sample space S and any events A,B⊆S:
Non-negativity: P(A) ≥ 0
Normalization: P(S) = 1
Additivity: If A and B are disjoint (A∩B=∅), then P(A∪B)=P(A)+P(B)
That's it. Three rules. Adjust the die weights below and watch all three axioms respond in real time.
Kolmogorov's Axioms in Action
[Interactive demo: a die with adjustable weights, starting fair at 1/6 ≈ 0.167 for each face ⚀ to ⚅. The demo checks axiom 1 (P(A) ≥ 0 for every event A), axiom 2 (P(S) = 1; the weights sum to 1.000), and axiom 3 (disjoint events add: each outcome is a separate "slice").]
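The checks in the demo can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `weights` dictionary and the `prob` helper are hypothetical names, and exact fractions are used so that the sum-to-one check is not disturbed by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical weights for a six-sided die (here: a fair die).
# Any non-negative values that sum to 1 would do.
weights = {face: Fraction(1, 6) for face in range(1, 7)}
sample_space = frozenset(weights)

def prob(event, weights):
    """P(event) is the sum of the weights of the outcomes in the event."""
    return sum((weights[outcome] for outcome in event), Fraction(0))

# All 2^6 = 64 events, i.e. every subset of the sample space.
events = [
    frozenset(subset)
    for size in range(len(sample_space) + 1)
    for subset in combinations(sorted(sample_space), size)
]

# Axiom 1 (non-negativity): every event has probability >= 0.
axiom1 = all(prob(e, weights) >= 0 for e in events)

# Axiom 2 (normalization): the whole sample space has probability 1.
axiom2 = prob(sample_space, weights) == 1

# Axiom 3 (additivity): probabilities add for every disjoint pair of events.
axiom3 = all(
    prob(a | b, weights) == prob(a, weights) + prob(b, weights)
    for a in events
    for b in events
    if a.isdisjoint(b)
)

print(axiom1, axiom2, axiom3)  # True True True
```

Changing the values in `weights` (keeping them non-negative and summing to 1) plays the role of adjusting the die weights: all three checks should keep passing.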
What do you think?
Why does axiom 3 require A and B to be DISJOINT? What goes wrong if they overlap?
Set notation for events
Before using the axioms, we need vocabulary for combining events. Events are sets, so set operations apply.
Set operations on events
Union A∪B: outcomes in A or B (or both), i.e. "A or B happens"
Intersection A∩B: outcomes in both A and B, i.e. "A and B both happen"
Complement Aᶜ: outcomes not in A, i.e. "A does not happen"
Disjoint (mutually exclusive): A∩B=∅, i.e. A and B cannot both happen
Build each operation and see the result highlighted:
Set Operation Builder
[Interactive demo: with A = even faces {2, 4, 6} and B = low faces {1, 2, 3}, the union A ∪ B (outcomes in A or B, or both) is {1, 2, 3, 4, 6}, so P(A ∪ B) = 5/6 ≈ 0.833.]
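Because events are sets, Python's built-in `set` type mirrors these operations directly. The sketch below assumes "even faces" means {2, 4, 6} and "low faces" means {1, 2, 3}; `naive_prob` is a hypothetical helper that applies the equally-likely formula from the previous lesson.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one die roll
A = {2, 4, 6}            # even faces
B = {1, 2, 3}            # low faces (assumed to mean 1 through 3)

union = A | B                     # outcomes in A or B (or both)
intersection = A & B              # outcomes in both A and B
complement_a = S - A              # outcomes not in A
are_disjoint = A.isdisjoint(B)    # True only if A and B share no outcome

def naive_prob(event):
    """|event| / |S|, valid when all outcomes are equally likely."""
    return Fraction(len(event), len(S))

print(sorted(union), naive_prob(union))  # [1, 2, 3, 4, 6] 5/6
```

Here `are_disjoint` is `False`, because both events contain the outcome 2.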
Die roll: A = {1,2,3}, B = {2,3,4}. List A ∩ B. (e.g. {1,2})
The complement rule
The first payoff from the axioms. Since A and Aᶜ are disjoint and together cover all of S:
Deriving the complement rule
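Step 1: A ∪ Aᶜ = S. Every outcome is either in A or not in A. Together they cover everything.
Step 2: A ∩ Aᶜ = ∅. No outcome is both in A and not in A, so the two events are disjoint.
Step 3: P(A ∪ Aᶜ) = P(A) + P(Aᶜ), by additivity (axiom 3).
Step 4: P(A ∪ Aᶜ) = P(S) = 1, by step 1 and normalization (axiom 2).
Step 5: P(A) + P(Aᶜ) = 1, combining steps 3 and 4.
Step 6: P(Aᶜ) = 1 − P(A), subtracting P(A) from both sides.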
Complement rule
P(Aᶜ) = 1 − P(A)
The probability that A does not happen equals 1 minus the probability that it does.
Roll dice and watch P(A) + P(Aᶜ) = 1 emerge empirically:
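The same experiment can be sketched as a small simulation. The function name `estimate_complement_rule`, the event {1, 2}, the number of rolls, and the fixed seed are all assumptions chosen for illustration; the sketch uses only Python's standard `random` module.

```python
import random

def estimate_complement_rule(event, n_rolls=100_000, seed=0):
    """Roll a fair die n_rolls times and return the relative
    frequencies of the event and of its complement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.randint(1, 6) in event for _ in range(n_rolls))
    freq_a = hits / n_rolls
    freq_not_a = (n_rolls - hits) / n_rolls
    return freq_a, freq_not_a

freq_a, freq_not_a = estimate_complement_rule({1, 2})
print(freq_a, freq_not_a, freq_a + freq_not_a)
```

Every roll lands either in A or in Aᶜ, so the two relative frequencies sum to 1 (up to floating-point rounding) on any run, while `freq_a` itself only approaches P(A) = 1/3 as the number of rolls grows.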
The probability of rain tomorrow is 0.3. What's the probability of no rain? (decimal)
The complement rule creates a powerful problem-solving strategy: when "at least one" is hard to calculate directly, compute "none" and subtract from 1. You'll see this trick repeatedly in the birthday paradox and quality control lessons.
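As a small illustration of the "compute none, subtract from 1" strategy, the sketch below finds the probability of at least one six in two rolls of a fair die. The two-roll scenario is an example chosen here, not one from the lesson, and it relies only on counting equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair die twice.
rolls = list(product(range(1, 7), repeat=2))

# "None": outcomes that contain no six at all.
no_six = [roll for roll in rolls if 6 not in roll]
p_none = Fraction(len(no_six), len(rolls))   # 25/36

# Complement rule: P(at least one six) = 1 - P(no six).
p_at_least_one = 1 - p_none                  # 11/36

# Direct count of the same event, as a cross-check.
p_direct = Fraction(sum(1 for roll in rolls if 6 in roll), len(rolls))

print(p_none, p_at_least_one, p_direct)  # 25/36 11/36 11/36
```

With two rolls the direct count is still manageable, but the complement route stays just as short as the number of rolls grows.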
The addition rule
What about P(A∪B) when A and B overlap?
What do you think?
Die roll: A = {2,4,6} (even), B = {1,2,3} (low). P(A) = 3/6, P(B) = 3/6. What is P(A ∪ B)?
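Adding P(A) + P(B) = 3/6 + 3/6 = 1 counts the shared outcome 2 twice. The union A ∪ B = {1, 2, 3, 4, 6} leaves out 5, so the true answer is 5/6. The fix: subtract the overlap.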
The addition rule (inclusion-exclusion)
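Step 1: A∪B = (A∖B) ∪ (A∩B) ∪ (B∖A). Split the union into three disjoint pieces: A only, both, B only.
Step 2: P(A∪B) = P(A∖B) + P(A∩B) + P(B∖A), applying additivity (axiom 3) twice to the disjoint pieces.
Step 3: P(A) = P(A∖B) + P(A∩B) and P(B) = P(B∖A) + P(A∩B), by additivity again, since A = (A∖B) ∪ (A∩B) and B = (B∖A) ∪ (A∩B).
Step 4: P(A) + P(B) = P(A∪B) + P(A∩B), because the sum counts P(A∩B) twice. Rearranging gives P(A∪B) = P(A) + P(B) − P(A∩B).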
Addition rule (inclusion-exclusion)
P(A∪B)=P(A)+P(B)−P(A∩B)
When A∩B=∅ (disjoint): P(A∪B)=P(A)+P(B) — which is just axiom 3.
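The rule and its disjoint special case can be checked on die events with a short sketch. The helper `die_prob` and the disjoint pair C, D are hypothetical choices for illustration; A and B are the even and low faces used earlier in this section.

```python
from fractions import Fraction

FACES = frozenset(range(1, 7))

def die_prob(event):
    """Probability of an event for one roll of a fair die."""
    return Fraction(len(event), len(FACES))

# Overlapping events: even faces and low faces share the outcome 2.
A = frozenset({2, 4, 6})
B = frozenset({1, 2, 3})
direct = die_prob(A | B)
by_rule = die_prob(A) + die_prob(B) - die_prob(A & B)
print(direct, by_rule)  # 5/6 5/6

# Disjoint events: the overlap term is 0 and the rule reduces to axiom 3.
C = frozenset({1, 2})
D = frozenset({5, 6})
disjoint_direct = die_prob(C | D)
disjoint_sum = die_prob(C) + die_prob(D)
print(disjoint_direct, disjoint_sum)  # 2/3 2/3
```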
Drag the sliders to adjust the Venn diagram and watch the formulas update. Try making the events disjoint (set "A ∩ B" to 0) to see the special case.
P(A) = 0.4, P(B) = 0.5, P(A ∩ B) = 0.2. What is P(A ∪ B)? (decimal)
P(A) = 0.3, P(B) = 0.6, A and B are disjoint. What is P(A ∪ B)? (decimal)
P(A) = 0.7, P(B) = 0.5, P(A ∪ B) = 0.9. What is P(A ∩ B)? (decimal)
Consequences
These two rules — complement and addition — cascade into everything else.
What do you think?
Can you prove that P(∅) = 0 using the axioms?
Two more quick consequences:
Monotonicity: If A⊆B, then P(A)≤P(B). (B has everything A has, plus maybe more.)
Bounds: For any event A: 0≤P(A)≤1. (The lower bound is axiom 1; the upper bound is monotonicity with B = S, since A⊆S and P(S)=1.)
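Both consequences can be spot-checked by brute force on a small example. The sketch below assumes a hypothetical loaded die (face 1 has weight 1/2, the other faces 1/10 each) and tests every pair of events; it is a sanity check on one distribution, not a proof.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical loaded die: face 1 is heavily favored. Weights sum to 1.
loaded = {1: Fraction(1, 2)}
loaded.update({face: Fraction(1, 10) for face in range(2, 7)})

def loaded_prob(event):
    """P(event) under the loaded die: sum of the outcome weights."""
    return sum((loaded[outcome] for outcome in event), Fraction(0))

# Every subset of {1, ..., 6} is an event: 64 in total.
all_events = [
    frozenset(subset)
    for size in range(7)
    for subset in combinations(range(1, 7), size)
]

# Monotonicity: whenever a is a subset of b, P(a) <= P(b).
monotone = all(
    loaded_prob(a) <= loaded_prob(b)
    for a in all_events
    for b in all_events
    if a <= b  # for sets, <= is the subset test
)

# Bounds: every event has probability between 0 and 1.
bounded = all(0 <= loaded_prob(e) <= 1 for e in all_events)

print(monotone, bounded)  # True True
```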
If A ⊆ B, then B = A ∪ (B\A) with A and B\A disjoint. Use additivity to show P(A) ≤ P(B).
Test your understanding
P(A) = 0.6. What is P(Aᶜ)? (decimal)
P(A) = 0.5, P(B) = 0.4, P(A ∩ B) = 0.1. What is P(A ∪ B)? (decimal)
True or False: If P(A ∪ B) = P(A) + P(B), then A and B must be disjoint. (true or false)
Die roll: A = {1,2}, B = {3,4}, C = {5,6}. What is P(A) + P(B) + P(C)? (decimal or fraction)
A deck has 52 cards. P(red) = 26/52 = 1/2. P(face card) = 12/52. P(red face card) = 6/52. What is P(red or face card)? (decimal to 3 places)
What's next
We have the rules for combining probabilities. The next lesson gives us the tools to count outcomes fast: the multiplication rule turns "how many outfits?" into a simple product.