The crowded room
A classroom has 30 students. What's the probability that at least two of them share a birthday?
At just 23 people, the probability passes 50%. By 30 people, it's over 70%. By 50 people, it's about 97%.
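You can check these numbers empirically. Here is a minimal simulation sketch, assuming birthdays are uniform over 365 days and independent (no leap years, no twins). The function names, trial count, and seed are illustrative choices, not part of any library.

```python
import random

def has_shared_birthday(group_size, days=365, rng=random):
    """Return True if at least two people in one random group share a birthday."""
    seen = set()
    for _ in range(group_size):
        day = rng.randrange(days)  # assumption: every day is equally likely
        if day in seen:
            return True
        seen.add(day)
    return False

def estimate_match_probability(group_size, trials=20_000, days=365, seed=0):
    """Fraction of simulated groups that contain at least one shared birthday."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = sum(has_shared_birthday(group_size, days, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for n in (10, 23, 30, 50):
        print(n, round(estimate_match_probability(n), 3))
```

With enough trials, the estimates should settle close to the percentages quoted above; with fewer trials, expect more noise.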
Why your intuition fails
You might reason: "With 365 days, I need about half (maybe 183 people) for a 50% chance." But that answers the wrong question.
With 23 people, you're not checking 23 possibilities. You're checking pairs. That's 253 chances for a match.
The number of pairs grows much faster than the number of people. Doubling the people roughly quadruples the pairs.
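A quick way to see this is to count the pairs directly. The sketch below uses Python's standard `math.comb`; the helper name `pair_count` is just an illustrative label.

```python
from math import comb

def pair_count(people):
    """Number of distinct pairs that can be formed from `people` individuals."""
    return comb(people, 2)  # same as people * (people - 1) // 2

for people in (23, 46, 92):
    print(people, pair_count(people))
# 23 253
# 46 1035
# 92 4186
```

Each doubling of the group multiplies the pair count by a little more than four.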
The math: counting "no collision"
Rather than calculate every way two people could match, flip the question: what's the probability nobody shares a birthday?
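Under the usual assumption of 365 equally likely, independent birthdays, the first person can have any birthday, the second must avoid 1 day, the third must avoid 2, and so on. For a group of $n$ people:

$$P(\text{no match}) = \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{365 - n + 1}{365}, \qquad P(\text{match}) = 1 - P(\text{no match})$$

A short sketch of that product in code (the function name and the `days` parameter are illustrative):

```python
def match_probability(people, days=365):
    """Exact probability that at least two of `people` share a birthday,
    assuming `days` equally likely, independent birthdays."""
    if people > days:
        return 1.0  # pigeonhole: more people than days forces a match
    p_no_match = 1.0
    for k in range(people):
        # person k+1 must avoid the k birthdays already taken
        p_no_match *= (days - k) / days
    return 1.0 - p_no_match

for n in (22, 23, 30, 50):
    print(n, round(match_probability(n), 4))
```

Running it shows the probability crossing 50% between 22 and 23 people.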
The square root law
The collision threshold follows a simple pattern:
For $N$ possible values (buckets, birthdays, hash outputs), you need roughly $1.18\sqrt{N}$ items for a 50% collision probability.
For 365 days: $1.18\sqrt{365} \approx 22.5$, so 23 people.
In cryptography, exploiting this is called a birthday attack, and it's why a hash function's collision resistance is only about half its output length in bits.
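Where does the square root come from? Here is a sketch of the standard approximation, writing $N$ for the number of possible values and $n$ for the number of items, and using $1 - x \approx e^{-x}$ for small $x$ (so it assumes $n$ is much smaller than $N$):

$$P(\text{no collision}) = \prod_{k=0}^{n-1}\left(1 - \frac{k}{N}\right) \approx \prod_{k=0}^{n-1} e^{-k/N} = e^{-n(n-1)/(2N)} \approx e^{-n^2/(2N)}$$

Setting this equal to $\tfrac{1}{2}$ and solving for $n$:

$$e^{-n^2/(2N)} = \frac{1}{2} \quad\Longrightarrow\quad n \approx \sqrt{2 \ln 2}\,\sqrt{N} \approx 1.18\sqrt{N}$$

The exponent contains $n^2$ because it is counting pairs, which is why the threshold scales with $\sqrt{N}$ rather than $N$.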
Why this breaks security
Consider a 32-bit hash function. It has $2^{32} \approx 4.3$ billion possible outputs. Sounds safe?
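You can try it. The sketch below is a toy birthday search: it takes SHA-256 from Python's standard `hashlib`, keeps only the first 32 bits as a stand-in for a hypothetical 32-bit hash, and hashes distinct inputs until two of them land on the same value. The helper names and the choice of counter strings as inputs are assumptions made for illustration.

```python
import hashlib

def truncated_hash(data: bytes, bits: int = 32) -> int:
    """First `bits` bits of SHA-256, standing in for a small hash (illustration only)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int = 32):
    """Hash distinct inputs until two of them share a truncated hash value."""
    seen = {}  # truncated hash value -> input that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        h = truncated_hash(message, bits)
        if h in seen:
            # two different inputs, same truncated hash
            return seen[h], message, counter + 1
        seen[h] = message
        counter += 1

if __name__ == "__main__":
    first, second, attempts = find_collision(32)
    print(f"{first!r} and {second!r} collide after {attempts} hashes")
```

The search typically ends after a number of hashes on the order of $2^{16}$, tens of thousands rather than billions, which takes a fraction of a second on an ordinary laptop.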
Collision thresholds by system
| System | Possible Values | 50% Collision At |
|---|---|---|
| Birthdays | 365 | 23 people |
| 16-bit hash | 65,536 | 302 items |
| 32-bit hash | 4.3 billion | ~77,000 items |
| 64-bit hash | 2⁶⁴ | ~5 billion items |
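This is part of why MD5 (128-bit) is unsafe for security. Even a generic birthday attack needs "only" about $2^{64}$ attempts to find a collision, which is computationally feasible with enough resources. In practice, structural weaknesses in MD5 let attackers find collisions far faster than that.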
Modern cryptographic hashes use 256 bits or more. With $2^{256}$ buckets, you'd need about $2^{128}$ items for a 50% collision. That's beyond any conceivable computation.
The general formula
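For a target collision probability $p$, with $N$ possible values and $n$ items:

$$n \approx \sqrt{2N \ln\frac{1}{1-p}}$$

For $p = 0.5$:

$$n \approx \sqrt{2 \ln 2}\,\sqrt{N} \approx 1.18\sqrt{N}$$

For $p = 0.99$:

$$n \approx \sqrt{2 \ln 100}\,\sqrt{N} \approx 3.03\sqrt{N}$$

Even for near-certain collisions, you only need about $3\sqrt{N}$ items. The square root dominates.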
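To get a threshold for any system and any target probability, here is a small sketch built on the standard approximation $n \approx \sqrt{2N \ln\frac{1}{1-p}}$. The function name is illustrative, and because the formula is an approximation, the result can be off by one or so from the exact answer for small $N$.

```python
from math import ceil, log, sqrt

def items_for_collision(num_values, p=0.5):
    """Approximate number of items at which the collision probability reaches p."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return ceil(sqrt(2 * num_values * log(1 / (1 - p))))

for label, num_values in [("Birthdays", 365), ("16-bit hash", 2**16),
                          ("32-bit hash", 2**32), ("64-bit hash", 2**64)]:
    print(f"{label}: {items_for_collision(num_values):,} items for 50%")
```

Passing a different `p`, such as `items_for_collision(2**32, p=0.99)`, shows how little the threshold moves even when you demand a near-certain collision.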
The deeper pattern
The birthday paradox is really about pairs.
When you have $n$ items, you have $\binom{n}{2} = \frac{n(n-1)}{2}$ pairs. That grows as $n^2$, not $n$.
This quadratic growth explains why "unlikely" collisions happen so quickly. You're getting about $n^2/2$ chances for a match, not just $n$.
Any time you ask "do two things match?", you're counting pairs. And pairs grow quadratically.