Counting the uncountable

How many emails will you receive in the next hour? How many typos on a page? How many cars pass a checkpoint in 10 minutes?

What do you think?
A call center gets an average of 5 calls per minute. What's the probability of getting exactly 0 calls in a given minute?

From Binomial to Poisson

The Poisson distribution arises naturally from the Binomial. Imagine dividing time into tiny slices:

  1. In 1 minute, the probability of a call is high
  2. In each second (60 slices), the probability per slice is small
  3. In each millisecond (60,000 slices), the probability per slice is tiny

As we slice finer and finer, keeping $n \cdot p = \lambda$ constant, the $\text{Binomial}(n, p)$ converges to the $\text{Poisson}(\lambda)$.
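One way to see why this limit holds is the short sketch below. It assumes $p = \lambda / n$ and a fixed count $k$ while $n$ grows; the symbol $k$ (the number of events) is introduced here only for the derivation.

```latex
\begin{aligned}
\binom{n}{k} \left(\frac{\lambda}{n}\right)^{k} \left(1 - \frac{\lambda}{n}\right)^{n-k}
  &= \frac{\lambda^{k}}{k!} \cdot \frac{n(n-1)\cdots(n-k+1)}{n^{k}}
     \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \cdot \left(1 - \frac{\lambda}{n}\right)^{-k} \\
  &\xrightarrow{\; n \to \infty \;} \frac{\lambda^{k}}{k!} \cdot 1 \cdot e^{-\lambda} \cdot 1
   = \frac{e^{-\lambda} \lambda^{k}}{k!}.
\end{aligned}
```

The middle factor tends to 1 because it is a product of $k$ terms that each tend to 1, and $(1 - \lambda/n)^n \to e^{-\lambda}$ is the standard limit defining the exponential.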

[Interactive figure: Binomial → Poisson. A "slicing granularity" slider controls n. The plot compares Binom(10, 0.3) with Pois(3) for k = 0 to 12. At n = 10, p = 0.3: mean 3.000, variance 2.100, TV distance from Poisson 8.64%.]

The Poisson is the limit case: infinitely many trials, each with an infinitesimally small probability, summing to a finite rate $\lambda$.
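The convergence can also be checked numerically. The sketch below is one possible way to do it, not the code behind the interactive figure: it measures the total variation (TV) distance between $\text{Binom}(n, \lambda/n)$ and $\text{Pois}(\lambda)$ for growing $n$. The function names are illustrative, and the PMFs are computed in log space (an implementation choice) so that large $n$ does not overflow.

```python
from math import exp, lgamma, log, log1p


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binom(n, p), computed in log space; assumes 0 < p < 1."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + k * log(p) + (n - k) * log1p(-p))


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam), computed in log space; assumes lam > 0."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))


def tv_distance(n: int, lam: float) -> float:
    """Total variation distance between Binom(n, lam / n) and Pois(lam)."""
    p = lam / n
    # Both distributions put mass on k = 0..n.
    shared = sum(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))
    # Only the Poisson puts mass above n.
    poisson_tail = max(1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)), 0.0)
    return 0.5 * (shared + poisson_tail)


for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}  p = {3.0 / n:.6f}  TV = {tv_distance(n, 3.0):.4f}")
```

For n = 10 and λ = 3 this gives a TV distance of about 0.0864, matching the 8.64% shown in the figure, and the distance shrinks as n grows.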

The Poisson distribution

Poisson Distribution

A random variable $X$ follows a Poisson distribution with rate $\lambda > 0$ if:

$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$

We write $X \sim \text{Pois}(\lambda)$.

The PMF has a clear structure:

  • $e^{-\lambda}$ ensures the probabilities sum to 1
  • $\lambda^k$ grows with $k$ when $\lambda > 1$ (larger counts become more likely, up to a point)
  • $k!$ in the denominator eventually dominates (very high counts become rare)
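To make the three pieces concrete, here is a minimal sketch that evaluates the PMF directly from the formula. The rate `lam = 4.0` and the cutoff at k = 59 are arbitrary choices for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = e^(-lam) * lam^k / k! for k = 0, 1, 2, ..."""
    return exp(-lam) * lam**k / factorial(k)


lam = 4.0  # illustrative rate, chosen arbitrarily
pmf = [poisson_pmf(k, lam) for k in range(60)]  # mass beyond k = 59 is negligible here

print(f"sum of P(X = k) for k < 60: {sum(pmf):.6f}")
for k in range(10):
    print(f"k = {k}: {pmf[k]:.4f}")
```

The printed probabilities rise while $\lambda^k$ outpaces $k!$, peak near $k = \lambda$, then fall as the factorial takes over, and the total stays at 1 thanks to the $e^{-\lambda}$ factor.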

Explore how the Poisson PMF shape changes with λ — notice how mean always equals variance:

[Interactive figure: Poisson Distribution Explorer. Plots P(X = k) = e^{-λ} · λ^k / k! against k (number of events) for k = 0 to 14, with a slider for λ. At λ = 4: E[X] = 4.0, Var(X) = 4.0, SD(X) = 2.00, mode = 4. The Poisson signature: mean = variance = λ.]

Key properties

Poisson Mean and Variance

If $X \sim \text{Pois}(\lambda)$: $E[X] = \lambda, \qquad \text{Var}(X) = \lambda$

Mean equals variance — this is the signature of the Poisson. If your data has mean ≈ variance, a Poisson model may be appropriate. If variance ≫ mean, look elsewhere.
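As a rough illustration of this check, the sketch below computes the variance-to-mean ratio for two small data sets. Both data sets are hypothetical and invented for this example, and `dispersion_index` is an illustrative helper name.

```python
from statistics import mean, variance


def dispersion_index(counts: list[int]) -> float:
    """Sample variance divided by sample mean; near 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


# Hypothetical data, invented for illustration only.
typos_per_page = [2, 1, 3, 0, 2, 5, 1, 2, 3, 1, 1, 3]
visits_per_day = [0, 0, 12, 1, 0, 25, 0, 3, 0, 18, 0, 1]

for name, counts in [("typos_per_page", typos_per_page),
                     ("visits_per_day", visits_per_day)]:
    print(f"{name}: mean = {mean(counts):.2f}, "
          f"variance = {variance(counts):.2f}, "
          f"ratio = {dispersion_index(counts):.2f}")
```

The first data set has a ratio close to 1, so a Poisson model is plausible. The second has variance far above its mean, which suggests bursty behavior that a single constant rate does not capture. With only a dozen observations the ratio is noisy, so treat it as a first screen rather than a formal test.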

| Rate $\lambda$ | Use case |
|---|---|
| 0.5 | Earthquakes per year in a city |
| 2 | Typos per page |
| 5 | Calls per minute |
| 100 | Website hits per second |

Computing Poisson probabilities

If λ = 3, what is P(X = 0)? (decimal to 3 places, e.g. 0.456)
If λ = 3, what is P(X = 3)? (decimal to 3 places, e.g. 0.456)

When to use the Poisson

The Poisson is the right model when:

  1. Events occur independently: One email arriving doesn't affect the next
  2. Constant rate: The average rate doesn't change over time
  3. No simultaneous events: In a small enough time window, at most one event occurs
  4. Counting events in an interval: Time, space, or any "exposure" measure

Generate random events on a timeline at rate λ:

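[Interactive figure: Poisson Timeline. Generates random events on a timeline at rate λ = 3 and tracks the number of observations and the average count per interval.]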
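A similar experiment can be run in code. The sketch below is one possible simulation, not the widget's implementation: it reuses the slicing idea from earlier, splitting each unit interval into many small slices that each contain an event with probability λ / n_slices. The slice count, seed, and number of intervals are arbitrary choices.

```python
import random


def simulate_count(lam: float, n_slices: int, rng: random.Random) -> int:
    """Count events in one unit interval split into n_slices independent slices."""
    p = lam / n_slices  # small per-slice probability, so at most one event per slice
    return sum(rng.random() < p for _ in range(n_slices))


rng = random.Random(0)  # fixed seed so the run is repeatable
lam = 3.0
counts = [simulate_count(lam, n_slices=500, rng=rng) for _ in range(2000)]

avg = sum(counts) / len(counts)
var = sum((c - avg) ** 2 for c in counts) / (len(counts) - 1)
print(f"observations = {len(counts)}, avg count = {avg:.2f}, sample variance = {var:.2f}")
```

Each slice is independent, has the same probability, and holds at most one event, which mirrors conditions 1 to 3 above. Strictly speaking each count is Binomial(500, 0.006), so the average and the sample variance should both land near λ = 3 only approximately.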

Classic examples:

  • Number of accidents at an intersection per month
  • Number of mutations in a stretch of DNA
  • Number of photons hitting a detector per second
  • Number of customers entering a store per hour
What do you think?
A forest has an average of 2 lightning strikes per week. What's the probability of 0 strikes in a given week?
Enter a decimal to 3 places, e.g. 0.456

The Poisson approximation to the Binomial

Poisson Approximation

When $n$ is large, $p$ is small, and $\lambda = np$ is moderate: $\text{Binom}(n, p) \approx \text{Pois}(\lambda = np)$

This is incredibly useful for computation. Instead of calculating $\binom{10000}{3} \cdot (0.0003)^3 \cdot (0.9997)^{9997}$, just use $\text{Pois}(3)$.

Rule of thumb: The approximation works well when $n \geq 100$ and $p \leq 0.01$.
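The example with n = 10000 and p = 0.0003 can be checked directly. The following sketch evaluates both the exact Binomial expression and its Poisson stand-in; the function names are illustrative.

```python
from math import comb, exp, factorial


def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact Binomial probability: C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability: e^(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)


n, p, k = 10_000, 0.0003, 3
exact = binom_pmf(k, n, p)
approx = poisson_pmf(k, n * p)  # lambda = np = 3

print(f"exact Binomial : {exact:.6f}")
print(f"Poisson approx : {approx:.6f}")
print(f"absolute error : {abs(exact - approx):.2e}")
```

The two values agree closely, and the Poisson version needs only λ rather than a large binomial coefficient and a tiny power. Note that this direct `comb` formula is fine for small k, but for large n and k the integer coefficient can become too big to convert to a float.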

1000 people, each with P=0.002 of a rare disease. Approx. P(exactly 2 cases)? (decimal to 3 places, e.g. 0.456)

See the convergence in action — fix λ and increase n to watch the Binomial PMF morph into the Poisson:

[Interactive figure: Binomial → Poisson Convergence. Fix λ = np and increase n to watch Bin(n, λ/n) converge to Pois(λ). At n = 10, p = λ/n = 0.3, the plot compares Bin(n = 10, p = 0.3) with Pois(λ = 3) for k = 0 to 12, and the TV distance is 0.0864: the distributions differ noticeably until n is increased.]

Poisson sums

The sum of independent Poissons is Poisson.

Sum of Poissons

If $X \sim \text{Pois}(\lambda_1)$ and $Y \sim \text{Pois}(\lambda_2)$ are independent, then: $X + Y \sim \text{Pois}(\lambda_1 + \lambda_2)$

If Store A gets 3 customers/hour and Store B gets 5 customers/hour (independently), the combined total is Poisson with rate 8/hour.
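The store example can be verified numerically. This sketch computes the distribution of the combined count by summing over every way to split k customers between the two stores (a discrete convolution), then compares it with Pois(8). The function names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def pmf_of_sum(k: int, lam1: float, lam2: float) -> float:
    """P(X + Y = k) for independent X ~ Pois(lam1), Y ~ Pois(lam2)."""
    # X = j and Y = k - j for every possible split j = 0..k.
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))


rate_a, rate_b = 3.0, 5.0  # customers per hour at Store A and Store B
for k in range(0, 16, 3):
    by_convolution = pmf_of_sum(k, rate_a, rate_b)
    direct = poisson_pmf(k, rate_a + rate_b)
    print(f"k = {k:>2}: convolution = {by_convolution:.6f}, Pois(8) = {direct:.6f}")
```

The two columns match up to floating-point rounding. The multiplication inside `pmf_of_sum` is where independence is used: without it, the joint probability would not factor into the two PMFs.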

Poisson vs. other distributions

| Distribution | Question | Parameters |
|---|---|---|
| Binomial | How many successes in $n$ trials? | $n$, $p$ |
| Geometric | How many trials until first success? | $p$ |
| Poisson | How many events in an interval? | $\lambda$ |

The Poisson stands out because it models counts in continuous time or space, not a fixed number of discrete trials.

Summary

| Property | Formula |
|---|---|
| PMF | $P(X = k) = e^{-\lambda} \lambda^k / k!$ |
| Mean | $E[X] = \lambda$ |
| Variance | $\text{Var}(X) = \lambda$ |
| Key feature | Mean = Variance |
| Limit of | $\text{Binom}(n, p)$ as $n \to \infty$, $p \to 0$, $np = \lambda$ |
| Sum property | $\text{Pois}(\lambda_1) + \text{Pois}(\lambda_2) = \text{Pois}(\lambda_1 + \lambda_2)$ |

The Poisson is for rare events at a constant rate. Think: "how many times does something happen in a fixed window of time (or space)?"

Test your understanding

A website averages 10 visits/minute. What's P(exactly 10 visits in a minute)? (decimal to 3 places, e.g. 0.456)
Same website. What's the variance of visits per minute? (whole number)
Two independent Poisson streams with rates 4 and 7. Combined rate? (whole number)

What's next

We've now covered the main discrete distributions. Next, we cross into the continuous world — where outcomes are real numbers and probabilities become areas under curves.