The vote count

An election just ended. Candidate A got 7 votes. Candidate B got 3 votes. A wins, obviously.

But here's the question competition organizers love: if you shuffle the ballots and count them one at a time, what's the probability A is ahead after every single ballot?

A must lead at every step of the count, not just at the end.

What do you think?
With 7 votes for A and 3 for B, what's the probability A leads at every step of the count?

Simulate it. Shuffle the ballots and watch A's lead over time:

Ballot Problem
Theory: P(A always leads) = (a - b) / (a + b) = (7 - 3) / 10 = 40.0%

After many trials, the fraction where A leads at every step converges to (ab)/(a+b)(a - b) / (a + b).

A random walk in disguise

Each ballot counted is a step. An A-ballot moves the tally up by 1. A B-ballot moves it down by 1. The count traces a random walk starting at 0, ending at aba - b.

Random Walk
0steppos
Position: 0

The ballot problem asks: among all the walks from 0 to aba - b using aa up-steps and bb down-steps, what fraction never touch or cross zero?

The Ballot Theorem (Bertrand, 1887)

If candidate A receives aa votes and candidate B receives bb votes with a>ba > b, the probability that A is strictly ahead of B throughout the counting is:

P(A always leads)=aba+bP(\text{A always leads}) = \frac{a - b}{a + b}

What do you think?
A wins 100 to 1. What's the probability A led the entire count?
decimal like 0.98

Catalan numbers: counting valid paths

Zoom out from the ballot problem for a moment. Consider paths on a grid: you take nn steps right-and-up and nn steps right-and-down, for 2n2n steps total, starting and ending at height 0. How many of these paths never dip below the x-axis?

These are called Dyck paths, and the answer involves one of the most famous sequences in combinatorics.

Catalan Number

The nnth Catalan number counts the number of paths from (0,0)(0,0) to (2n,0)(2n, 0) using up-steps (+1)(+1) and down-steps (1)(-1) that never go below zero:

Cn=1n+1(2nn)C_n = \frac{1}{n+1}\binom{2n}{n}

The first few values: C0=1,C1=1,C2=2,C3=5,C4=14,C5=42C_0 = 1, C_1 = 1, C_2 = 2, C_3 = 5, C_4 = 14, C_5 = 42.

Generate random paths and see how many stay non-negative:

Catalan Paths
C(4) = 14 valid paths out of C(8,4) = 70 total = 20.0%

The fraction of valid paths matches Cn/(2nn)=1/(n+1)C_n / \binom{2n}{n} = 1/(n+1). But how do you prove that?

The reflection principle

Here's the argument.

Take any "bad" path, one that dips below zero. Find the first point where it hits 1-1. Now reflect everything after that point across the line y=1y = -1. Every up-step becomes a down-step and vice versa.

Reflection Principle

Every "bad" path (solid) that dips below zero corresponds to a unique reflected path (dashed) ending at -2. This bijection counts the bad paths exactly.

0-1
Original Reflected

This reflection creates a bijection (a one-to-one correspondence) between:

  • Bad paths from (0,0)(0,0) to (2n,0)(2n, 0) that touch y=1y = -1
  • All paths from (0,0)(0,0) to (2n,2)(2n, -2)
Counting with Reflection
Total paths: (2nn)\text{Total paths: } \binom{2n}{n}
Choose which n of the 2n steps go up. The rest go down.
Step 1 of 5

The reflection principle turns a hard counting problem ("paths that avoid y=1y = -1") into an easy one ("paths ending at a different point"). This is a bijection, a story proof in disguise.

Where these ideas appear

The ballot problem and Catalan numbers come up everywhere in competitions and programming:

Math competitions:

  • IMO 2001 Problem 6 used lattice path arguments
  • Putnam regularly features ballot-type counting problems
  • USAMO/AIME problems on counting paths in grids

Programming contests:

  • Codeforces/AtCoder problems on counting valid bracket sequences (Catalan numbers count these too)
  • LeetCode #22 "Generate Parentheses" is the Catalan number in disguise
  • Google Code Jam has included random walk probability problems

The Catalan number counts all of these:

ObjectConnection
Valid bracket sequences of length 2n2nOpen = up, close = down
Binary trees with nn nodesLeft child = up, right child = down
Triangulations of an (n+2)(n+2)-gonEach diagonal = a step
Stack-sortable permutations of nnPush = up, pop = down
What do you think?
How many ways can you arrange 4 opening and 4 closing parentheses into a valid expression like ((())())?
whole number
Candidate A gets 10 votes, B gets 6. P(A always leads)? (decimal like 0.25)
1/3

The takeaway

Random walks reveal structure in sequential processes. The ballot problem's (ab)/(a+b)(a-b)/(a+b) formula, the reflection principle's bijection trick, and Catalan numbers' ubiquity all stem from one idea: counting lattice paths with constraints. When a competition problem involves sequential choices that must satisfy some running condition, think walks, think reflections, think Catalan.