The robot explorer
You want to sample from a complex distribution, but you can't invert the CDF or use any standard method. What if you sent a "robot" to wander the target density, spending more time in high-probability regions?
The algorithm
To sample from a target distribution $\pi(x) \propto f(x)$, where $f$ is known but the normalizing constant is not:
- Start at some $x_0$.
- Propose a move: $x^* \sim q(x^* \mid x_t)$.
- Accept with probability $\alpha = \min\left(1, \dfrac{f(x^*)\, q(x_t \mid x^*)}{f(x_t)\, q(x^* \mid x_t)}\right)$.
- If accepted, $x_{t+1} = x^*$. Otherwise, $x_{t+1} = x_t$.
- Repeat.
You don't need to know the normalizing constant. If $f(x) = c\,\pi(x)$, the constant $c$ cancels in the ratio $f(x^*)/f(x_t)$, which is what makes MCMC practical for Bayesian inference.
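The steps above can be sketched in a few lines of Python. This is a minimal 1-D sketch, assuming a Gaussian random-walk proposal for $q$ (the names `metropolis_hastings`, `q_sample`, and `q_density` are illustrative, not from any library); the target is a standard normal deliberately left unnormalized:

```python
import math
import random

def metropolis_hastings(f, q_sample, q_density, x0, n_samples):
    """Metropolis-Hastings for a target f known only up to a constant."""
    x = x0
    samples = []
    for _ in range(n_samples):
        x_star = q_sample(x)                     # propose x* ~ q(. | x_t)
        # Acceptance ratio: f(x*) q(x_t | x*) / (f(x_t) q(x* | x_t)).
        ratio = (f(x_star) * q_density(x, x_star)) / (f(x) * q_density(x_star, x))
        if random.random() < min(1.0, ratio):    # accept with probability alpha
            x = x_star                           # move; otherwise stay put
        samples.append(x)                        # rejected moves repeat x_t
    return samples

# Target: a standard normal, deliberately missing its 1/sqrt(2 pi) constant.
f = lambda x: math.exp(-0.5 * x * x)
# Gaussian random-walk proposal; q_density(y, x) is the density of proposing y
# from x, itself also only needed up to a constant.
q_sample = lambda x: random.gauss(x, 1.0)
q_density = lambda y, x: math.exp(-0.5 * (y - x) ** 2)

random.seed(0)
samples = metropolis_hastings(f, q_sample, q_density, x0=0.0, n_samples=20000)
kept = samples[5000:]                            # discard burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(mean, var)                                 # should be close to 0 and 1
```

Note that the chain's samples are correlated, so 15,000 kept draws carry fewer than 15,000 independent samples' worth of information.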
Watch the robot explore
The robot wanders the target density. Its trail of dots forms the target distribution. Notice how it spends more time on peaks and less time in valleys:
Why does it work?
The acceptance probability is carefully designed so that the resulting Markov chain satisfies detailed balance with respect to $\pi$:

$$\pi(x)\, P(x \to x') = \pi(x')\, P(x' \to x)$$
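One way to see this concretely: on a small discrete state space, the Metropolis kernel can be written as a transition matrix and detailed balance checked directly. A toy sketch (the five weights are arbitrary, chosen only for illustration):

```python
import numpy as np

# Unnormalized target weights on 5 states; pi is the normalized version.
f = np.array([1.0, 4.0, 2.0, 8.0, 1.0])
pi = f / f.sum()
n = len(f)

# Symmetric proposal: jump to one of the other states uniformly at random.
q = (np.ones((n, n)) - np.eye(n)) / (n - 1)

# Metropolis kernel: P[i, j] = q[i, j] * min(1, f[j] / f[i]) for j != i,
# with all rejected probability mass kept on the diagonal.
P = q * np.minimum(1.0, f[None, :] / f[:, None])
np.fill_diagonal(P, 1.0 - P.sum(axis=1))

# Detailed balance: pi_i P_ij == pi_j P_ji for every pair of states.
flow = pi[:, None] * P
assert np.allclose(flow, flow.T)

# Detailed balance implies stationarity: pi P == pi.
assert np.allclose(pi @ P, pi)
print("detailed balance holds")
```

Summing the detailed-balance identity over $x$ gives $\sum_x \pi(x) P(x \to x') = \pi(x')$, which is exactly the statement that $\pi$ is stationary.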
Symmetric proposals
When the proposal distribution is symmetric ($q(x^* \mid x_t) = q(x_t \mid x^*)$, e.g., a random walk), the acceptance probability simplifies to:

$$\alpha = \min\left(1, \frac{f(x^*)}{f(x_t)}\right)$$
This is the original Metropolis algorithm (1953).
Burn-in: The first several hundred (or thousand) samples are biased by the starting point. Discard them before using the samples for inference.
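A quick illustration of why burn-in matters. This sketch assumes an unnormalized standard-normal target; the start at $x_0 = 20$ is deliberately far from the mode:

```python
import math
import random

def metropolis(f, x0, n_samples, step=1.0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    x, samples = x0, []
    for _ in range(n_samples):
        x_star = x + random.gauss(0.0, step)
        if random.random() < min(1.0, f(x_star) / f(x)):
            x = x_star
        samples.append(x)
    return samples

random.seed(0)
f = lambda x: math.exp(-0.5 * x * x)              # unnormalized N(0, 1)
chain = metropolis(f, x0=20.0, n_samples=10000)   # deliberately bad start

early = sum(chain[:100]) / 100                    # still drifting in from x0
late = sum(chain[2000:]) / len(chain[2000:])
print(early, late)   # early mean is biased upward; late mean is near 0
```

The first samples trace the chain's walk from $x_0 = 20$ down to the mode, so their average says more about the starting point than about the target.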
Tuning the proposal
| Proposal Step Size | Acceptance Rate | Exploration |
|---|---|---|
| Too small | Very high (~95%) | Slow — takes forever to explore |
| Too large | Very low (~5%) | Gets stuck — most proposals rejected |
| Just right | ~25-50% | Efficient exploration |
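The tradeoffs in the table are easy to reproduce. A sketch comparing three step sizes on an unnormalized standard-normal target (the particular values 0.05, 2.5, and 25 are arbitrary picks for "too small", "about right", and "too large"):

```python
import math
import random

def acceptance_rate(step, n=20000):
    """Fraction of accepted proposals for random-walk Metropolis on N(0, 1)."""
    f = lambda x: math.exp(-0.5 * x * x)   # target, up to a constant
    x, accepted = 0.0, 0
    for _ in range(n):
        x_star = x + random.gauss(0.0, step)
        if random.random() < min(1.0, f(x_star) / f(x)):
            x, accepted = x_star, accepted + 1
    return accepted / n

random.seed(0)
rates = {step: acceptance_rate(step) for step in (0.05, 2.5, 25.0)}
for step, rate in rates.items():
    print(f"step={step:>5}: acceptance ~ {rate:.0%}")
```

Tiny steps are almost always accepted but barely move the chain; huge steps usually land in low-density regions and get rejected, so the chain sits still.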
Summary
| Concept | Key Idea |
|---|---|
| Goal | Sample from $\pi(x) \propto f(x)$ without knowing the normalizing constant |
| Proposal | Suggest $x^* \sim q(x^* \mid x_t)$ |
| Accept/reject | $\alpha = \min\left(1, \frac{f(x^*)\, q(x_t \mid x^*)}{f(x_t)\, q(x^* \mid x_t)}\right)$ |
| Why it works | Detailed balance → $\pi$ is the stationary distribution |
| Burn-in | Discard initial samples |
MCMC turns a sampling problem into a simulation problem. You don't need to solve any equations — just run the chain long enough and collect the dots.
What's next
Metropolis-Hastings updates all coordinates at once. Gibbs sampling updates one coordinate at a time — often much more efficient for multivariate distributions.