

DIY Metropolis-Hastings

This post walks through a simple implementation of the Metropolis-Hastings algorithm.

Background

If you haven’t read my previous post on pseudorandom number generation, be sure to check it out since we’ll be reusing the pseudorandom number generator discussed there.

Metropolis-Hastings is a Markov chain Monte Carlo technique to sample from distributions for which we don’t know how to perform direct sampling. In particular, we can use this algorithm to sample from a distribution without a known normalization constant.

For this post, we'll sample from a Gaussian distribution without needing to know the Gaussian normalization constant of $\frac{1}{\sigma\sqrt{2\pi}}$.

Here is the high-level summary of how we will accomplish this:

  1. Set the initial sample $x_0 = 0$ (this choice is arbitrary)
  2. Propose a new sample $x_{proposed}$
  3. Use $g(x)$, which is proportional to the target distribution $f(x)$, to probabilistically accept or reject $x_{proposed}$ over $x_0$
  4. Set $x_1$ equal to the value of $x_0$ or $x_{proposed}$ based on the outcome of step 3.
  5. Repeat steps 2-4 to determine the next sample, $x_2$, and continue repeating for $x_3, x_4, \ldots, x_N$.
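Before diving into the details, the five steps above can be sketched end-to-end. This is only a condensed sketch using Python's built-in `random` module as a stand-in; the post builds its own `propose_sample`, `score`, and `uniform` pieces below:

```python
import math
import random


def metropolis_hastings_sketch(n_samples: int) -> list:
    """A condensed sketch of steps 1-5, using the standard library's
    `random` module in place of the generators built later in the post."""
    x = 0.0  # step 1: arbitrary initial sample x_0
    samples = []
    for _ in range(n_samples):
        proposed = x + random.uniform(-0.5, 0.5)  # step 2: propose a new sample
        # steps 3-4: accept the proposal with probability min(1, g(proposed)/g(x)),
        # where g is the unnormalized standard Gaussian
        g = lambda v: math.exp(-v * v / 2)
        if random.random() <= g(proposed) / g(x):
            x = proposed
        samples.append(x)  # step 5: repeat for x_1 ... x_N
    return samples
```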

Sample Proposal

Of the five steps above, steps 2 and 3 (sample proposal and sample selection) are the least straightforward, and they are explained in this section and the next. In the sample proposal step, we generate a new sample which might be returned as the next output from the algorithm. In the sample selection step, we probabilistically return either the newly proposed sample or the sample returned in the previous iteration of the algorithm.

Sample proposal can be straightforward, and for this tutorial we'll use a very simple method. Given a previous sample value of $x_t$, the proposed sample will have a value of

$$x_{proposed} = x_t + \mathcal{U}_{[-\frac{1}{2},\frac{1}{2}]}$$

where $\mathcal{U}_{[-\frac{1}{2},\frac{1}{2}]}$ is a continuous uniform random variable sampled between $-\frac{1}{2}$ and $\frac{1}{2}$. For the first iteration of Metropolis-Hastings, $x_0$ will be an arbitrary value such as $0$, but it should ideally be a point where the target distribution has probability density $> 0$.

The equivalent Python code for sample proposal is simple:

def propose_sample(current_sample: float) -> float:
    return current_sample + uniform() - 0.5


proposed_sample = propose_sample(x0)

`uniform()` returns a continuous random variable with values between $0$ and $1$ ($\mathcal{U}_{[0,1]}$); the code behind it is covered in my previous post on pseudorandom number generation.
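If you don't have that generator handy, here is a minimal stand-in with the same interface. This is an assumed implementation (a simple linear congruential generator), not the one from the previous post:

```python
_state = 42  # arbitrary seed


def uniform() -> float:
    """Stand-in for the blog's uniform() PRNG: returns a float in [0, 1).

    NOTE: assumed implementation (a linear congruential generator with the
    Numerical Recipes constants), not the generator from the earlier post.
    """
    global _state
    _state = (_state * 1664525 + 1013904223) % 2 ** 32
    return _state / 2 ** 32
```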

Sample Selection

In step 3, sample selection, we probabilistically choose either $x_{proposed}$ or $x_t$ to be the next sample value $x_{t+1}$.

In order to perform sample selection, we need a function $g(x)$ which is proportional to our distribution of interest $f(x)$, i.e. $g(x) \propto f(x)$ such that:

$$\frac{f(a)}{f(b)} = \frac{c\,g(a)}{c\,g(b)} = \frac{g(a)}{g(b)}$$

Since the target distribution from which we'd like to sample is the Gaussian distribution $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, we'll choose $g(x) = e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$.
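To see that the normalization constant really does cancel, we can compare the two ratios numerically. A quick check, with `f` and `g` defined as above:

```python
import math


def f(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Full Gaussian density, normalization constant included."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def g(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Unnormalized version: f(x) = c * g(x) with c = 1 / (sigma * sqrt(2 * pi))."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


a, b = 0.3, 1.7
# The constant cancels in the ratio, so f and g give the same answer:
print(f(a) / f(b))
print(g(a) / g(b))
```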

To determine whether $x_{t+1}$ takes the value of $x_t$ or $x_{proposed}$, we sample a value $u$ between 0 and 1 from a continuous uniform distribution $\mathcal{U}_{[0,1]}$ and let $x_{t+1} = x_{proposed}$ if $u \leq \frac{g(x_{proposed})}{g(x_t)}$; otherwise $x_{t+1} = x_t$. Note that if the proposal scores at least as well as the current sample, the ratio is $\geq 1$ and the proposal is always accepted.

This logic can be seen in the following Python code:

import math
from typing import Tuple


def score(x: float, mu: float = 0, sigma: float = 1) -> float:
    norm_x = (x - mu) / sigma
    return math.exp(-(norm_x ** 2) / 2)


def get_next_sample(current_sample: float, current_sample_score: float) -> Tuple[float, float]:
    # Calculate proposed value for x_{t+1}
    proposed_sample = propose_sample(current_sample)
    proposed_sample_score = score(proposed_sample)

    # NOTE: This code was written with simplicity in mind, but there is no need to
    # sample from the uniform distribution if proposed_sample_score >= current_sample_score
    if uniform() <= (proposed_sample_score / current_sample_score):
        current_sample = proposed_sample
        current_sample_score = proposed_sample_score

    return current_sample, current_sample_score


# x0 is the arbitrary starting point
x0 = 0
x0_score = score(x0)

x1, x1_score = get_next_sample(x0, x0_score)
x2, x2_score = get_next_sample(x1, x1_score)

Visualization

Lastly, we’ll implement a stateful class to generate samples:

class MetropolisHastings:
    def __init__(self, x0: float = 0) -> None:
        self.sample = x0
        self.sample_score = score(self.sample)

    def __call__(self) -> float:
        self.sample, self.sample_score = get_next_sample(self.sample, self.sample_score)
        return self.sample

Now we can generate a set of samples and make sure the distribution looks as expected.

from typing import Callable, Iterator

from tqdm import tqdm


def gen_samples(f: Callable[[], float]) -> Iterator[float]:
    for _ in tqdm(range(1000000)):
        yield f()

metropolis = MetropolisHastings()
gaussian_samples = list(gen_samples(metropolis))
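Beyond eyeballing a plot, a quick numeric sanity check is to compare the empirical mean and standard deviation against the target's $\mu = 0$ and $\sigma = 1$. The `summarize` helper below is my own addition; `demo_samples` is stand-in data so the snippet runs on its own, and in the post you would pass `gaussian_samples` instead:

```python
import math
import random


def summarize(samples) -> tuple:
    """Return the mean and (population) standard deviation of a sample list."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    return mean, math.sqrt(variance)


# Stand-in data so this snippet is self-contained; substitute the
# gaussian_samples list generated above when following the post.
demo_samples = [random.gauss(0, 1) for _ in range(100_000)]
mean, std = summarize(demo_samples)
print(f"mean={mean:.3f} (target 0), std={std:.3f} (target 1)")
```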

Sure enough, our Gaussian samples look pretty good.

(Figure: distribution of the generated samples)

That’s all for this post! If you have any feedback, please feel free to reach out.

Published on 3 March 2022

If you enjoy this blog, please consider supporting me by buying one of my books. πŸ“š