DIY Metropolis-Hastings
This post walks through a simple implementation of the Metropolis-Hastings algorithm.
Background
If you haven't read my previous post on pseudorandom number generation, be sure to check it out since we'll be reusing the pseudorandom number generator discussed there.
Metropolis-Hastings is a Markov chain Monte Carlo technique to sample from distributions for which we don't know how to perform direct sampling. In particular, we can use this algorithm to sample from a distribution without a known normalization constant.
For this post, we'll sample from a Gaussian distribution without needing to know the Gaussian normalization constant of $\frac{1}{\sigma \sqrt{2\pi}}$.
Here is the high-level summary of how we will accomplish this:
1. Set initial sample $x_0$ (this is arbitrary)
2. Propose new sample $x'$
3. Use $f(x)$, which is proportional to the target distribution $P(x)$, to probabilistically accept or reject $x'$ over $x_0$
4. Set $x_1$ equal to the value of $x'$ or $x_0$ based on the outcome of step 3.
5. Repeat steps 2-4 to determine the next sample, $x_2$, and continue repeating for $x_3, x_4, \ldots$ (a compact sketch of this loop follows the list).
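Before walking through steps 2 and 3 in detail, here is a minimal self-contained sketch of this loop for a standard normal target. The naming is mine, and Python's built-in random.random() stands in for the uniform() generator from the previous post; the rest of the post builds the same pieces properly.

import math
import random

def metropolis_hastings_sketch(n_samples: int, x0: float = 0.0) -> list:
    def score(x: float) -> float:
        # Unnormalized standard normal density, i.e. step 3's f(x)
        return math.exp(-x ** 2 / 2)

    samples, x = [], x0                        # step 1: arbitrary starting point
    for _ in range(n_samples):
        x_new = x + random.random() - 0.5      # step 2: propose a nearby sample
        if random.random() <= score(x_new) / score(x):
            x = x_new                          # steps 3-4: accept, otherwise keep x
        samples.append(x)                      # step 5: repeat for the next sample
    return samples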
Sample Proposal
Of the 5 steps above, steps 2 and 3, sample proposal and sample selection, are the least straightforward and will be explained in this section and the next. In the sample proposal step, we generate a new sample which might be returned as the next output from the algorithm. In the sample selection step, we probabilistically return either the newly proposed sample or the sample returned in the previous iteration of the algorithm.
Sample proposal can be straightforward, and for this tutorial we'll use a very simple method. Given a previous sample value of $x_t$, the proposed sample $x'$ will have a value of
$$x' = x_t + u - 0.5$$
where $u$ is a continuous uniform random variable sampled between $0$ and $1$. For the first iteration of Metropolis-Hastings, $x_t$ will be an arbitrary value such as $x_0 = 0$, but should ideally have a high probability under the target distribution.
The equivalent Python code for sample proposal is simple:
def propose_sample(current_sample: float) -> float:
    return current_sample + uniform() - 0.5
proposed_sample = propose_sample(x0)
uniform() returns a continuous random variable with values between $0$ and $1$ ($[0, 1)$), and the code behind it can be read about here.
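If you don't have that generator handy, a drop-in stand-in built on Python's standard library (my substitution, not the implementation from that post) is enough to follow along:

import random

def uniform() -> float:
    # Stand-in for the pseudorandom generator from the previous post:
    # returns a continuous uniform value in [0, 1).
    return random.random()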
Sample Selection
In step 3, sample selection, we probabilistically choose either $x'$ or $x_t$ to be the next sample value $x_{t+1}$.
In order to perform sample selection, we need a function $f(x)$ which is proportional to our distribution of interest $P(x)$, i.e. such that:
$$f(x) = c \cdot P(x)$$
for some constant $c > 0$.
Since the target distribution from which we'd like to sample is the Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$, we'll choose $f(x) = e^{-\frac{(x - \mu)^2}{2\sigma^2}}$.
To determine whether $x_{t+1}$ takes the value of $x'$ or $x_t$, we sample a value $u$ between 0 and 1 from a continuous uniform distribution and let $x_{t+1} = x'$ if $u \leq \frac{f(x')}{f(x_t)}$, else $x_{t+1} = x_t$.
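This works because the unknown normalization constant cancels in the ratio: with the choice of $f$ above, $f(x) = \sigma\sqrt{2\pi}\,P(x)$, so
$$\frac{f(x')}{f(x_t)} = \frac{\sigma\sqrt{2\pi}\,P(x')}{\sigma\sqrt{2\pi}\,P(x_t)} = \frac{P(x')}{P(x_t)},$$
and the acceptance test only ever needs $f$, never the normalized $P$.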
This logic can be seen in the following Python code:
import math

def score(x: float, mu: float = 0, sigma: float = 1) -> float:
    norm_x = (x - mu) / sigma
    return math.exp(-(norm_x ** 2) / 2)
from typing import Tuple

def get_next_sample(current_sample: float, current_sample_score: float) -> Tuple[float, float]:
    # Calculate proposed value for x_{t+1}
    proposed_sample = propose_sample(current_sample)
    proposed_sample_score = score(proposed_sample)
    # NOTE: This code was written with simplicity in mind, but there is no reason to
    # sample from the uniform distribution if proposed_sample_score > current_sample_score
    if uniform() <= (proposed_sample_score / current_sample_score):
        current_sample = proposed_sample
        current_sample_score = proposed_sample_score
    return current_sample, current_sample_score
# x0 is the arbitrary starting point
x0 = 0
x0_score = score(x0)
x1, x1_score = get_next_sample(x0, x0_score)
x2, x2_score = get_next_sample(x1, x1_score)
Visualization
Lastly, we'll implement a stateful class to generate samples:
class MetropolisHastings:
    def __init__(self, x0: float = 0) -> None:
        self.sample = x0
        self.sample_score = score(self.sample)

    def __call__(self) -> float:
        self.sample, self.sample_score = get_next_sample(self.sample, self.sample_score)
        return self.sample
Now we can generate a set of samples and make sure the distribution looks as expected.
from typing import Callable, Iterator
from tqdm import tqdm

def gen_samples(f: Callable[[], float]) -> Iterator[float]:
    for _ in tqdm(range(1000000)):
        yield f()

metropolis = MetropolisHastings()
gaussian_samples = list(gen_samples(metropolis))
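To eyeball whether the samples actually follow a standard normal, one option is to histogram them against the true density, e.g. with NumPy and matplotlib (my choice of plotting setup, not part of the original code):

import numpy as np
import matplotlib.pyplot as plt

samples = np.asarray(gaussian_samples)
print(samples.mean(), samples.std())  # should be close to 0 and 1

xs = np.linspace(-4, 4, 200)
plt.hist(samples, bins=100, density=True, alpha=0.5, label="Metropolis-Hastings samples")
plt.plot(xs, np.exp(-xs ** 2 / 2) / np.sqrt(2 * np.pi), label="N(0, 1) density")
plt.legend()
plt.show()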
Sure enough, our Gaussian samples look pretty good.
That's all for this post! If you have any feedback, please feel free to reach out.