Over the past few days I have been trying to understand how Markov Chain Monte Carlo (MCMC) works. In particular, I have been trying to understand and implement the Metropolis-Hastings algorithm. So far I think I have an overall understanding of the algorithm, but there are a couple of things that are not clear to me yet. I want to use MCMC to fit some models to data, so I will describe my understanding of the Metropolis-Hastings algorithm for fitting a straight line f(x)=ax to some observed data D:

1) Make an initial guess for a. Set this a as our current a (a0). Also append it to the end of the Markov chain (C).

2) Repeat the steps below a number of times.

3) Evaluate the current likelihood (L0) given a0 and D.

4) Propose a new a (a1) by sampling from a normal distribution with μ=a0 and σ=stepsize. For now, stepsize is constant.

5) Evaluate new likelihood (L1) given a1 and D.

6) If L1 is bigger than L0, accept a1 as the new a0, append it to the end of C, and go to step 2.

7) If L1 is smaller than L0, generate a number (U) in the range [0,1] from a uniform distribution.

8) If U is smaller than the difference between the two likelihoods (L1 – L0), accept a1 as the new a0, append it to the end of C, and go to step 2.

9) If U is bigger than the difference between the two likelihoods (L1 – L0), append a0 to the end of C again, keep using the same a0, and go to step 2.

10) End of Repeat.

11) Remove some elements from the start of C (burn-in phase).

12) Now take the average of the values in C. This average is the estimated a.

Now I have some questions regarding the above steps:

- How do I construct the likelihood function for f(x)=ax but also for any arbitrary function?
- Is this a correct implementation of the Metropolis-Hastings algorithm?
- How can the choice of random-number generator at Step 7 change the results?
- How will this algorithm change if I have multiple model parameters? For example, if I had the model f(x)=ax+b.
Notes/Credits: The main structure of the algorithm described above is based on the code from an MPIA Python Workshop.

**Answer**

Your description of the algorithm seems to contain some misconceptions about what the Metropolis-Hastings (MH) algorithm is.

First of all, one has to understand that MH is a sampling algorithm. As stated on Wikipedia:

> In statistics and in statistical physics, the Metropolis–Hastings algorithm is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution for which direct sampling is difficult.

In order to implement the MH algorithm you need a *proposal density* or *jumping distribution* Q(⋅|⋅), from which it is easy to sample. If you want to sample from a distribution f(⋅), the MH algorithm can be implemented as follows:

- Pick an initial random state x0.
- Generate a candidate x⋆ from Q(⋅|x0).
- Calculate the acceptance ratio α=f(x⋆)/f(x0).
- Accept x⋆ as a realisation of f with probability min(1, α).
- If x⋆ is accepted, take it as the new state; otherwise keep x0 (and record it again). Continue sampling until you get the desired sample size.
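To make this generic recipe concrete, here is a minimal sketch in Python with a Gaussian proposal; the target density, step size, and sample size are illustrative choices, not part of the original recipe:

```python
import numpy as np

def metropolis_hastings(f, x0, n_samples, step=1.0, rng=None):
    """Sample from an unnormalised density f using a Gaussian proposal Q(.|x)."""
    rng = np.random.default_rng() if rng is None else rng
    samples = np.empty(n_samples)
    x = x0
    for i in range(n_samples):
        x_star = rng.normal(x, step)      # candidate from Q(.|x)
        alpha = f(x_star) / f(x)          # acceptance ratio
        if rng.uniform() < alpha:         # accept with probability min(1, alpha)
            x = x_star
        samples[i] = x                    # on rejection the chain repeats x
    return samples

# Example: sample from a standard normal via its unnormalised density
target = lambda x: np.exp(-0.5 * x**2)
chain = metropolis_hastings(target, x0=0.0, n_samples=5000,
                            step=1.0, rng=np.random.default_rng(0))
```

Note that `f` only needs to be known up to a normalising constant, since the constant cancels in the ratio α.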

Once you have the sample you still need to *burn* it and *thin* it: since the sampler is only asymptotically correct, you need to discard the first N samples (burn-in), and since consecutive samples are dependent, you keep only every k-th iteration (thinning).
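In code, both operations amount to a single slice of the stored chain; the values N=1000 and k=10 below are arbitrary choices for illustration:

```python
import numpy as np

chain = np.arange(10_000, dtype=float)  # stand-in for a raw MH chain
burn_in = 1000                          # discard the first N samples
thin = 10                               # then keep every k-th sample
usable = chain[burn_in::thin]           # 900 approximately independent draws
```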

An example in R can be found in the following link:

`http://www.mas.ncl.ac.uk/~ndjw1/teaching/sim/metrop/metrop.html`

This method is largely employed in Bayesian statistics for sampling from the posterior distribution of the model parameters.

The example that you are using seems unclear to me, given that f(x)=ax is not a density unless you restrict x to a bounded set. My impression is that you are interested in fitting a straight line to a set of points, for which I would recommend checking the use of the Metropolis-Hastings algorithm in the context of linear regression. The following link presents some ideas on how MH can be used in this context (Example 6.8):
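To sketch the line-fitting case: a common assumption (not stated in the question) is that the data satisfy y_i = a·x_i + ε_i with ε_i ~ N(0, σ²), so the likelihood is a product of Gaussians. It is then numerically safer to compare log-likelihoods, accepting when log U < log L1 − log L0, which is equivalent to the ratio test above. A minimal sketch with synthetic data (all parameter values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data from y = a*x + Gaussian noise (true a = 2.0, sigma = 0.5)
true_a, sigma = 2.0, 0.5
x = np.linspace(0, 10, 50)
y = true_a * x + rng.normal(0, sigma, size=x.size)

def log_likelihood(a):
    """Gaussian log-likelihood of the data under f(x) = a*x.
    Constant terms are dropped, since they cancel in the acceptance ratio."""
    resid = y - a * x
    return -0.5 * np.sum(resid**2) / sigma**2

def fit_a(n_steps=20_000, step=0.05, a0=0.0):
    chain = np.empty(n_steps)
    a, logL = a0, log_likelihood(a0)
    for i in range(n_steps):
        a_new = rng.normal(a, step)
        logL_new = log_likelihood(a_new)
        # Accept with probability min(1, L1/L0) = min(1, exp(logL_new - logL))
        if np.log(rng.uniform()) < logL_new - logL:
            a, logL = a_new, logL_new
        chain[i] = a
    return chain

chain = fit_a()
a_hat = chain[2000:].mean()   # discard burn-in, then average the chain
```

With a flat prior on a, this samples the posterior of the slope; extending to f(x)=ax+b just means proposing a two-dimensional candidate (a, b) and evaluating the same kind of log-likelihood.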

There are also lots of questions on this site, with pointers to interesting references, discussing the meaning of the likelihood function.

Another pointer of possible interest is the R package `mcmc`, which implements the MH algorithm with Gaussian proposals in the function `metrop()`.

**Attribution**
*Source: Link, Question Author: AstrOne, Answer Author: Scortchi – Reinstate Monica*