Where do the full conditionals come from in Gibbs sampling?

MCMC algorithms like Metropolis-Hastings and Gibbs sampling are ways of sampling from a joint posterior distribution.

I think I understand and can implement Metropolis-Hastings pretty easily: you simply choose starting points somehow, and ‘walk the parameter space’ randomly, guided by the posterior density and proposal density. Gibbs sampling seems very similar but more efficient since it updates only one parameter at a time, while holding the others constant, effectively walking the space in an orthogonal fashion.

In order to do this, you need the full conditional of each parameter in analytical form*. But where do these full conditionals come from?
$$P(x_1 \mid x_2, \ldots, x_n) = \frac{P(x_1, \ldots, x_n)}{P(x_2, \ldots, x_n)}$$
To get the denominator you need to marginalize the joint over $x_1$. That seems like a whole lot of work to do analytically if there are many parameters, and might not be tractable if the joint distribution isn’t very ‘nice’. I realize that if you use conjugacy throughout the model, the full conditionals may be easy, but there’s got to be a better way for more general situations.

All the examples of Gibbs sampling I’ve seen online use toy examples (like sampling from a multivariate normal, where the conditionals are just normals themselves), and seem to dodge this issue.

* Or do you need the full conditionals in analytical form at all? How do programs like WinBUGS do it?

Answer

Yes, you are right: the conditional distributions need to be found analytically. But I think there are lots of examples where the full conditional distribution is easy to find and has a far simpler form than the joint distribution.
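
As one illustration of such a case (my own choice of example, not something taken from the question), consider data $y_i \sim N(\mu, 1/\tau)$ with independent priors $\mu \sim N(\mu_0, 1/\tau_0)$ and $\tau \sim \mathrm{Gamma}(a, b)$ (shape $a$, rate $b$). The joint posterior of $(\mu, \tau)$ is not a standard distribution, yet each full conditional is: $\mu \mid \tau, y$ is normal and $\tau \mid \mu, y$ is gamma. A minimal sketch in Python, where the function name `gibbs_normal` and the default hyperparameters are hypothetical placeholders:

```python
import numpy as np


def gibbs_normal(y, n_iter=5000, mu0=0.0, tau0=1e-2, a=1.0, b=1.0, seed=0):
    """Gibbs sampler for y_i ~ N(mu, 1/tau) with (assumed) priors
    mu ~ N(mu0, 1/tau0) and tau ~ Gamma(shape=a, rate=b)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, ybar = y.size, y.mean()

    mu, tau = ybar, 1.0  # arbitrary starting point
    draws = np.empty((n_iter, 2))

    for t in range(n_iter):
        # Full conditional of mu given tau and y: normal.
        prec = tau0 + n * tau
        mean = (tau0 * mu0 + tau * n * ybar) / prec
        mu = rng.normal(mean, 1.0 / np.sqrt(prec))

        # Full conditional of tau given mu and y: gamma.
        # NumPy parameterizes the gamma by scale = 1 / rate.
        rate = b + 0.5 * np.sum((y - mu) ** 2)
        tau = rng.gamma(a + 0.5 * n, 1.0 / rate)

        draws[t] = mu, tau

    return draws


if __name__ == "__main__":
    # Simulated data, purely for demonstration.
    y = np.random.default_rng(1).normal(3.0, 2.0, size=200)
    draws = gibbs_normal(y)
    print(draws[1000:].mean(axis=0))  # posterior means of (mu, tau) after burn-in
```

Neither update required marginalizing the joint: each conditional was read off by keeping only the factors of the joint that involve the parameter being updated and recognizing the resulting kernel.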

The intuition for this is as follows. In most “realistic” joint distributions $P(X_1, \ldots, X_n)$, most of the $X_i$'s are conditionally independent of most of the other random variables. That is to say, the variables have only local interactions: say $X_i$ depends on $X_{i-1}$ and $X_{i+1}$ but doesn’t interact with everything else. Hence the conditional distributions should simplify significantly, as $\Pr(X_i \mid X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n) = \Pr(X_i \mid X_{i-1}, X_{i+1})$.
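
To see why the denominator stops being a problem, here is a short worked derivation under one assumed structure: a first-order Markov chain, in which each variable depends directly only on its predecessor. This factorization is an illustrative assumption, and $X_{-i}$ is shorthand for all variables except $X_i$.

```latex
\begin{align*}
P(X_1, \ldots, X_n) &= P(X_1) \prod_{k=2}^{n} P(X_k \mid X_{k-1}) \\
P(X_i \mid X_{-i}) &= \frac{P(X_1, \ldots, X_n)}{\int P(X_1, \ldots, X_{i-1}, x, X_{i+1}, \ldots, X_n)\, dx} \\
&= \frac{P(X_i \mid X_{i-1})\, P(X_{i+1} \mid X_i)}{\int P(x \mid X_{i-1})\, P(X_{i+1} \mid x)\, dx} \\
&\propto P(X_i \mid X_{i-1})\, P(X_{i+1} \mid X_i), \qquad 1 < i < n.
\end{align*}
```

Every factor that does not involve $X_i$ appears in both the numerator and the denominator and cancels. What remains is a product of two local factors, and the only integral left is one-dimensional in $X_i$ rather than a marginalization of the whole joint. In practice you usually skip the integral entirely by recognizing the kernel of a known distribution.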

Attribution
Source: Link, Question Author: cespinoza, Answer Author: gabgoh
