Where do the full conditionals come from in Gibbs sampling?

MCMC algorithms like Metropolis-Hastings and Gibbs sampling are ways of sampling from a joint posterior distribution.

I think I understand and can implement Metropolis-Hastings fairly easily: you choose starting points somehow and 'walk the parameter space' randomly, guided by the posterior density and the proposal density. Gibbs sampling seems very similar but more efficient, since it updates only one parameter at a time while holding the others constant, effectively walking the space in an orthogonal fashion.
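A minimal random-walk Metropolis sketch of that description (illustrative only; the target here is a hypothetical unnormalized density, not any particular model):

```python
import numpy as np

def log_target(x):
    # Unnormalized log-density: here exp(-x^2/2), i.e. N(0, 1) up to a constant.
    return -0.5 * x * x

def metropolis(n_samples, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        # Symmetric random-walk proposal, so the proposal densities cancel.
        proposal = x + step * rng.standard_normal()
        # Accept with probability min(1, target(proposal) / target(x)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[i] = x
    return samples

samples = metropolis(20000)
print(samples.mean(), samples.var())  # should be near 0 and 1
```

Note that only the *ratio* of target densities is needed, which is why the normalizing constant of the posterior never has to be computed.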

In order to do this, you need the full conditional of each parameter in analytical form*. But where do these full conditionals come from?

To sample $x_1$, for example, you need its full conditional $p(x_1 \mid x_2, \dots, x_n) = p(x_1, \dots, x_n) / p(x_2, \dots, x_n)$, and to get the denominator you need to marginalize the joint over $x_1$. That seems like a whole lot of work to do analytically if there are many parameters, and it might not be tractable if the joint distribution isn't very 'nice'. I realize that if you use conjugacy throughout the model, the full conditionals may be easy, but there has to be a better way for more general situations.
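For example (a standard conjugate textbook case, not specific to any model above): suppose $x_1, \dots, x_n \sim N(\mu, \sigma^2)$ with $\sigma^2$ known and prior $\mu \sim N(\mu_0, \tau^2)$. Then the full conditional of $\mu$ is itself normal,

$$\mu \mid \sigma^2, x \sim N\!\left(\frac{\tau^2 \sum_i x_i + \sigma^2 \mu_0}{n\tau^2 + \sigma^2},\; \frac{\sigma^2 \tau^2}{n\tau^2 + \sigma^2}\right),$$

so no explicit marginalization is needed: recognizing the kernel of a normal density in the joint gives the normalizing constant for free.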

All the examples of Gibbs sampling I’ve seen online use toy examples (like sampling from a multivariate normal, where the conditionals are just normals themselves), and seem to dodge this issue.

* Or do you need the full conditionals in analytical form at all? How do programs like WinBUGS do it?

The intuition for this is as follows: in most "realistic" joint distributions $P(X_1, \dots, X_n)$, most of the $X_i$'s are conditionally independent of most of the other random variables. That is to say, many variables have only local interactions (say, $X_i$ depends on $X_{i-1}$ and $X_{i+1}$ but doesn't interact with everything else), so the full conditionals simplify significantly: $Pr(X_i \mid X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_n) = Pr(X_i \mid X_{i-1}, X_{i+1})$.
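To make this concrete, here is a sketch of a Gibbs sampler for a hypothetical Gaussian Markov chain, $X_1 \sim N(0,1)$ and $X_{i+1} \mid X_i \sim N(X_i, 1)$ (my own toy model, chosen so each full conditional involves only the neighbours and can be read off by completing the square):

```python
import numpy as np

def gibbs_chain(n_vars=5, n_iter=5000, seed=0):
    # Model: X_1 ~ N(0,1), X_{i+1} | X_i ~ N(X_i, 1).
    # Each full conditional depends only on the neighbouring variables.
    rng = np.random.default_rng(seed)
    x = np.zeros(n_vars)
    out = np.empty((n_iter, n_vars))
    for t in range(n_iter):
        # First variable: prior N(0,1) times N(x[1] | x[0], 1) => N(x[1]/2, 1/2).
        x[0] = rng.normal(x[1] / 2, np.sqrt(0.5))
        for i in range(1, n_vars - 1):
            # Interior: two Gaussian factors => N((x[i-1] + x[i+1]) / 2, 1/2).
            x[i] = rng.normal((x[i - 1] + x[i + 1]) / 2, np.sqrt(0.5))
        # Last variable: only the factor N(x[-1] | x[-2], 1) remains.
        x[-1] = rng.normal(x[-2], 1.0)
        out[t] = x
    return out

draws = gibbs_chain()
print(draws.mean(axis=0))  # each marginal mean should be near 0
```

Note that none of the conditionals required integrating out anything: each one is just the product of the local factors touching $X_i$, renormalized, which is exactly why the local-interaction structure matters.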