From any generic sampling algorithm, one can derive an optimization algorithm.

Indeed, to maximize an arbitrary function f: x ↦ f(x), it suffices to draw samples from a distribution g ∝ e^(f/T). For T small enough, these samples will fall near the global maximum of f (or, in practice, near local maxima).
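As a toy illustration of this idea (the objective f and all parameters below are my own hypothetical choices, not from the question): a Metropolis sampler targeting g ∝ e^(f/T) turns into a maximizer of f as T shrinks.

```python
import math
import random

def sample_metropolis(f, T, x0=0.0, steps=20000, step_size=0.5, seed=0):
    """Metropolis sampler targeting g(x) proportional to exp(f(x) / T)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for _ in range(steps):
        x_new = x + rng.gauss(0.0, step_size)
        f_new = f(x_new)
        # Accept with probability min(1, exp((f_new - fx) / T))
        if math.log(rng.random()) < (f_new - fx) / T:
            x, fx = x_new, f_new
    return x

# Hypothetical objective with its maximum at x = 3
f = lambda x: -(x - 3) ** 2

# Lowering T concentrates the sampler's draws near the maximizer of f
for T in (1.0, 0.1, 0.01):
    print(T, round(sample_metropolis(f, T), 3))
```

At T = 1 the draws wander around the mode with noticeable spread; at T = 0.01 they are, in effect, the output of a stochastic hill climber. This is the mechanism behind simulated annealing.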

By “sampling” I mean drawing a pseudo-random sample from a distribution whose log-likelihood is known up to an additive constant: for instance, MCMC sampling, Gibbs sampling, beam sampling, etc. By “optimization” I mean attempting to find parameters that maximize the value of a given function.

Is the reverse possible?

Given a heuristic to find the maximum of a function or of a combinatorial expression, can we extract an efficient sampling procedure? HMC, for instance, seems to take advantage of gradient information. Can we construct a sampling procedure that takes advantage of a BFGS-like approximation of the Hessian?

(edit: apparently yes: http://papers.nips.cc/paper/4464-quasi-newton-methods-for-markov-chain-monte-carlo.pdf)

We can use MCTS in combinatorial problems; can we translate that into a sampling procedure?

Context: a difficulty in sampling is often that most of the mass of the probability distribution lies within a very small region. There are interesting techniques for finding such regions, but they do not directly translate into unbiased sampling procedures.

Edit: I now have a lingering feeling that the answer to this question is somewhat equivalent to the equality of the complexity classes #P and NP, which makes the answer a likely “no”. That would explain why every sampling technique yields an optimization technique but not vice versa.

**Answer**

One connection has been brought up by Max Welling and friends in these two papers:

- Bayesian Learning via Stochastic Gradient Langevin Dynamics
- Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring.

The gist is that “learning”, i.e. optimisation of a model, smoothly transitions into sampling from the posterior.
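To make that transition concrete, here is a minimal SGLD sketch on a toy model (the model, data, and step sizes are my own illustrative choices, not taken from the papers): the update is exactly a stochastic-gradient ascent step on the log posterior, plus Gaussian noise whose variance matches the step size.

```python
import math
import random

random.seed(0)

# Toy model: x_i ~ N(theta, 1) with prior theta ~ N(0, 10).
true_theta = 1.5
N = 1000
data = [random.gauss(true_theta, 1.0) for _ in range(N)]

def sgld(data, n_batch=10, eps=1e-4, steps=5000, theta0=0.0):
    """Stochastic Gradient Langevin Dynamics: SGD on the log posterior
    plus injected Gaussian noise of variance eps, so late iterates are
    (approximate) posterior samples rather than a point estimate."""
    theta = theta0
    samples = []
    for _ in range(steps):
        batch = random.sample(data, n_batch)
        # Stochastic gradient of the log posterior:
        # prior term + minibatch likelihood term rescaled by N / n_batch
        grad = -theta / 10.0 + (len(data) / n_batch) * sum(x - theta for x in batch)
        theta += 0.5 * eps * grad + random.gauss(0.0, math.sqrt(eps))
        samples.append(theta)
    return samples

samples = sgld(data)
burn = samples[len(samples) // 2 :]          # discard the "optimization" phase
post_mean = sum(burn) / len(burn)
```

Early on, the gradient term dominates and the iterates behave like SGD converging toward the posterior mode; once near it, the injected noise dominates and the chain behaves like a (minibatch-approximate) Langevin sampler around the posterior.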

**Attribution**

*Source: Link, Question Author: Arthur B., Answer Author: bayerj*