Likelihood-free inference – what does it mean?

Recently I have become aware of ‘likelihood-free’ methods being bandied about in the literature. However, I am not clear on what it means for an inference or optimization method to be likelihood-free.

In machine learning, the goal is usually to maximise the likelihood of some parameters in order to fit a function, e.g. the weights of a neural network.

So what exactly is the philosophy of a likelihood-free approach and why do adversarial networks such as GANs fall under this category?


There are many examples of methods not based on likelihoods in statistics (I don’t know about machine learning). Some examples:

  1. Fisher’s pure significance tests. These are based only on a sharply defined null hypothesis (such as no difference between milk first and milk last in the Lady Tasting Tea experiment). This assumption leads to a null hypothesis distribution, and then a p-value. No likelihood is involved. This minimal inferential machinery cannot in itself give a basis for power analysis (no formally defined alternative) or confidence intervals (no formally defined parameter).

  2. Related to 1. are randomization tests (see “Difference between Randomization test and Permutation test”), which in their most basic form are pure significance tests.

  3. Bootstrapping is done without the need for a likelihood function. But there are connections to likelihood ideas, for instance empirical likelihood.

  4. Rank-based methods don’t usually use likelihood.

  5. Much of robust statistics.

  6. Confidence intervals for the median (or other quantiles) can be based on order statistics. No likelihood is involved in the calculations. See “Confidence interval for the median” and “Best estimator for the variance of the empirical median”.

  7. V. Vapnik had the idea of transductive learning, which seems to be related to ideas discussed in Taleb’s The Black Swan (see “Taleb and the Black Swan”).

  8. In the book Data Analysis and Approximate Models, Laurie Davies builds a systematic theory of statistical models as approximations. Confidence intervals are replaced by approximation intervals, and there are no parametric families of distributions: no N(μ, σ²), only N(9.37, 2.12²) and so on. And no likelihoods.
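The randomization idea in points 1 and 2 can be illustrated with a short sketch. This is a minimal Monte Carlo permutation test for a difference in means, using only the Python standard library. The function name `permutation_test`, its parameters, and the two small samples are all hypothetical and chosen only for illustration. Note that the p-value comes entirely from reshuffling group labels under the null hypothesis; no likelihood function is written down anywhere.

```python
import random


def permutation_test(x, y, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Under the null hypothesis the group labels are exchangeable, so the
    null distribution is obtained by reshuffling the pooled data.
    """
    rng = random.Random(seed)
    n_x = len(x)
    pooled = list(x) + list(y)
    n_y = len(pooled) - n_x
    observed = abs(sum(x) / n_x - sum(y) / n_y)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            hits += 1

    # Add-one correction: the observed labelling counts as one permutation,
    # so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data, made up for this illustration.
group_a = [10, 11, 12, 13, 14, 15]
group_b = [0, 1, 2, 3, 4, 5]
p_separated = permutation_test(group_a, group_b)
p_identical = permutation_test([1, 2, 3], [1, 2, 3])
print(p_separated, p_identical)
```

With identical groups the observed difference is zero, so every reshuffle is at least as extreme and the p-value is 1. With clearly separated groups, very few reshuffles reach the observed difference.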

Once you have a likelihood function, there is an immense machinery to build on. Bayesians cannot do without it, and most others do use likelihood most of the time. But it was pointed out in a comment that even Bayesians try to do without it; see approximate Bayesian computation (ABC). There is even a new text on that topic.
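To make the approximate Bayesian computation idea concrete, here is a minimal sketch of the simplest variant, rejection ABC. It assumes a toy problem that is not from the original discussion: inferring the mean of a normal distribution with known unit standard deviation, a uniform prior on [-5, 5], the sample mean as summary statistic, and a tolerance of 0.1. All names (`abc_rejection`, `simulate`, `prior_sample`) are hypothetical. The point is that the model is only ever simulated from; its density is never evaluated.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_sample, summary, eps, n_draws, seed=0):
    """Rejection ABC: keep prior draws whose simulated data resemble the observed data."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                   # draw a parameter from the prior
        fake = simulate(theta, len(observed), rng)  # simulate data given that parameter
        if abs(summary(fake) - s_obs) <= eps:       # compare summaries, not likelihoods
            accepted.append(theta)
    return accepted


# Toy setup (assumed for illustration): normal data with unknown mean, sd = 1.
def simulate(theta, n, rng):
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def prior_sample(rng):
    return rng.uniform(-5.0, 5.0)


data_rng = random.Random(42)
observed = simulate(2.0, 50, data_rng)  # synthetic "observed" data, true mean 2

posterior_draws = abc_rejection(
    observed, simulate, prior_sample, statistics.mean, eps=0.1, n_draws=10_000
)
print(len(posterior_draws), statistics.mean(posterior_draws))
```

The accepted draws approximate a sample from the posterior. Shrinking `eps` improves the approximation at the cost of a lower acceptance rate.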

But where do likelihood functions come from? To get a likelihood function in the usual way, we need a lot of assumptions, which can be difficult to justify.

It is interesting to ask if we can construct likelihood functions, in some way, from some of these likelihood-free methods. For instance, for point 6. above, can we construct a likelihood function for the median from (a family of) confidence intervals calculated from order statistics? I should ask that as a separate question …
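For reference, the order-statistic intervals from point 6 can be sketched as follows. This is a minimal illustration, assuming i.i.d. observations from a continuous distribution; the function name `median_ci` is hypothetical. It relies only on the fact that the number of observations below the true median is Binomial(n, 1/2), so the interval from the j-th smallest to the j-th largest observation has exact coverage 1 − 2·P(B ≤ j − 1).

```python
from math import comb


def median_ci(data, alpha=0.05):
    """Distribution-free confidence interval for the median.

    Returns (lower, upper, coverage), where the interval runs from the j-th
    smallest to the j-th largest observation and j is the largest rank with
    exact coverage of at least 1 - alpha.
    """
    xs = sorted(data)
    n = len(xs)

    j = 0
    cdf = 0.0  # running value of P(B <= k) for B ~ Binomial(n, 1/2)
    for k in range(n):
        cdf += comb(n, k) / 2**n
        if 2 * cdf > alpha:
            break
        j = k + 1  # rank j is admissible because 2 * P(B <= j - 1) <= alpha

    if j == 0:
        raise ValueError("sample too small for the requested confidence level")

    coverage = 1 - 2 * sum(comb(n, k) for k in range(j)) / 2**n
    return xs[j - 1], xs[n - j], coverage


# Hypothetical data: the integers 1 to 10.
lower, upper, coverage = median_ci(range(1, 11), alpha=0.05)
print(lower, upper, coverage)
```

Because the binomial distribution is discrete, the achieved coverage is usually somewhat higher than the nominal 1 − alpha. Varying alpha gives the family of intervals mentioned above.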

Your last question, about GANs, I must leave for others.

Source : Link , Question Author : piccolo , Answer Author : kjetil b halvorsen
