I am currently trying to understand the Likelihood Principle and, frankly, I don’t get it at all. So I will write all my questions as a list, even though they might be pretty basic.

• What exactly does the phrase “all of the information” mean in the context of this principle? (As in: all of the information in a sample is contained in the likelihood function.)
• Is the principle somehow connected to the provable fact that \$p(x|y)\propto p(y|x)p(x)\$? Is the “likelihood” in the principle the same thing as \$p(y|x)\$, or not?
• How can a mathematical theorem be “controversial”? My (weak) understanding of math is that a theorem is either proven or not proven. Into which category does the Likelihood Principle fall?
• How is the Likelihood Principle important for Bayesian inference, which is based on the formula \$p(x|y)\propto p(y|x)p(x)\$?

The likelihood principle has been stated in many different ways, with varying meaning and intelligibility. A.W.F. Edwards’s book Likelihood is an excellent introduction to many aspects of likelihood and is still in print. This is how Edwards defines the likelihood principle:

“Within the framework of a statistical model, all of the information which the data provide concerning the relative merits of two hypotheses is contained in the likelihood ratio of those hypotheses.” (Edwards 1972, 1992 p. 30)
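To make Edwards’s statement concrete, here is a small sketch using a standard textbook illustration that is not taken from his book: a coin yields 9 heads and 3 tails, and we compare two hypotheses about the probability of heads, θ = 0.75 and θ = 0.5. The data could have come from two different sampling plans, either “toss exactly 12 times” (binomial) or “toss until the 3rd tail appears” (negative binomial). The function names, the counts and the two θ values are all assumptions chosen for illustration. The two likelihood functions differ by a constant factor, but that factor cancels in the ratio, so both plans give the same relative support for the two hypotheses.

```python
from math import comb, isclose


def binomial_likelihood(theta, heads=9, tails=3):
    """Likelihood of theta when the number of tosses (heads + tails) was fixed in advance."""
    n = heads + tails
    return comb(n, heads) * theta**heads * (1 - theta) ** tails


def negative_binomial_likelihood(theta, heads=9, tails=3):
    """Likelihood of theta when tossing continued until `tails` tails were seen.

    The last toss must be a tail, so only the first n - 1 tosses can be arranged freely.
    """
    n = heads + tails
    return comb(n - 1, tails - 1) * theta**heads * (1 - theta) ** tails


def likelihood_ratio(likelihood, theta_a, theta_b):
    """Relative support the data give to theta_a over theta_b under one sampling plan."""
    return likelihood(theta_a) / likelihood(theta_b)


# Illustrative hypotheses (assumed values, not from the source).
theta_a, theta_b = 0.75, 0.5

ratio_binomial = likelihood_ratio(binomial_likelihood, theta_a, theta_b)
ratio_negative_binomial = likelihood_ratio(negative_binomial_likelihood, theta_a, theta_b)

# The likelihoods themselves differ, but the ratios agree.
print(binomial_likelihood(theta_b), negative_binomial_likelihood(theta_b))
print(ratio_binomial, ratio_negative_binomial)
print(isclose(ratio_binomial, ratio_negative_binomial))
```

Under the likelihood principle, an analyst who accepts the statistical model should draw the same conclusion about θ = 0.75 versus θ = 0.5 from either experiment, because the likelihood ratio is the same. Methods whose results depend on the stopping rule, such as p-values computed over the respective sample spaces, can disagree between the two plans, which is one reason the principle is debated.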