Introduction to frequentist statistics for Bayesians [closed]

I’m a simple minded Bayesian who feels comfortable in the cosy world of Bayes.

However, due to malevolent forces outside my control, I now have to do introductory graduate courses about the exotic and weird world of frequentist statistics. Some of these concepts seem very weird to me, and my teachers are not versed in Bayes, so I thought I’d get some help on the internet from those who understand both.

How would you explain the different concepts in frequentist statistics to a Bayesian who finds frequentism weird and uncomfortable?

For example, some things I already understand:

  • The maximum likelihood estimator $\arg\max_\theta p(D|\theta)$ is equal to the maximum posterior estimator $\arg\max_\theta p(\theta|D)$, if $p(\theta)$ is flat.
  • (not entirely sure about this one). If a certain estimator $\hat\theta$ is a sufficient statistic for a parameter $\theta$, and $p(\theta)$ is flat, then $p(\hat\theta|\theta) = c_1 p(D|\theta) = c_1 c_2 p(\theta|D)$, i.e. the sampling distribution is equal to the likelihood function, and therefore equal to the posterior of the parameter given a flat prior.
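
The first point can be checked numerically. The following is a minimal sketch, assuming a binomial model with 7 successes in 10 trials (made-up numbers) and a grid approximation over $\theta$; the function name `mle_and_map_on_grid` is hypothetical and only for illustration.

```python
import numpy as np

def mle_and_map_on_grid(successes, trials, grid_size=1001):
    """Return (MLE, MAP) for a binomial success probability on a grid.

    The prior is flat on [0, 1], so the posterior is just the
    normalised likelihood and both argmaxes should coincide.
    """
    theta = np.linspace(0.0, 1.0, grid_size)
    # Binomial likelihood p(D | theta), up to a constant that does not depend on theta.
    likelihood = theta**successes * (1.0 - theta) ** (trials - successes)
    # Flat prior p(theta): the same weight for every grid point.
    prior = np.ones_like(theta)
    # Posterior p(theta | D) is proportional to likelihood * prior; normalise over the grid.
    unnormalised_posterior = likelihood * prior
    posterior = unnormalised_posterior / unnormalised_posterior.sum()
    return theta[np.argmax(likelihood)], theta[np.argmax(posterior)]

# Illustrative data (an assumption, not from the question).
theta_mle, theta_map = mle_and_map_on_grid(successes=7, trials=10)
print(theta_mle, theta_map)
```

Replacing `prior` with anything non-constant (say, proportional to `theta`) would generally move the MAP away from the MLE, which is the sense in which the equality depends on flatness.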

Those are examples of explaining frequentist concepts to someone who understands Bayesian ones.

How would you similarly explain the other central concepts of frequentist statistics in terms a Bayesian can understand?

Specifically, I’m interested in the following questions:

  • What is the role of Mean Square Error? How does it relate to Bayesian loss functions?
  • How does the criterion of “unbiasedness” relate to Bayesian criteria? I know that a Bayesian will not demand that their estimators be unbiased, but at the same time, a Bayesian would probably agree that an unbiased frequentist estimator is generally more desirable than a biased frequentist one (even though they would consider both to be inferior to the Bayesian estimator). So how does a Bayesian understand unbiasedness?
  • If we have flat priors, do frequentist confidence intervals somehow coincide with Bayesian ones?
  • What in the name of Laplace is going on with specification tests like the F test? Is this some degenerate special case of a Bayesian update on the distribution over model space?

More generally:

Is there some resource that explains frequentism to Bayesians?
Most of the books run the other way around: they explain Bayesianism to people who are experienced in frequentist statistics.


P.S. I have looked, and while there are already a lot of questions about the difference between Bayesianism and frequentism, none explicitly explains frequentism from the perspective of a Bayesian.

This question is related, but is not specifically about explaining Frequentist concepts to a Bayesian (more about justifying frequentist thinking in general).

Also, my point is not to bash frequentism. I really do want to understand it better.

Answer

Actually, many of the things you mention are already discussed in the major Bayesian handbooks. Many of those handbooks are written for readers trained as frequentists, so they discuss the similarities and try to translate frequentist methods into Bayesian terms. One example is the book Doing Bayesian Data Analysis by John K. Kruschke, or his paper translating the t-test into Bayesian terms. Another psychologist, Eric-Jan Wagenmakers, has with his team discussed at length how to translate frequentist concepts into Bayesian terms. Decision-theoretic concepts such as loss functions, unbiasedness, etc. are discussed in the book The Bayesian Choice by Christian P. Robert.

Moreover, some of the concepts you mention are not really Bayesian. For example, a loss function is a general concept, and only when you combine it with a prior distribution do you get a Bayes risk.
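
To make the distinction concrete, here are the standard decision-theoretic definitions, written in the question's notation ($D$ for data, $p(\theta)$ for the prior). The symbols $L$, $R$ and $r$ are my own labels, not something taken from the question.

$$
\begin{aligned}
R(\theta, \hat\theta) &= \mathbb{E}_{D|\theta}\big[L(\theta, \hat\theta(D))\big] = \int L(\theta, \hat\theta(D))\, p(D|\theta)\, dD && \text{(frequentist risk: no prior involved)} \\
r(\hat\theta) &= \int R(\theta, \hat\theta)\, p(\theta)\, d\theta && \text{(Bayes risk: risk averaged over the prior)}
\end{aligned}
$$

With squared error loss $L(\theta, \hat\theta) = (\hat\theta - \theta)^2$, the frequentist risk is exactly the mean squared error, which splits into variance plus squared bias:

$$
R(\theta, \hat\theta) = \mathrm{Var}_{D|\theta}(\hat\theta) + \big(\mathbb{E}_{D|\theta}[\hat\theta] - \theta\big)^2 .
$$

Under the same loss, the estimator minimising the Bayes risk is the posterior mean $\mathbb{E}[\theta \mid D]$, which need not be unbiased.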

It is also worth mentioning that even if you are a self-declared Bayesian, you probably already use a lot of frequentist methods. For example, if you use MCMC for estimation and then take the mean of the MCMC chain as your point estimate, you are using a frequentist estimator, since you are not using any Bayesian model or priors to estimate the mean of the chain.
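
As a rough illustration of the MCMC point, here is a minimal sketch, assuming a toy posterior that is normal with mean 1 and unit variance, sampled with a hand-written random-walk Metropolis algorithm. The helper names `metropolis_chain` and `batch_means_standard_error`, the target, and the tuning values are all assumptions made for this example.

```python
import numpy as np

def metropolis_chain(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional target density."""
    chain = np.empty(n_steps)
    current = start
    current_logp = log_target(current)
    for i in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = current + step_size * rng.normal()
        proposal_logp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if np.log(rng.uniform()) < proposal_logp - current_logp:
            current, current_logp = proposal, proposal_logp
        chain[i] = current
    return chain

def batch_means_standard_error(chain, n_batches=20):
    """Monte Carlo standard error of the chain mean via the batch means method."""
    usable = len(chain) - len(chain) % n_batches
    batch_means = chain[:usable].reshape(n_batches, -1).mean(axis=1)
    return batch_means.std(ddof=1) / np.sqrt(n_batches)

def log_target(theta):
    # Toy log-posterior (an assumption): normal with mean 1 and variance 1, up to a constant.
    return -0.5 * (theta - 1.0) ** 2

rng = np.random.default_rng(0)
chain = metropolis_chain(log_target, start=0.0, n_steps=20000, step_size=2.0, rng=rng)

# The sample mean of the draws is a plain frequentist estimator of the posterior mean...
posterior_mean_estimate = chain.mean()
# ...and its accuracy is summarised by a sampling-theory standard error, not a posterior.
monte_carlo_se = batch_means_standard_error(chain)
print(posterior_mean_estimate, monte_carlo_se)
```

Nothing in the last two steps involves a prior over the quantity being estimated: the chain is treated as a sample, and the mean and its standard error are justified by long-run sampling arguments.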

Finally, some frequentist concepts and tools are not easily translatable to a Bayesian setting, or the proposed “equivalents” are proofs of concept rather than something you’d use in real life. In many cases the approaches are simply different, and looking for parallels is a waste of time.

Attribution
Source : Link , Question Author : user56834 , Answer Author : Tim
