# Under which conditions do Bayesian and frequentist point estimators coincide?

With a flat prior, the ML (frequentist — maximum likelihood) and the MAP (Bayesian — maximum a posteriori) estimators coincide.
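
A short derivation of this coincidence, writing $$\pi$$ for the prior density (a symbol not used elsewhere in the question) and assuming the posterior is well defined:

$$\hat x_{\text{MAP}}(y) = \text{argmax}_x \; p(x \mid y) = \text{argmax}_x \; p(y \mid x)\,\pi(x) = \text{argmax}_x \; p(y \mid x) = \hat x_{\text{ML}}(y) \qquad \text{when } \pi(x) \propto 1$$

Note that flatness depends on the parameterisation: a prior that is flat in $$x$$ is in general not flat in a nonlinear transform of $$x$$.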

More generally, however, I’m talking about point estimators derived as the optimisers of some loss function. I.e.

$$\hat x(\,. ) = \text{argmin} \; \mathbb{E} \left( L(X-\hat x(y)) \; | \; y \right) \qquad \text{(Bayesian)}$$
$$\hat x(\,. ) = \text{argmin} \; \mathbb{E} \left( L(x-\hat x(Y)) \; | \; x \right) \qquad \text{(Frequentist)}$$

where $$\mathbb{E}$$ is the expectation operator, $$L$$ is the loss function (minimised at zero), $$\hat x(y)$$ is the estimator, given the data $$y$$, of the parameter $$x$$, and random variables are denoted with uppercase letters.

Does anybody know any conditions on $$L$$, the pdf of $$x$$ and $$y$$, imposed linearity and/or unbiasedness, where the estimators will coincide?

## Edit

As noted in comments, an impartiality requirement such as unbiasedness is required to render the Frequentist problem meaningful. Flat priors may also be a commonality.

Besides the general discussions provided by some of the answers, the question is really also about providing actual examples. I think an important one comes from linear regression:

• the OLS estimator, $$\mathbf{\hat{x}} = (\mathbf{D}'\mathbf{D})^{-1}\mathbf{D}'\mathbf{y}$$, is the BLUE (Gauss-Markov theorem), i.e. it minimises the frequentist MSE among linear unbiased estimators.
• if $$(X,Y)$$ is Gaussian and the prior is flat, $$\mathbf{\hat{x}} = (\mathbf{D}'\mathbf{D})^{-1}\mathbf{D}'\mathbf{y}$$ is the “posterior” mean, which minimises the Bayesian expected loss for any symmetric convex loss function.

Here, $$\mathbf{D}$$ seems to be known as the data/design matrix in the frequentist/Bayesian lingo, respectively.
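
The regression example can be checked numerically. The sketch below is only an illustration under assumptions that are not in the question: a Gaussian model $$\mathbf{y} = \mathbf{D}\mathbf{x} + \text{noise}$$ with known noise variance `sigma2`, and a Gaussian prior $$N(0, \tau^2 I)$$ on $$\mathbf{x}$$ whose variance `tau2` is made very large to approximate the flat prior. The data, `x_true` and the helper names are all hypothetical.

```python
import numpy as np

def ols(D, y):
    """Least-squares solution of y ~ D x, i.e. (D'D)^{-1} D'y for full-column-rank D."""
    return np.linalg.lstsq(D, y, rcond=None)[0]

def posterior_mean(D, y, sigma2, tau2):
    """Posterior mean of x when noise ~ N(0, sigma2 I) and prior x ~ N(0, tau2 I)."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + (sigma2 / tau2) * np.eye(k), D.T @ y)

# Simulated data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
D = rng.normal(size=(50, 3))
x_true = np.array([1.0, -2.0, 0.5])
y = D @ x_true + rng.normal(scale=0.3, size=50)

x_ols = ols(D, y)

# Largest coordinate-wise gap between posterior mean and OLS, per prior variance.
gaps = {
    tau2: float(np.max(np.abs(posterior_mean(D, y, 0.09, tau2) - x_ols)))
    for tau2 in (1.0, 1e3, 1e9)
}
for tau2, gap in gaps.items():
    print(f"tau2={tau2:g}  max |posterior mean - OLS| = {gap:.2e}")
```

The gap shrinks as `tau2` grows, which is the sense in which the flat-prior posterior mean and the OLS estimator coincide in this model.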

The question is interesting but somewhat hopeless unless the notion of frequentist estimator is made precise. It is definitely not the one set in the question,

$$\hat x(\,. ) = \text{argmin} \; \mathbb{E} \left( L(x-\hat x(Y)) \; | \; x \right)$$

since the answer to the minimisation is $\hat{x}(y)=x$ for all $y$, as pointed out in Programmer2134’s answer. The fundamental issue is that there is no single frequentist estimator for an estimation problem, without introducing supplementary constraints or classes of estimators. Without those, all Bayes estimators are also frequentist estimators.
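
The degenerate solution can be checked in one line, using only the question's assumption that $L$ is minimised at zero. For any estimator $\hat x$ and any fixed $x$,

$$\mathbb{E}\left( L(x-\hat x(Y)) \mid x \right) \;\ge\; L(0) = \mathbb{E}\left( L(x - x) \mid x \right),$$

so the constant rule $\hat x(y) = x$ attains the lower bound. It is not a usable estimator, since it depends on the unknown $x$, which is why some restriction on the class of estimators is needed.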

As pointed out in the comments, unbiasedness may be such a constraint, in which case Bayes estimators are excluded. But this frequentist notion clashes with other frequentist notions such as

1. admissibility, since the James-Stein phenomenon demonstrated that unbiased estimators may be inadmissible (depending on the loss function and on the dimension of the problem);
2. invariance under reparameterisation, since unbiasedness is not preserved under nonlinear transforms.
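
The first point can be illustrated by simulation. This is a minimal sketch under assumptions not stated above: a single observation $X \sim N(\theta, I_p)$, squared-error loss, dimension `p = 10`, and the classical James-Stein estimator shrinking towards the origin. The value of `theta` and the number of replications `n_rep` are arbitrary choices.

```python
import numpy as np

def james_stein(x):
    """James-Stein estimator for x ~ N(theta, I_p), shrinking towards 0 (needs p >= 3)."""
    p = x.shape[-1]
    shrink = 1.0 - (p - 2) / np.sum(x**2, axis=-1, keepdims=True)
    return shrink * x

rng = np.random.default_rng(1)
p, n_rep = 10, 20000
theta = np.full(p, 0.5)                  # hypothetical true mean vector
x = theta + rng.normal(size=(n_rep, p))  # one observation per replication

# Monte Carlo estimates of the squared-error risk at this theta.
risk_unbiased = float(np.mean(np.sum((x - theta) ** 2, axis=1)))
risk_js = float(np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1)))

print(f"risk of the unbiased estimator X: {risk_unbiased:.2f}")
print(f"risk of the James-Stein estimator: {risk_js:.2f}")
```

The unbiased estimator $X$ has risk $p$ whatever $\theta$ is, while the biased James-Stein estimator has a smaller estimated risk here. A simulation at one value of $\theta$ does not prove domination over all $\theta$, which is what the theoretical result provides.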

Plus unbiasedness only applies to a restricted class of estimation problems. By this, I mean that the class of unbiased estimators of a certain parameter $\theta$ or of a transform $h(\theta)$ is most of the time empty.
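
For instance (a standard textbook illustration, not taken from the discussion above), if $Y \sim \text{Bin}(n, p)$ then every estimator $\delta$ satisfies

$$\mathbb{E}_p\left(\delta(Y)\right) = \sum_{k=0}^{n} \delta(k) \binom{n}{k} p^k (1-p)^{n-k},$$

which is a polynomial in $p$ of degree at most $n$. A transform such as $h(p) = 1/p$ is not a polynomial, so it admits no unbiased estimator.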

Speaking of admissibility, another frequentist notion, there exist settings for which the only admissible estimators are Bayes estimators and conversely. This type of setting relates to the complete class theorems established by Abraham Wald in the 1950s. (The same applies to the best invariant estimators, which are Bayes under the appropriate right Haar measure.)