Objective vs. subjective Bayesian paradigms

What is the difference between objective and subjective Bayesian paradigms?
What objects or procedures do they define or interpret differently?
Is there any difference in their choice of methods?

Answer

This is a very confusing topic, mostly owing to the fact that there are two different ways in which the concept of “subjectivism” is commonly used in these discussions.^\dagger It is made even more confusing by the fact that there is a class of “subjectivism” that is rooted in prior elicitation from experts, and this particular variation has to be fit into the philosophical categorisation of paradigms carefully. I will try to bring some clarity to this issue by setting out some different ways in which “subjectivism” is often interpreted, and then setting out broad areas of agreement among Bayesians, and areas where there is a divergence in philosophical and practical approaches. I expect there will be others who will disagree with my own views on this, but I hope this gives a good starting point for clear discussion.


Weak subjectivism: In this interpretation, the term “subjective” is used in its weaker sense, meaning merely that probability encapsulates the rational beliefs of a subject. (Some people, such as myself, prefer to use the term “epistemic” for this concept, since it does not actually require subjectivity in the stronger sense.)

Strong subjectivism: In this interpretation, the term “subjective” is used in its stronger sense, meaning that weak subjectivism holds, and furthermore, the subject’s belief lacks any outside “objective” justification (i.e., two or more different subjects could all hold different beliefs, and none would be considered more or less wrong than the others).


In Bayesian analysis it is generally the case that the chosen sampling distribution has an objective justification rooted in some understanding of the sampling mechanism. However, there is rarely any available information pertaining to the parameter, other than in the sample data. This gives rise to three broad paradigms in Bayesian statistics, which correspond to different ways of determining the prior distribution.

Subjective Bayesian paradigm: This paradigm agrees with weak subjectivism, and further holds that any set of probabilistic beliefs is equally valid. So long as subjects use Bayesian updating for new data, it is legitimate to use any prior. Under this paradigm, the prior does not require any objective justification. In this paradigm there is a focus on disclosing the prior used, and then showing how this updates with new data. It is common in this method to include sensitivity analysis showing posterior beliefs under a range of prior beliefs.
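The sensitivity analysis mentioned above can be sketched with a conjugate Beta-Binomial model. This is only an illustration: the data (7 successes in 10 trials) and the three priors are hypothetical, chosen to show how the same data gives different posteriors under different disclosed priors.

```python
# Beta-Binomial sensitivity analysis: same data, several disclosed priors.
# All numbers below are hypothetical and chosen only for illustration.

def beta_posterior(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior combined with binomial data."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

successes, failures = 7, 3  # hypothetical data: 7 successes in 10 trials

priors = {
    "uniform Beta(1, 1)": (1.0, 1.0),
    "sceptical Beta(2, 8)": (2.0, 8.0),
    "optimistic Beta(8, 2)": (8.0, 2.0),
}

posterior_means = {}
for label, (a, b) in priors.items():
    a_post, b_post = beta_posterior(a, b, successes, failures)
    posterior_means[label] = beta_mean(a_post, b_post)
    print(f"{label}: posterior Beta({a_post:g}, {b_post:g}), "
          f"mean {posterior_means[label]:.3f}")
```

With only ten observations the posterior means differ noticeably across priors, which is the situation in which disclosing the prior matters most.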

Objective Bayesian paradigm: This paradigm also agrees with weak subjectivism, but prefers to additionally constrain prior beliefs (before inclusion of any data) so that they are objectively “non-informative” about the parameter. In this paradigm the prior is supposed to accurately reflect the lack of available information pertaining to the parameter, outside of the data. This usually entails adopting some theory for how to set the prior (e.g., Jeffreys, Jaynes, Bernardo reference priors, etc.). This paradigm holds that a set of probabilistic beliefs is to be preferred if it is based on a prior belief that is objectively determined and uninformative about the parameters in the problem of interest. It agrees that any set of probabilistic beliefs is consistent with the rationality criteria underlying Bayesian analysis, but views beliefs based on “bad” priors (too informative about the unknown parameter) as being worse than those based on “good” priors. In this paradigm the prior is chosen from some uninformative class, and then updated with new data to yield an objective answer to the problem.

Expert-prior Bayesian paradigm: This method is often viewed as part of the subjective paradigm, and is not usually identified separately, but I consider it a separate paradigm because it has elements of each view. This paradigm agrees with weak subjectivism, but like the objective Bayesian paradigm, it does not view all priors as equally valid. This paradigm treats present “priors” as posteriors from previous life experience, and so regards the prior beliefs of subject-matter experts as being superior to the prior beliefs of non-experts. It also recognises that those beliefs are probably based on data that has not been systematically recorded, and is not based on systematic use of probability theory, so it is not possible to decompose these existing expert priors into an original non-informative prior and the data that this expert observed. (And indeed, in the absence of systematic use of probability theory, the present expert “prior” is probably not even consistent with Bayesian updating.) In this paradigm the expert’s present “subjective” opinion is treated as being a valuable encapsulation of subject-matter knowledge, which is treated as a primitive prior. In this paradigm the analyst seeks to elicit the expert prior through some tests of prior belief, and then the prior is formulated as the best fit to that expert belief (taking care to ensure that the expert belief has not been polluted by knowledge of the present data). The “subjective” belief of the expert is thus treated as an “objective” encapsulation of subject-matter knowledge from previous data.
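One simple way to turn an elicited expert opinion into a prior is moment matching. The sketch below assumes the expert has stated a mean and a standard deviation for a proportion, and fits a Beta distribution with those moments. The function name `beta_from_mean_sd` and the elicited values are hypothetical, and real elicitation protocols are usually more elaborate (e.g., fitting to several elicited quantiles).

```python
def beta_from_mean_sd(mean, sd):
    """Return (alpha, beta) of the Beta distribution with the given mean and sd.

    Method of moments: for Beta(alpha, beta),
        mean = alpha / (alpha + beta)
        var  = mean * (1 - mean) / (alpha + beta + 1)
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    var = sd ** 2
    max_var = mean * (1.0 - mean)  # the variance must be below this bound
    if not 0.0 < var < max_var:
        raise ValueError("sd is incompatible with a Beta distribution")
    concentration = max_var / var - 1.0  # equals alpha + beta
    return mean * concentration, (1.0 - mean) * concentration

# Hypothetical elicitation: the expert believes the rate is about 0.3, give or take 0.1.
alpha, beta = beta_from_mean_sd(0.3, 0.1)
print(f"Elicited prior: Beta({alpha:.2f}, {beta:.2f})")
```

The resulting prior would then be updated with the present data in the usual way.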

Differences in method: In terms of method, the objective Bayesian paradigm differs from the subjective paradigm insofar as the former constrains the allowable priors (either to a unique prior or a very small class of similar priors), whereas the latter does not constrain the allowable priors. In the objective Bayesian approach the prior is constrained by theories of representing a “non-informative” prior. The expert-prior paradigm takes a different approach and instead identifies one or more people that are experts, and elicits their prior beliefs.


Once we understand these different senses of the different paradigms in Bayesian statistics, we can set out some areas of broad agreement, and areas where there is disagreement. Actually, despite differences in method, there is more agreement on the underlying theories than is usually appreciated.

Broad agreement on weak subjectivism: There exists a large literature in Bayesian statistics showing that the “axioms” of probability can be derived from preliminary desiderata relating to rational decision-making. This includes arguments pertaining to dynamic belief consistency (see e.g., Epstein and Le Breton 1993) and arguments appealing to the Dutch book theorem (see e.g., Lehmann 1955, Hajek 2009). Bayesians of all these paradigms broadly agree that probability should be interpreted epistemically, as referring to the beliefs of a subject, constrained by the rationality constraints inherent in the axioms of probability. We agree that one should use the rules of probability to constrain one’s beliefs about uncertainty to be rational. This implies that beliefs about uncertainty require Bayesian updating in the face of new data, but it does not impose any further constraint (i.e., without more, it does not say that any prior is better than any other prior). All three of the above paradigms agree on this.
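To make the Dutch book idea concrete, here is a minimal sketch of the simplest case. It assumes a subject who treats their stated probability for an event as a fair price for a unit bet on that event. If their probabilities for A and for not-A sum to more than one, buying both bets loses money whatever happens. The function name and the prices are hypothetical.

```python
def net_payoff(price_a, price_not_a, a_occurs, stake=1.0):
    """Net result for a subject who buys a bet on A and a bet on not-A.

    Each bet costs price * stake and pays `stake` if it wins.
    """
    cost = (price_a + price_not_a) * stake
    winnings = stake * (1 if a_occurs else 0) + stake * (0 if a_occurs else 1)
    return winnings - cost

# Incoherent beliefs: P(A) = 0.6 and P(not A) = 0.6 sum to 1.2.
incoherent = [net_payoff(0.6, 0.6, outcome) for outcome in (True, False)]

# Coherent beliefs: P(A) = 0.6 and P(not A) = 0.4 sum to 1.
coherent = [net_payoff(0.6, 0.4, outcome) for outcome in (True, False)]

print("incoherent:", [f"{x:+.2f}" for x in incoherent])  # a loss in both outcomes
print("coherent:  ", [f"{x:+.2f}" for x in coherent])    # break-even in both outcomes
```

This only illustrates one axiom (probabilities of complementary events summing to one). The cited literature covers the general argument.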

There is agreement that posteriors tend to converge with more data: There are a number of consistency theorems in Bayesian statistics which show that if you have two people with the same likelihood function, but different priors, then their posterior beliefs will converge as you get more and more data.^{\dagger \dagger} This means that with a large amount of data, the prior does not matter very much. These are undeniable theorems of probability, and all three of the above paradigms agree on this. For this reason, it is generally recognised that with large amounts of data, any of the three paradigms is likely to give similar results. Consequently, the differences in the paradigms are most important when we only have a small amount of data.
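The convergence described above can be illustrated numerically. The sketch below assumes two analysts with the same binomial likelihood but opposing Beta priors, and idealises the data as containing exactly 70% successes at every sample size. These choices are hypothetical, and the sketch tracks only posterior means, not the full theorem.

```python
def posterior_mean(alpha, beta, successes, n):
    """Posterior mean under a Beta(alpha, beta) prior after n binomial trials."""
    return (alpha + successes) / (alpha + beta + n)

prior_a = (2.0, 8.0)  # hypothetical sceptical analyst
prior_b = (8.0, 2.0)  # hypothetical optimistic analyst

gaps = []
for n in (10, 100, 1000, 10000):
    successes = 7 * n // 10  # idealised data: exactly 70% successes
    mean_a = posterior_mean(*prior_a, successes, n)
    mean_b = posterior_mean(*prior_b, successes, n)
    gaps.append(abs(mean_a - mean_b))
    print(f"n={n:>5}: analyst A {mean_a:.4f}, analyst B {mean_b:.4f}, "
          f"gap {gaps[-1]:.4f}")
```

For these two priors the gap between the posterior means is 6 / (10 + n), so it shrinks towards zero as the sample grows, regardless of the observed success count.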

There is broad agreement that there are roughly “objective” rules for priors that are available if you want to use them: There exists a large body of literature in Bayesian statistics showing how you can develop “non-informative” priors which are roughly determined by the sampling problem, and roughly encapsulate the absence of much knowledge about the parameter in question. I say “roughly” because there are several competing theories here that sometimes correspond but sometimes differ slightly (e.g., Jeffreys, Jaynes, reference priors, Walley classes of imprecise priors, etc.), and there are also some tricky paradoxes that can occur. The hardest issue here is that it is difficult to make an “uninformative” prior for a continuous parameter that can be subjected to nonlinear transformations (since “uninformativity” should ideally be invariant to transformations). Again, these are theorems of probability, and all the paradigms agree with their content. Objective Bayesians tend to view this theory as being sufficiently good that it gives superior priors, whereas subjective Bayesians and expert-prior Bayesians tend to view the theory as being insufficient to establish superiority of these priors. In other words, there is broad agreement that these objective rules exist, and can be used, but there is disagreement over how good they are.
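The transformation problem can be seen in a short worked example. The choice of a Bernoulli parameter on (0, 1) and the transformation \phi = \theta^2 are illustrative assumptions, not taken from any particular source. A flat prior on \theta is not flat on \phi, whereas the Jeffreys rule transforms consistently.

```latex
% A flat prior on \theta \in (0,1) is not flat after reparametrising to \phi = \theta^2.
\[
\pi_\theta(\theta) = 1
\quad\Longrightarrow\quad
\pi_\phi(\phi) = \pi_\theta(\sqrt{\phi}) \left| \frac{d\theta}{d\phi} \right|
= \frac{1}{2\sqrt{\phi}}, \qquad \phi \in (0,1).
\]

% The Jeffreys rule is invariant, because Fisher information transforms
% with the squared Jacobian.
\[
\pi_J(\theta) \propto \sqrt{I(\theta)}, \qquad
I(\phi) = I(\theta) \left( \frac{d\theta}{d\phi} \right)^{2}
\quad\Longrightarrow\quad
\sqrt{I(\phi)} = \sqrt{I(\theta)} \left| \frac{d\theta}{d\phi} \right|.
\]

% For a Bernoulli likelihood this gives a Beta(1/2, 1/2) prior.
\[
I(\theta) = \frac{1}{\theta(1-\theta)}
\quad\Longrightarrow\quad
\pi_J(\theta) \propto \theta^{-1/2} (1-\theta)^{-1/2}.
\]
```

So the "flat" prior encodes different beliefs depending on the parametrisation, while applying the Jeffreys rule in either parametrisation yields the same beliefs.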

There is disagreement over the importance of having a single answer: Objective Bayesians are motivated by the preference that a statistical problem with fixed data and a fixed likelihood function should lead to a uniquely determined posterior belief (or at least a small number of allowable posterior beliefs that vary very little). This preference is generally part of a broader preference for having scientific procedures that yield a unique answer when applied to fixed sets of objective conditions. Contrarily, both subjective Bayesians and expert-prior Bayesians believe that this is not especially important, and they generally believe that this focus on a uniquely determined posterior is actually misleading.

There is broad agreement that the public are not well-acquainted with Bayesian posteriors: All paradigms agree that the general public are not well-acquainted with the basic mechanics of how Bayesian analysis transitions from a prior to a posterior. Objective Bayesians sometimes worry that giving more than one allowable answer for the posterior will be confusing to people. Subjective Bayesians worry that failing to give more than one allowable answer for the posterior is misleading to people.


^\dagger It is worth noting that the confusion over “subjectivism” here stems from a particular instance of the general false dichotomy in epistemology between “subjectivism” and “intrinsicism” (see e.g., Peikoff). In attempts to interpret probability many users have made the error of believing that any rejection of aleatory theories of probability necessarily leads to interpretations that are “subjective” in the stronger sense specified here. To understand probability interpretations correctly, it is a good idea to understand the general problems with the subjectivism-intrinsicism dichotomy, and therefore recognise that objective epistemic interpretations exist.

^{\dagger \dagger} There are some broad regularity conditions required for this result (e.g., both subjects have a prior with support including the true parameter value) but it applies very broadly.

Attribution
Source : Link , Question Author : Richard Hardy , Answer Author : Ben
