On George Box, Galit Shmueli and the scientific method?

(This question might seem like it is better suited for the Philosophy SE. I am hoping that statisticians can clarify my misconceptions about Box’s and Shmueli’s statements, hence I am posting it here).

George Box (of ARIMA fame) said:

“All models are wrong, but some are useful.”

Galit Shmueli, in her famous paper “To Explain or to Predict?”, argues (and cites others who agree with her) that:

Explaining and predicting are not the same, and some models do a good job of explaining even though they do a poor job of predicting.

I feel that these two principles are somehow contradictory.

If a model doesn’t predict well, is it useful?

More importantly, if a model explains well (but doesn’t necessarily predict well), then it has to be true (i.e. not wrong) in some way or another. So how does that mesh with Box’s “all models are wrong”?

Finally, if a model explains well but doesn’t predict well, how is it even scientific? Most scientific demarcation criteria (verificationism, falsificationism, etc.) imply that a scientific statement has to have predictive power, or colloquially: a theory or model is correct only if it can be empirically tested (or falsified), which means that it has to predict future outcomes.

My questions:

  • Are Box’s statement and Shmueli’s ideas indeed contradictory, or am I missing something, e.g. can a model lack predictive power yet still be useful?
  • If the statements of Box and Shmueli are not contradictory, then what does it mean for a model to be wrong and not predict well, yet still have explanatory power? Put differently: if one takes away both correctness and predictive ability, what is left of a model?

What empirical validations are possible when a model has explanatory power but not predictive power? Shmueli mentions things like using the AIC for explanation and the BIC for prediction, but I don’t see how that solves the problem. With predictive models, you can use the AIC, the BIC, R^2, L1 regularization, and so on, but ultimately out-of-sample testing and performance in production are what determine the quality of the model. For models that explain well, however, I don’t see how any loss function can ever truly evaluate the model. In philosophy of science, there is the concept of underdetermination, which seems pertinent here: for any given data set, one can always judiciously choose some distribution (or mixture of distributions) and loss function L in such a way that they fit the data (and therefore can be claimed to explain it). Moreover, the threshold that L should fall under before someone can claim that the model adequately explains the data is arbitrary (much like p-values: why p < 0.05 and not p < 0.1 or p < 0.01?).
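To make the out-of-sample point concrete, here is a minimal illustrative sketch (my own hypothetical example, not from Shmueli’s paper; the data and polynomial degrees are invented): a more flexible model can always be made to fit a given data set at least as well in-sample, but fresh data exposes the overfit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a simple linear truth plus noise.
n = 30
x = rng.uniform(-1, 1, n)
y = 2.0 * x + rng.normal(0, 0.5, n)
x_new = rng.uniform(-1, 1, n)                      # fresh draw for out-of-sample testing
y_new = 2.0 * x_new + rng.normal(0, 0.5, n)

def fit_poly(deg):
    """Least-squares polynomial fit; returns in- and out-of-sample MSE."""
    coefs = np.polyfit(x, y, deg)
    mse_in = np.mean((np.polyval(coefs, x) - y) ** 2)
    mse_out = np.mean((np.polyval(coefs, x_new) - y_new) ** 2)
    return mse_in, mse_out

for deg in (1, 10):
    mse_in, mse_out = fit_poly(deg)
    print(f"degree {deg:2d}: in-sample MSE {mse_in:.3f}, out-of-sample MSE {mse_out:.3f}")
```

The degree-10 model fits the training data better by construction (it nests the degree-1 model), which is exactly why in-sample criteria alone cannot settle the question and held-out data are needed.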

  • Based on the above, how can one objectively validate a model that explains well but doesn't predict well, given that out-of-sample testing is not possible?


Let me start with the pithy quote by George Box, that "all models are wrong, but some are useful". This statement is an encapsulation of the methodological approach of "positivism", which is a philosophical approach that is highly influential in the sciences. This approach is described in detail (in the context of economic theory) in the classic methodological essay of Friedman (1966). In that essay, Friedman argues that any useful scientific theory necessarily constitutes a simplification of reality, and thus its assumptions must always depart from reality to some degree, and may even depart substantially from reality. He argues that the value of a scientific theory should not be judged by the closeness of its assumptions to reality --- instead it should be judged by its simplicity in reducing the complexity of the world to a manageable set of principles, and its accuracy in making predictions about reality, and generating new testable hypotheses about reality. Thus, Friedman argues that "all models are wrong" insofar as they all contain assumptions that simplify (and therefore depart from) reality, but that "some are useful" insofar as they give a simple framework to make useful predictions about reality.

Now, if you read Box (1976) (the paper where he first states that "all models are wrong"), you will see that he does not cite Friedman, nor does he mention methodological positivism. Nevertheless, his explanation of the scientific method and its characteristics is extremely close to that developed by Friedman. In particular, both authors stress that a scientific theory will make predictions about reality that can be tested against observed facts, and the error in the prediction can then be used as a basis for revision of the theory.

Now, on to the dichotomy discussed by Galit Shmueli in Shmueli (2010). In this paper, Shmueli compares causal explanation and prediction of observed outcomes and argues that these are distinct activities. Specifically, she argues that causal relations are based on underlying constructs that do not manifest directly in measurable outcomes, and so "measurable data are not accurate representations of their underlying constructs" (p. 293). She therefore argues that there is an aspect of statistical analysis that involves making inferences about unobservable underlying causal relations that are not manifested in measurable counterfactual differences in outcomes.
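Shmueli's point that measurable data imperfectly represent their underlying constructs can be illustrated with a classical errors-in-variables sketch (my own hypothetical example, not from the paper): a latent construct drives the outcome, but regressing on its noisy measurement attenuates the causal coefficient, even though the attenuated coefficient is the right one for *predicting* from the measured variable.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: a latent construct `xi` causally drives y with slope 2.0,
# but we can only measure a noisy proxy x = xi + measurement error.
n = 100_000
xi = rng.normal(0, 1, n)               # the unobservable construct
y = 2.0 * xi + rng.normal(0, 1, n)     # the true causal relation
x = xi + rng.normal(0, 1, n)           # what we actually measure

# OLS slope of y on the measured x is attenuated toward zero by the factor
# var(xi) / (var(xi) + var(measurement error)) = 0.5, so it lands near 1.0.
slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(f"estimated slope: {slope:.2f}  (true causal slope: 2.0)")
```

The estimated slope of about 1.0 is "wrong" as a statement about the causal effect of the construct, yet it is the best linear coefficient for predicting y from the measurement x, which is one way the explanatory and predictive goals come apart.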

Unless I am misunderstanding something, I think it is fair to say that this idea is in tension with the positivist views of Box and Friedman, as represented in the quote by Box. The positivist viewpoint essentially says that there are no admissible metaphysical "constructs" beyond those that manifest in measurable outcomes. Positivism confines itself to consideration of observable data, and concepts built on this data; it excludes consideration of a priori metaphysical concepts. Thus, a positivist would argue that the concept of causality can only be valid to the extent that it is defined in terms of measurable outcomes in reality --- to the extent it is defined as something distinct from this (as Shmueli treats it), this would be regarded as metaphysical speculation, and would be treated as inadmissible in scientific discourse.

So I think you're right --- these two approaches are essentially in conflict. The positivist approach used by Box insists that valid scientific concepts be grounded entirely in their manifestations in reality, whereas the alternative approach used by Shmueli says that there are some "constructs" that are important scientific concepts (that we want to explain) but which cannot be perfectly represented when they are "operationalised" by relating them to measurable outcomes in reality.

Source: Link, Question Author: Skander H., Answer Author: kjetil b halvorsen