We were asked by a reviewer to provide p-values to better understand the model estimates in our Bayesian multilevel model. The model is a typical model of multiple observations per participant in an experiment. We estimated the model with Stan, so we can easily compute additional posterior statistics. Currently, we are reporting (visually and in tables) the mean estimate and the 0.025 and 0.975 quantiles.
My response so far would include:
- P-values are inconsistent with Bayesian models, i.e. $P(X|\theta) \neq P(\theta|X).$
- Based on the posterior, we can calculate the probability of parameters being larger (smaller) than 0. This looks a bit like a traditional p-value.
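As a minimal sketch of the second point, assuming the posterior draws for one coefficient are available as a one-dimensional array: the draws below are simulated stand-ins rather than real Stan output, and `posterior_summary` and `beta_draws` are hypothetical names. The tail probabilities sit alongside the mean and the 0.025 and 0.975 quantiles we already report.

```python
import numpy as np


def posterior_summary(draws):
    """Summarise posterior draws of a single parameter.

    Returns the posterior mean, the 0.025 and 0.975 quantiles, and the
    posterior probabilities that the parameter is above or below zero.
    """
    draws = np.asarray(draws, dtype=float)
    lower, upper = np.quantile(draws, [0.025, 0.975])
    return {
        "mean": float(draws.mean()),
        "q2.5": float(lower),
        "q97.5": float(upper),
        # Share of draws on each side of zero, i.e. P(theta > 0 | X)
        # and P(theta < 0 | X) estimated from the sample.
        "p_gt_0": float(np.mean(draws > 0)),
        "p_lt_0": float(np.mean(draws < 0)),
    }


# Assumption: simulated draws stand in for the draws exported from Stan.
rng = np.random.default_rng(1)
beta_draws = rng.normal(loc=-0.4, scale=0.2, size=4000)

summary = posterior_summary(beta_draws)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Unlike a p-value, `p_lt_0` is a direct statement about the parameter given the data and the model, which is the distinction made in the first point.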
My question is whether this response can satisfy a reviewer, or whether it will only cause more confusion.
Update 10-Oct: We rewrote the paper with the advice in the answer in mind. The paper has been accepted, so I will reiterate my earlier comment that this was really helpful advice!
First, a quick clarification: Although the likelihood is indeed not the posterior, p-values are not so much inconsistent with Bayesian inference as usually just a different thing, for all the reasons that confidence intervals may or may not line up with credible intervals. (Although not necessarily an entirely different thing, as shown by posterior predictive checking, which really does involve p-values.)
However, I’m guessing that this level of sophistication is not what the reviewer has in mind. I’d guess they just ‘know’ that statistical models are meant to have p-values, so they’ve asked for them. So the question remains: how to respond?
When a ‘reviewer wants an X’, I have found it useful to ask myself two related questions:
- Motivation: What do they want X to do for them?
- Rational reconstruction: What would be the most similar-sounding sensible thing they could have asked for instead of X if they wanted to do that?
Then give them that instead.
The advantage of an ignorant reviewer (who may nevertheless be smart and right about the paper) is that they seldom have a clear idea of what they mean when they ask for X. This means that if you reconstruct them asking a better question, they’ll be content to see you answer it instead.
In your case, it’s quite possible that the reviewer wants a parallel frequentist analysis, though I doubt it. What I think you want to work with is the reviewer’s hint that they want p-values ‘to better understand the model’. Your job, I think, is to parse this in a way that makes the reviewer sound wise. Presumably there were a few following sentences noting what was unclear from the paper. Perhaps there were some effects of interest to the reviewer that could not be reconstructed from your parameter marginals, or some quantities that would illuminate what the model would say about cases of interest to them, or a lack of single-number summaries…
If you can identify these concerns then you can wrap up your response in the following forms (original request in square brackets):
- “the reviewer [demands a p-value for an interaction term] was concerned that it was unclear from our presentation how A varied with B, so in Figure 2 we show…”
- “the reviewer wondered [whether we could reject the hypothesis that the effect of A is zero] about the direction of the effect of A. Table 3 shows that this model gives a 99% probability that this is negative”
- “the reviewer wonders [whether our model is significantly better fitting than a model containing only A] how our model compared to one containing only A. We address this question by comparing it to … by using DIC / computing a Bayes Factor / showing our inferences about A are robust to the inclusion of B” etc.
In each case there is a close translation of the original request and an answer.
Caveats: this strategy seems to work best when the reviewer is a subject matter expert with a relatively weak understanding of statistics. It does not work with the self-identified statistically sophisticated reviewer who actually does want an X because they like Xs or read about them somewhere recently. I have no suggestions for the latter.
Finally, I would strongly recommend not saying anything even faintly religious about Bayes being a different paradigm and the reviewer’s questions making no sense within it. Even if this is true, it makes everybody grumpy for no real gain.