I heard that under the null hypothesis the distribution of the p-value should be uniform. However, simulations of a binomial test in MATLAB return distributions that are far from uniform, with a mean larger than 0.5 (0.518 in this case):
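    coin = [0 1];
    success_vec = nan(20000,1);
    for i = 1:20000                 % 20000 simulated experiments
        success = 0;
        for j = 1:200               % 200 fair coin flips per experiment
            success = success + coin(randperm(2,1));
        end
        success_vec(i) = success;
    end
    p_vec = binocdf(success_vec,200,0.5);   % lower-tail p-value for each experiment
    hist(p_vec);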
Trying to change the way in which I generate random numbers didn’t help.
I would really appreciate any explanation here.
The result that $p$ values have a uniform distribution under $H_0$ holds for continuously distributed test statistics – at least for point nulls, as you have here.
As James Stanley mentions in the comments, the distribution of the test statistic here is discrete, so that result doesn’t apply. Your code may have no errors at all (though I wouldn’t display a discrete distribution with a histogram; I’d lean toward displaying the cdf or the pmf, or better, both).
While not actually uniform, each jump in the cdf of the p-value takes it up to the line $F(x)=x$ (I don’t know a name for this, but it ought to have a name, perhaps something like ‘quasi-uniform’).
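To spell out why the jumps land exactly on the line, here is a short derivation. It assumes the p-value is the lower-tail probability $P = G(X)$, as in the question's `binocdf` call, where $G$ is the null cdf of the count $X$ and $p_k = G(k)$ are the attainable p-values (the symbols $G$, $P$ and $p_k$ are my own notation):

$$\Pr(P \le p_k) = \Pr\big(G(X) \le G(k)\big) = \Pr(X \le k) = G(k) = p_k, \qquad k = 0, 1, \dots, n.$$

The middle step uses the fact that $G$ is strictly increasing over the support $0, \dots, n$. Between attainable values the cdf of $P$ is flat, so $\Pr(P \le u) \le u$ for every $u \in [0,1]$, with equality at each $p_k$.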
It’s quite possible to compute this distribution exactly, rather than simulate – but I’ve followed your lead and done a simulation (though a larger one than you have).
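For what it's worth, the exact calculation is only a few lines. This is a minimal sketch, assuming the same lower-tail p-value as the question's code (`binocdf(x,200,0.5)`) and the Statistics Toolbox functions `binopdf` and `binocdf`; the variable names are mine.

```matlab
n  = 200;                       % number of coin flips, as in the question
p0 = 0.5;                       % success probability under the null

k    = (0:n)';                  % every possible number of successes
pmf  = binopdf(k, n, p0);       % Pr(X = k) under the null
pval = binocdf(k, n, p0);       % attainable (lower-tail) p-values, increasing in k

% The p-value equals pval(i) with probability pmf(i), so its cdf evaluated
% at the attainable values is the running sum of the pmf.
cdf_at_pval = cumsum(pmf);

% Largest distance between the step cdf and the diagonal at the jump points.
max_gap = max(abs(cdf_at_pval - pval));
fprintf('max |cdf - diagonal| at the jumps: %.3g\n', max_gap);

% Step cdf of the p-value against the uniform cdf (the diagonal).
stairs(pval, cdf_at_pval); hold on
plot([0 1], [0 1], '--');  hold off
xlabel('p'); ylabel('Pr(P \leq p)');
```

The gap at the jump points should be zero up to rounding error, while the steps in between sit below the diagonal.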
Such a distribution needn’t have mean 0.5, though as the $n$ in the binomial increases the step cdf will approach the line more closely, and the mean will approach 0.5.
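To see how far the mean sits from 0.5 and how quickly it gets there, one can evaluate the expectation directly as the sum over $k$ of $\Pr(X=k)\,\Pr(X \le k)$. This is a rough sketch, again assuming the lower-tail p-value from the question and the toolbox functions `binopdf` and `binocdf`; the list of sample sizes is an arbitrary choice for illustration.

```matlab
ns     = [10 50 200 1000];      % binomial sample sizes to compare (illustrative)
mean_p = zeros(size(ns));

for i = 1:numel(ns)
    k = 0:ns(i);
    % E[p-value] = sum over k of Pr(X = k) * Pr(X <= k) under a fair coin
    mean_p(i) = sum(binopdf(k, ns(i), 0.5) .* binocdf(k, ns(i), 0.5));
end

for i = 1:numel(ns)
    fprintf('n = %4d   E[p-value] = %.4f\n', ns(i), mean_p(i));
end
```

For $n = 200$ this comes out at about 0.52, in line with the 0.518 seen in the question's simulation.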
One implication of the discreteness of the p-values is that only certain significance levels are achievable — the ones corresponding to the step-heights in the actual population cdf of p-values under the null. So for example you can have an $\alpha$ near 0.056 or one near 0.04, but not anything closer to 0.05.
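The achievable levels can be listed directly. This is a sketch under an assumption of mine: a two-sided test whose p-value is twice the smaller tail probability, capped at 1. The question's code uses a lower-tail probability instead, which gives a different set of levels. It relies on `binocdf` from the Statistics Toolbox.

```matlab
n  = 200;
p0 = 0.5;
k  = 0:n;

lower = binocdf(k, n, p0);           % Pr(X <= k)
upper = binocdf(n - k, n, 1 - p0);   % Pr(X >= k), since n - X ~ Bin(n, 1 - p0)

% Assumed two-sided p-value: twice the smaller tail, capped at 1.
p_two = min(1, 2 * min(lower, upper));

% For this symmetric null, the attainable p-values are also the only
% achievable significance levels.
levels = unique(p_two);              % sorted ascending

below = max(levels(levels <= 0.05)); % closest achievable level not above 0.05
above = min(levels(levels >  0.05)); % closest achievable level above 0.05
fprintf('achievable levels around 0.05: %.4f and %.4f\n', below, above);
```

Under that assumption the two levels either side of 0.05 should come out close to the 0.04 and 0.056 mentioned above.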