I understand the procedure and what it controls. So what’s the formula for the adjusted p-value in the BH procedure for multiple comparisons?
Just now I realized the original BH didn’t produce adjusted p-values, only adjusted the (non) rejection condition: https://www.jstor.org/stable/2346101. Gordon Smyth introduced adjusted BH p-values in 2002 anyways, so the question still applies. It’s implemented in R as
The famous seminal Benjamini & Hochberg (1995) paper described the procedure for accepting/rejecting hypotheses based on adjusting the alpha levels. This procedure has a straightforward equivalent reformulation in terms of adjusted p-values, but it was not discussed in the original paper. According to Gordon Smyth, he introduced adjusted p-values in 2002 when implementing
p.adjust in R. Unfortunately, there is no corresponding citation, so it has always been unclear to me what one should cite if one uses BH-adjusted p-values.
Turns out, the procedure is described in the Benjamini, Heller, Yekutieli (2009):
An alternative way of presenting the results of this procedure is by presenting the adjusted p-values. The BH-adjusted p-values are defined as pBH(i)=min
This formula looks more complicated than it really is. It says:
- First, order all p-values from small to large. Then multiply each p-value by the total number of tests m and divide by its rank order.
- Second, make sure that the resulting sequence is non-decreasing: if it ever starts decreasing, make the preceding p-value equal to the subsequent (repeatedly, until the whole sequence becomes non-decreasing).
- If any p-value ends up larger than 1, make it equal to 1.
This is a straightforward reformulation of the original BH procedure from 1995. There might exist an earlier paper that explicitly introduced the concept of BH-adjusted p-values, but I am not aware of any.
Update. @Zenit found that Yekutieli & Benjamini (1999) described the same thing already back in 1999: