# Is the Wilcoxon rank-sum test a nonparametric alternative to the two sample t-test? Null hypotheses are different

https://www.stat.auckland.ac.nz/~wild/ChanceEnc/Ch10.wilcoxon.pdf
says
“The Wilcoxon rank-sum test is a nonparametric alternative to the two sample t-test which is based solely on the order in which the observations from the two samples fall.”

However, the null hypothesis for the two-sample t-test is $H_0: \mu_X = \mu_Y$, and
that for the Wilcoxon rank-sum test is $H_0: P(X > Y) + \tfrac{1}{2} P(X = Y) = 0.5$.
That means the two null hypotheses are different.
Why can the Wilcoxon rank sum test be an alternative to the two-sample t-test?

https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1305291

Is the Wilcoxon rank-sum test a nonparametric alternative to the two sample t-test?

Yes and no. (Go not to the elves for counsel…)

Speaking broadly, any given test statistic has some power curve in relation to a given sequence of alternatives under some set of assumptions (sufficiently specified to have a unique value for power under any element in the sequence). A test with reasonable power against some set of alternatives will also tend to have power against other alternatives that are in some sense similar (e.g. if I make one set of values typically larger than another, I will also typically make the difference in means and the difference in 75th percentiles larger while I am doing it).

The questions we should tend to focus on are along the lines of “what alternatives do I want to test against, what else am I prepared to assume, and what are the properties of various possible test statistics in those cases?”

Unfortunately many people have a tendency to adapt their hypothesis to a chosen test statistic rather than the other way around.

You’re right that the most general forms of the hypotheses are different.
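To make that concrete, here is a minimal sketch with made-up toy samples (the data and the helper name `prob_superiority` are mine, purely for illustration). The two samples have identical means, so the quantity in the t-test null is zero, yet the sample version of $P(X > Y) + \tfrac{1}{2} P(X = Y)$ is far from 0.5.

```python
from itertools import product


def prob_superiority(x, y):
    """Sample estimate of P(X > Y) + 0.5 * P(X = Y) over all (x_i, y_j) pairs.

    This equals the Mann-Whitney U statistic for x divided by len(x) * len(y).
    """
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0
                for a, b in product(x, y))
    return score / (len(x) * len(y))


def mean(values):
    return sum(values) / len(values)


# Hypothetical toy samples: same mean (1.0), very different ordering.
x = [0.0, 0.0, 0.0, 4.0]
y = [1.0, 1.0, 1.0, 1.0]

print(mean(x) - mean(y))       # 0.0  -> no difference in means
print(prob_superiority(x, y))  # 0.25 -> well away from 0.5
```

Only 4 of the 16 pairs have the `x` value above the `y` value, which is where the 0.25 comes from.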

However, if you add some additional assumptions to the rank sum test, you could regard it that way; for example, if you assume that the alternative is a location shift (that the distribution is the same even under the alternative, apart from being shifted up or down), then you could see it as a kind of nonparametric equivalent.

In effect, if the only issue with the ordinary two-sample t-test is that the populations are not necessarily normally distributed, but everything else is as supposed, then you might treat the Wilcoxon rank sum test as an alternative version of a t-test that doesn’t assume normality. For example, the population location shift will correspond to a difference in population means if the population distribution has a finite mean.
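A short sketch of why, under the assumed location-shift model (the symbols $\delta$, $X'$ and $D$ are my notation, not from the question). Write the shift model as

$$
Y \overset{d}{=} X' + \delta, \qquad X' \text{ an independent copy of } X .
$$

If the mean is finite, then

$$
\mu_Y - \mu_X = \delta ,
$$

and, with $D = X - X'$,

$$
P(X > Y) + \tfrac{1}{2} P(X = Y) = P(D > \delta) + \tfrac{1}{2} P(D = \delta) .
$$

Since $D$ is symmetric about 0, the right-hand side is exactly $\tfrac{1}{2}$ when $\delta = 0$, and it falls below $\tfrac{1}{2}$ for $\delta > 0$ (above for $\delta < 0$). So under this model both null hypotheses reduce to $\delta = 0$.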

It’s sensitive to that kind of alternative, so it will be a good test for that situation (even if the population distributions are exactly normal).

However it’s also sensitive to other alternatives (i.e. if your location-shift-alternative assumption is wrong, you may be rejecting because of something other than a location shift).

On the other hand, the t-test itself is also sensitive to differences other than a pure shift in the mean, so one might say much the same thing about it; if you assume a pure location shift but in actual fact it’s (say) a scale shift, it may still reject and you might tend to misinterpret the outcome.

It’s always important to think carefully about what sort of alternatives you want to test for, exactly, and what you’re prepared to assume about the populations under those alternatives.

There’s a much more straightforward nonparametric equivalent, though, which is a permutation test based on a t-statistic*.

If the assumptions of the t-test hold, this test will typically work very similarly – especially in large samples.

If the assumptions of the t-test don’t hold, but the assumptions of the permutation test do, then it will have the advertized significance level (though it may be less powerful than the rank sum test on shift alternatives if the populations are heavy-tailed).

*(if you prefer, you could do a permutation test based on a difference in sample means).
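A minimal sketch of such a permutation test, using only the Python standard library. The function names, the toy data, the choice of the pooled-variance t-statistic, and the use of random rather than exhaustive permutations are all my assumptions for illustration, not a reference implementation.

```python
import math
import random
from statistics import mean, variance


def pooled_t(x, y):
    """Equal-variance two-sample t-statistic (assumes pooled variance > 0)."""
    m, n = len(x), len(y)
    pooled_var = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / m + 1 / n))


def mean_diff(x, y):
    """Difference in sample means (the alternative statistic in the footnote)."""
    return mean(x) - mean(y)


def permutation_test(x, y, stat=pooled_t, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for stat(x, y)."""
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    pooled = list(x) + list(y)
    m = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        if abs(stat(pooled[:m], pooled[m:])) >= observed:
            hits += 1
    # Count the observed labelling as well, so the p-value is never exactly 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data, for illustration only.
x = [12.1, 9.8, 11.4, 10.9, 13.2, 12.7, 10.3, 11.8]
y = [9.1, 8.7, 10.2, 9.9, 8.4, 10.8, 9.5, 9.0]

print(permutation_test(x, y, stat=pooled_t))
print(permutation_test(x, y, stat=mean_diff))
```

For a fixed pooled sample, the absolute pooled t-statistic is an increasing function of the absolute difference in means, so with the same seed the two calls should give essentially the same p-value; that is why the choice between them is a matter of preference.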