Permutation tests (also called randomization tests, re-randomization tests, or exact tests) are very useful when the normality assumption required by, for instance, the t-test is not met, and when the rank transformation used by a non-parametric test such as the Mann-Whitney-U-test would discard too much information. However, one assumption must not be overlooked when using this kind of test: the exchangeability of the samples under the null hypothesis. It is also worth noting that this approach can be applied when there are more than two samples, like what is implemented in
Can you please use some figurative language or conceptual intuition in plain English to illustrate this assumption? This would be very useful to clarify this overlooked issue among non-statisticians like me.
It would also be very helpful to mention a case where applying a permutation test is invalid because this same assumption does not hold.
Suppose that I have 50 subjects recruited at random from the local clinic in my district. They were randomly assigned to receive a drug or a placebo at a 1:1 ratio. They were all measured for parameter 1, Par1, at V1 (baseline), V2 (3 months later), and V3 (1 year later). All 50 subjects can be split into 2 subgroups based on feature A: A positive = 20 and A negative = 30. They can also be split into another 2 subgroups based on feature B: B positive = 15 and B negative = 35.
Now, I have values of Par1 from all subjects at all visits. Under the assumption of exchangeability, can I compare levels of Par1 using a permutation test if I would:
– Compare subjects who received the drug with those who received the placebo at V2?
– Compare subjects with feature A with those having feature B at V2?
– Compare subjects having feature A at V2 with those having feature A at V3?
– In which of these situations would the comparison be invalid and violate the assumption of exchangeability?
First, the non-figurative description: Exchangeability means that the joint distribution is invariant to permutations of the values assigned to the variables in the joint distribution (e.g., $f_{XYZ}(x=1, y=3, z=2) = f_{XYZ}(x=3, y=2, z=1)$, etc.). If this is not the case, then counting permutations is not a valid way of testing the null hypothesis, because each permutation will have a different weight (probability/density). Permutation tests depend on every assignment of a given set of numerical values to your variables having the same density/probability.
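To make "counting permutations" concrete, here is a minimal sketch of a two-sample Monte Carlo permutation test for a difference in means. The function name `permutation_test`, its parameters, and the example values are illustrative assumptions, not something taken from the question. The sketch shows where the equal-probability requirement enters: every shuffle of the pooled values is counted with the same weight.

```python
import random


def permutation_test(x, y, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Assumes the pooled values are exchangeable under the null hypothesis,
    so every relabelling of the pooled data is treated as equally likely.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def abs_mean_diff(values):
        first, second = values[:n_x], values[n_x:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = abs_mean_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # one random relabelling of the pooled values
        # Small tolerance guards against floating-point ties.
        if abs_mean_diff(pooled) >= observed - 1e-12:
            hits += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return (hits + 1) / (n_perm + 1)


# Hypothetical values, made up purely for illustration.
drug = [5.1, 4.8, 6.0, 5.5, 5.9]
placebo = [4.2, 4.9, 4.4, 5.0, 4.1]
p_value = permutation_test(drug, placebo, n_perm=2000)
```

If the shuffled labellings were not equally likely under the null hypothesis, the simple count `hits` would no longer estimate a valid p-value.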
A concrete example where exchangeability is absent: You have N jars, each filled with 100 numbered tickets. The first M jars have tickets with only the odd numbers from 1 to 200 (1 ticket per number); the remaining N-M jars have tickets with only the even numbers from 1 to 200. If you select a ticket from each jar at random, you get a joint distribution on sample results. In this case,
$$f(X_1=1, X_2=2, X_3=3, \dots, X_N=N) \neq f(X_1=N, X_2=N-1, X_3=N-2, \dots, X_N=1)$$
so you cannot just count permutations of the values 1 through N. In general, exchangeability fails when your sample can be stratified into sub-groups (as I have done with the jars). Exchangeability would be restored if, instead of taking 1 sample from each of N jars, you took N samples from 1 jar. Then, the joint distribution would be invariant to permutations.
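The jar example can be checked by direct calculation in the smallest case. The sketch below assumes just two jars (one odd, one even) and, for the single-jar case, a hypothetical jar holding every number from 1 to 200 with draws made with replacement. These simplifications and the helper name `joint_prob` are mine, chosen only to keep the arithmetic exact and short.

```python
from fractions import Fraction

odd_jar = list(range(1, 200, 2))   # 100 tickets: 1, 3, ..., 199
even_jar = list(range(2, 201, 2))  # 100 tickets: 2, 4, ..., 200
full_jar = list(range(1, 201))     # hypothetical single jar: 1, 2, ..., 200


def joint_prob(jars, values):
    """P(X1=v1, ..., Xn=vn) when ticket i is drawn uniformly from jars[i].

    Draws are assumed independent (i.e. with replacement when the same
    jar is used more than once).
    """
    p = Fraction(1)
    for jar, value in zip(jars, values):
        p *= Fraction(jar.count(value), len(jar))
    return p


# One draw from each of two different jars: not exchangeable.
p_12 = joint_prob([odd_jar, even_jar], (1, 2))  # 1/100 * 1/100
p_21 = joint_prob([odd_jar, even_jar], (2, 1))  # impossible, so 0

# Two draws from the same jar: swapping the values changes nothing.
q_12 = joint_prob([full_jar, full_jar], (1, 2))
q_21 = joint_prob([full_jar, full_jar], (2, 1))
```

With different jars the two orderings have different probabilities, so a permutation count would weight an impossible arrangement the same as a possible one. With a single jar the two probabilities coincide.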