How do you verify test assumptions in real-world cases without testing them?

We know that, formally, the assumptions of a test cannot themselves be tested: if we choose which test to use based on the result of an assumption check, the resulting composite procedure has unknown properties (Type I and II error rates). I think this is one of the reasons why “Six Sigma”-style approaches to statistics (use a decision tree based on test results to choose which test to apply) get a bad rap among professional statisticians.

However, with real-world data we often get samples for which the classical assumptions may not hold, so we need to check them one way or another. So what do you actually do in your job or research? Perform an informal check, for example looking at the distribution of the data and using a t-test when the empirical distribution doesn’t seem too skewed? This is what I see done most of the time. However, as long as we make a decision based on the result of this “informal test”, we still affect the test’s properties; and if we don’t use the check to make a decision, then the check is useless and we shouldn’t waste precious time doing it. Of course, you could answer that formal test properties are overrated, and that in practice we need not be religious about them. That’s why I’m interested in what you do in practice, not just in the theoretical background.
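To make the composite-procedure concern concrete, here is a minimal simulation sketch (sample sizes, the skewed null distribution, and thresholds are all assumed, purely for illustration) of the “pretest, then choose” pattern, estimating the overall Type I error rate of the combined procedure:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def composite_test(x, y, alpha=0.05):
    # Pretest: Shapiro-Wilk on each sample. If both look normal, run a t-test;
    # otherwise fall back to Mann-Whitney. This is the "decision tree" pattern.
    if stats.shapiro(x).pvalue > alpha and stats.shapiro(y).pvalue > alpha:
        return stats.ttest_ind(x, y).pvalue
    return stats.mannwhitneyu(x, y, alternative="two-sided").pvalue

# The null is true: both samples come from the same (skewed) distribution,
# so any rejection is a Type I error of the *composite* procedure.
n_sims, n = 2000, 30
rejections = sum(
    composite_test(rng.exponential(size=n), rng.exponential(size=n)) < 0.05
    for _ in range(n_sims)
)
print(f"empirical Type I error of the composite procedure: {rejections / n_sims:.3f}")
```

The point is not the particular number that comes out, but that the composite procedure’s error rate must be estimated by simulation in the first place, because it is not the nominal 5% of either component test by construction.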

Another approach would be to always use the test with the fewest assumptions. Usually I’ve seen this framed as preferring nonparametric tests over parametric tests, since the former don’t assume that the data come from a family of distributions indexed by a vector of parameters, and so should be more robust (fewer assumptions). Is this true in general? With this approach, don’t we risk using underpowered tests in some cases? I’m not sure. Is there a useful (possibly simple) reference for applied statistics that lists tests/models to use as better alternatives to the classical tests (t-test, chi-square, etc.), and when to use them?
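The power question above can be probed by simulation. Under normality, the Mann-Whitney test’s asymptotic relative efficiency versus the t-test is about 0.955, so the power loss is modest there; the sketch below (sample size, effect size, and simulation count are assumed, illustrative values) compares the two tests on normal data with a location shift:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n, shift = 2000, 25, 0.5  # assumed, illustrative values

t_rej = mw_rej = 0
for _ in range(n_sims):
    x = rng.normal(size=n)
    y = rng.normal(loc=shift, size=n)  # true shift: the alternative holds
    t_rej += stats.ttest_ind(x, y).pvalue < 0.05
    mw_rej += stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < 0.05

print(f"t-test power: {t_rej / n_sims:.3f}, "
      f"Mann-Whitney power: {mw_rej / n_sims:.3f}")
```

Swapping the normal draws for a heavy-tailed distribution (e.g. `rng.standard_t(df=3, size=n)`) in the same script reverses the comparison, which is exactly the trade-off the question is asking about.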


What I have seen done most often (and would tend to do myself) is to look at several sets of historical data from the same area, for the same variables, and use that as a basis for deciding what is appropriate. When doing so, one should of course keep in mind that mild deviations from, e.g., normality in the regression residuals are generally not too much of an issue given sufficiently large sample sizes in the planned application.

By looking at independent data, one avoids the issue of messing up test properties like Type I error control (which is very important in some areas, such as confirmatory clinical trials for regulatory purposes). The reason for using parametric approaches (when appropriate) is, as you say, efficiency, as well as the ability to easily adjust for predictive covariates, such as a pre-experiment assessment of your main variable, and to get effect size estimates that are easier to interpret than, say, the Hodges-Lehmann estimate.
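The historical-data idea can be sketched as follows (the “historical” dataset here is simulated and purely hypothetical, as are the model and sample size): fit the planned model on independent past data and inspect the residuals there, instead of pretesting on the data of the new experiment itself.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical historical data from the same area: outcome vs. a baseline covariate.
x_hist = rng.normal(size=200)
y_hist = 2.0 + 1.5 * x_hist + rng.normal(scale=0.8, size=200)

# Fit the planned (linear) model on the historical data only.
slope, intercept = np.polyfit(x_hist, y_hist, 1)
resid = y_hist - (intercept + slope * x_hist)

# Informal residual diagnostics on the *independent* data; since the decision
# is not based on the new experiment's data, its test properties are untouched.
print(f"residual skewness: {stats.skew(resid):.3f}")
print(f"Shapiro-Wilk p-value on residuals: {stats.shapiro(resid).pvalue:.3f}")
```

In practice one would also look at a Q-Q plot of `resid`; the key design choice is that whatever is learned here informs the analysis plan for the future experiment before its data are collected.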

Source: Link, Question Author: DeltaIV, Answer Author: Björn
