How to choose significance level for a large data set?

I am working with a data set having N around 200,000. In regressions, I am seeing very small significance values << 0.001 associated with very small effect sizes, e.g. r=0.028. What I’d like to know is, is there a principled way of deciding an appropriate significance threshold in relation to the sample size? Are there any other important considerations about interpreting effect size with such a large sample?


In "The insignificance of significance testing", Johnson (1999) noted that p-values are arbitrary, in that you can make them as small as you wish by gathering enough data, assuming the null hypothesis is false, which it almost always is. In the real world, semi-partial correlations are unlikely to be exactly zero, which is the null hypothesis when testing the significance of a regression coefficient. Significance cutoffs for p-values are even more arbitrary: the value of .05 as the boundary between significance and nonsignificance is used by convention, not on principle. So the answer to your first question is no, there is no principled way to decide on an appropriate significance threshold.
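To make the sample-size point concrete, here is a minimal Python sketch that holds the effect size fixed at the r = 0.028 from the question and varies n. It assumes a simple test of a single correlation against zero and uses a normal approximation to the t statistic, so the values are rough for small n; the function name `approx_p_value` is illustrative only.

```python
import math

def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: correlation = 0.

    Assumption: the t statistic is treated as standard normal,
    which is reasonable for large n and only rough for small n.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    # P(|Z| > |t|) for a standard normal Z
    return math.erfc(abs(t) / math.sqrt(2.0))

r = 0.028  # effect size quoted in the question
for n in (100, 1_000, 10_000, 200_000):
    print(f"n = {n:>7,}: p ~ {approx_p_value(r, n):.3g}")
```

The same tiny correlation is nowhere near significant at n = 100 and overwhelmingly significant at n = 200,000, which is the sense in which the p-value mostly reflects how much data you gathered.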

So what can you do, given your large data set? It depends on your reason(s) for exploring the statistical significance of your regression coefficients. Are you trying to model a complex multi-factorial system and develop a useful theory that reasonably fits or predicts reality? Then you could think about developing a more elaborate model and taking a modeling perspective on it, as described in Rodgers (2010), "The Epistemology of Mathematical and Statistical Modeling". One advantage of having a lot of data is being able to explore very rich models, ones with multiple levels and interesting interactions (assuming you have the variables to do so).

If, on the other hand, you want to make some judgement as to whether to treat a particular coefficient as statistically significant or not, you might want to take Good’s (1982) suggestion as summarized in Woolley (2003): calculate the q-value as $p\cdot\sqrt{n/100}$, which standardizes p-values to a sample size of 100. With your n of about 200,000, a p-value of exactly .001 converts to a standardized p-value of about .045, which is still statistically significant at the conventional .05 cutoff.
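The standardization is easy to check numerically. The sketch below implements only the formula as stated here, p times the square root of n/100; the function name `standardized_p` and the `reference_n` parameter are illustrative and not taken from Good or Woolley.

```python
import math

def standardized_p(p: float, n: int, reference_n: int = 100) -> float:
    """Rescale a p-value to a reference sample size: p * sqrt(n / reference_n)."""
    return p * math.sqrt(n / reference_n)

q = standardized_p(0.001, 200_000)
print(round(q, 3))  # 0.045
```

Because the multiplier is sqrt(200000/100), roughly 44.7, any raw p-value below about .0011 stays under .05 after standardization at this sample size.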

So if it’s significant using some arbitrary threshold or another, what of it? If this is an observational study, you have a lot more work to do to justify that the effect is actually meaningful in the way you think, and not just a spurious relationship that shows up because you have misspecified your model. Note that a small effect is not so clinically interesting if it reflects pre-existing differences between people who select into different levels of treatment rather than a treatment effect.

You do need to consider whether the relationship you’re seeing is practically significant, as commenters have noted. Converting the figure you quote from r to r^2 for variance explained (r is the correlation; squaring it gives the proportion of variance explained) gives r^2 ≈ 0.0008 for r = 0.028, i.e. less than 0.1% of variance explained, which is very little.
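As a quick check of the r to r^2 conversion for the r = 0.028 quoted in the question, here is a minimal Python sketch (the function name `variance_explained` is illustrative):

```python
def variance_explained(r: float) -> float:
    """Proportion of variance explained by a correlation r, i.e. r squared."""
    return r * r

r = 0.028  # effect size quoted in the question
r2 = variance_explained(r)
print(f"r = {r}: r^2 = {r2:.6f} ({r2:.4%} of variance)")
```

For this r, the variance explained is about 0.08%, so the question of practical significance matters far more than the p-value.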

Source: Link, Question Author: ted.strauss, Answer Author: rbrisk
