Normality of residuals vs sample data; what about t-tests?

To add to the common confusion about normality assumptions in inferential statistics for general linear models:

I understand the assumption of normality refers to the residuals in ANOVA and regression models, but what about t-tests?

Calculation of the t-statistic uses the standard deviation as a measure of dispersion, rather than the sums of squares used for the F-statistic in ANOVA and regression. If I understand correctly, it is because of this that the samples are assumed to be normally distributed (i.e. the sample standard deviation provides an accurate estimate of the population standard deviation). Is this right?
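
For reference (a standard formula, not part of the original question), the pooled two-sample t-statistic is

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}, \qquad s_p^2 = \frac{(n_1-1)\,s_1^2 + (n_2-1)\,s_2^2}{n_1+n_2-2}, $$

where $s_p^2$ is just the within-group sum of squares divided by its degrees of freedom, i.e. the same dispersion estimate that appears in the denominator of the one-way ANOVA $F$-statistic (for two groups, $t^2 = F$). So the t-test and the F-test are not really using different measures of dispersion.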

What do you do if the statistical population of the dependent variable is not normally distributed (so that samples are unlikely to be normally distributed either)? Is this where transformations fit in?

Answer

It’s the same for t-tests. There’s no need for the data set as a whole to be normally distributed (& if there is a difference in means between the groups it won’t be); only the residuals need to be. Of course, for a t-test, saying the residuals are normally distributed is the same as saying each group is normally distributed.
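
To make that concrete, here is a minimal sketch (in Python with NumPy/SciPy; the simulation setup is mine, not from the original answer): two normally distributed groups with different means, so the pooled sample is bimodal and fails a normality check, while the group-centred residuals pass and the t-test behaves as expected.

```python
# Sketch (not from the original answer): two normal groups with different
# means; the pooled data look non-normal, but the residuals do not.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1 = rng.normal(loc=0.0, scale=1.0, size=50)   # group 1: N(0, 1)
g2 = rng.normal(loc=4.0, scale=1.0, size=50)   # group 2: N(4, 1)

pooled = np.concatenate([g1, g2])
residuals = np.concatenate([g1 - g1.mean(), g2 - g2.mean()])

# Shapiro-Wilk on the pooled data typically rejects normality (bimodal),
# while the group-centred residuals pass.
print("pooled    p =", stats.shapiro(pooled).pvalue)
print("residuals p =", stats.shapiro(residuals).pvalue)

# The t-test itself only needs each group (equivalently, the residuals)
# to be roughly normal.
print("t-test    p =", stats.ttest_ind(g1, g2).pvalue)
```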

The distribution of the independent variables (in the case of t-tests it’s the group label) is always irrelevant in ANOVA, regression, & the like: you’re interested in the distribution of the response variable conditional on given values of the independent variables.
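
As a further sketch under the same assumed setup (again not from the original exchange), the group label can be as unbalanced as you like; the check that matters is still the distribution of the response within each group, i.e. the residuals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Heavily unbalanced group label: its own distribution is irrelevant.
labels = rng.choice(["A", "B"], size=300, p=[0.9, 0.1])
y = np.where(labels == "A",
             rng.normal(10.0, 2.0, size=300),   # response given group A
             rng.normal(13.0, 2.0, size=300))   # response given group B

# Residuals = response minus its own group mean (the conditional check)
resid = np.concatenate([y[labels == g] - y[labels == g].mean() for g in ("A", "B")])
print("residual normality p =", stats.shapiro(resid).pvalue)

# Welch t-test (equal_var=False) as a robust default for the comparison
a, b = y[labels == "A"], y[labels == "B"]
print("Welch t-test p       =", stats.ttest_ind(a, b, equal_var=False).pvalue)
```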

Attribution
Source: Link, Question Author: DeanP, Answer Author: Scortchi – Reinstate Monica