# My distribution is normal; Kolmogorov-Smirnov test doesn’t agree

I have a problem with the normality of some data I have:
I’ve done a Kolmogorov-Smirnov test, which says it isn’t normal with p=.0000.
I don’t understand: the skewness of my distribution is -.497, and the kurtosis is -0.024.

Here is the plot of my distribution which looks very much normal …

(I have three scores, and each one of these scores isn’t normal, with a significant p-value for the Kolmogorov-Smirnov test … I really don’t understand.)

1. You have no basis to assert your data are normal. Even if your skewness and excess kurtosis were both exactly 0, that wouldn’t imply your data are normal. While skewness and kurtosis far from the expected values indicate non-normality, the converse doesn’t hold. There are non-normal distributions that have the same skewness and kurtosis as the normal. An example is discussed here, the density of which is reproduced below: As you see, it’s distinctly bimodal. In this case, the distribution is symmetric, so as long as sufficient moments exist, the typical skewness measure will be 0 (indeed all the usual measures will be). For the kurtosis, the region close to the mean contributes little to the 4th moment, which tends to make the kurtosis smaller, but the tail is relatively heavy, which tends to make it larger. If you choose the distribution just right, the kurtosis comes out with the same value as for the normal.

2. Your sample skewness is actually around -0.5, which is suggestive of mild left-skewness. Your histogram and Q-Q plot both indicate the same – a mildly left-skew distribution. (Such mild skewness is unlikely to be a problem for most of the common normal-theory procedures.)

3. You’re looking at several different indicators of non-normality which you shouldn’t expect to agree a priori, since they consider different aspects of the distribution; with smallish mildly non-normal samples, they’ll frequently disagree.
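To make point 1 concrete, here is a minimal sketch using a much simpler distribution than the bimodal one mentioned above (this three-point distribution is my own illustrative choice, not the example from the linked discussion). It puts probability 1/6, 2/3, 1/6 on the values -1, 0, 1. It is obviously not normal, yet its skewness and excess kurtosis are both exactly zero:

```python
from fractions import Fraction

# Hypothetical three-point distribution, chosen only for illustration:
# X takes the values -1, 0, 1 with probabilities 1/6, 2/3, 1/6.
values = [-1, 0, 1]
probs = [Fraction(1, 6), Fraction(2, 3), Fraction(1, 6)]


def central_moment(k):
    """Exact k-th central moment of the discrete distribution above."""
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * (v - mean) ** k for v, p in zip(values, probs))


variance = central_moment(2)  # 1/3
skewness = float(central_moment(3)) / float(variance) ** 1.5
excess_kurtosis = central_moment(4) / variance ** 2 - 3

print("skewness:", skewness)
print("excess kurtosis:", excess_kurtosis)
```

Both printed values are zero, so matching the normal on these two summaries says very little about the shape of the distribution as a whole.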

Now for the big question: *Why are you testing for normality?*

I’m not really sure, I thought I should before doing an ANOVA

There are a number of points to be made here.

i. Normality is an assumption of ANOVA if you’re using it for inference (such as hypothesis testing), but ANOVA is not especially sensitive to non-normality in larger samples. Mild non-normality is of little consequence, and as sample sizes increase the distribution can be further from normal while the test is still only a little affected.

ii. You appear to be testing normality of the response (the DV). The (unconditional) distribution of the DV itself is not assumed to be normal in ANOVA. You check the residuals to assess the reasonableness of the assumption about the conditional distribution (that is, it’s the error term in the model that’s assumed normal) – i.e. you don’t seem to be looking at the right thing. Indeed, because the check is done on residuals, you do it after model fitting, rather than before.

iii. Formal testing can be next to useless. The question of interest here is ‘how badly is the degree of non-normality affecting my inference?’, which is not a question the hypothesis test answers. As the sample size gets larger, the test becomes more and more able to detect trivial differences from normality, while the effect on the significance level in the ANOVA becomes smaller and smaller. That is, if your sample size is reasonably large, the test of normality is mostly telling you that you have a large sample size, which means you may not have much to worry about. At least with a Q-Q plot you have a visual assessment of how non-normal it is.

iv. At reasonable sample sizes, other assumptions – like equality of variance and independence – generally matter much more than mild non-normality. Worry about the other assumptions first … but again, formal testing isn’t answering the right question.

v. Choosing whether you do an ANOVA or some other test based on the outcome of a hypothesis test tends to have worse properties than simply deciding to act as if the assumption doesn’t hold. (There are a variety of methods suitable for one-way ANOVA-like analyses on data that aren’t assumed to be normal, and you can use them whenever you don’t think you have reason to assume normality. Some have very good power at the normal, and with decent software there’s no reason to avoid them.)
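As a rough illustration of point ii, here is a small simulated sketch (the group labels, means, sample sizes and seed are all made up for the example). The errors are exactly normal, so the ANOVA assumption holds, yet the pooled response is far from normal because the group means differ. The residuals, computed after fitting the group means, are what should look normal. I use a moment-based excess kurtosis only as a crude numeric summary; in practice a Q-Q plot of the residuals is more informative.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the sketch is reproducible

# Hypothetical one-way layout: three groups with well-separated means,
# each with normal errors (sd = 1). All numbers here are illustrative.
group_means = {"a": 0.0, "b": 5.0, "c": 10.0}
data = {g: [random.gauss(mu, 1.0) for _ in range(200)]
        for g, mu in group_means.items()}


def excess_kurtosis(xs):
    """Moment-based sample excess kurtosis (about 0 for normal data)."""
    m = statistics.fmean(xs)
    m2 = statistics.fmean([(x - m) ** 2 for x in xs])
    m4 = statistics.fmean([(x - m) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


# The raw response pooled over groups (the unconditional distribution of the DV).
pooled = [y for ys in data.values() for y in ys]

# Residuals from the one-way fit: each observation minus its own group mean.
residuals = []
for ys in data.values():
    fitted = statistics.fmean(ys)
    residuals.extend(y - fitted for y in ys)

dv_kurt = excess_kurtosis(pooled)
resid_kurt = excess_kurtosis(residuals)
print(f"pooled DV excess kurtosis: {dv_kurt:.2f}")
print(f"residual excess kurtosis:  {resid_kurt:.2f}")
```

The pooled response comes out strongly platykurtic (it is trimodal), while the residuals sit near zero, which is why checking the raw DV before fitting can mislead you in either direction.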

[I believe I had a reference for this last point but I can’t locate it right now; if I find it I’ll try to come back and put it in]
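Returning to point iii, the way a normality test ends up measuring sample size can be sketched with some simple arithmetic. This assumes the large-sample 5% critical value of the one-sample Kolmogorov-Smirnov statistic for a fully specified null, roughly 1.358/√n; with estimated mean and variance (the Lilliefors situation) the critical values are smaller still. The "true gap" of 0.03 between the actual CDF and the normal CDF is a made-up number, not something computed from the data in the question.

```python
import math

# Large-sample 5% critical value of the one-sample KS statistic for a fully
# specified null distribution: approximately 1.358 / sqrt(n).
KS_COEF_5PCT = 1.358

# Hypothetical fixed discrepancy: the largest vertical gap between the true
# CDF and the normal CDF. Illustrative value only.
true_gap = 0.03


def ks_critical_value(n):
    """Approximate 5% critical value of the KS statistic at sample size n."""
    return KS_COEF_5PCT / math.sqrt(n)


for n in (50, 500, 5000, 50000):
    crit = ks_critical_value(n)
    side = "above" if true_gap > crit else "below"
    print(f"n={n:>6}  critical D={crit:.4f}  gap {true_gap} is {side} it")

# Smallest n at which a gap this small exceeds the critical value.
n_threshold = math.ceil((KS_COEF_5PCT / true_gap) ** 2)
print("gap exceeds the critical value from about n =", n_threshold)
```

The sample statistic fluctuates around the true gap, so this is only a rough guide, but it shows the pattern: the same small departure from normality goes from undetectable to near-certain rejection purely because n grew.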