# Are over-dispersion tests in GLMs actually *useful*?

The phenomenon of ‘over-dispersion’ in a GLM arises whenever we use a model that restricts the variance of the response variable, and the data exhibit greater variance than the model restriction allows. This occurs commonly when modelling count data using a Poisson GLM, and it can be diagnosed by well-known tests. If the tests show statistically significant evidence of over-dispersion, then we usually generalise the model by using a broader family of distributions that frees the variance parameter from the restriction imposed by the original model. In the case of a Poisson GLM it is common to generalise to either a negative-binomial or a quasi-Poisson GLM.
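As a rough illustration of what such a diagnostic measures, here is a minimal sketch that computes the Pearson dispersion statistic (Pearson chi-square divided by the residual degrees of freedom) for an intercept-only Poisson fit. The helper names `poisson_draw` and `pearson_dispersion`, the sample size, and the values `mean = 4.0` and `shape = 2.0` are all assumptions made for the example, not anything taken from a particular package. Values near 1 are consistent with the Poisson restriction, while values well above 1 suggest over-dispersion.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def pearson_dispersion(counts):
    """Pearson chi-square over residual df for an intercept-only Poisson fit."""
    n = len(counts)
    mu_hat = sum(counts) / n  # Poisson MLE of the mean when there are no covariates
    pearson_chi2 = sum((y - mu_hat) ** 2 / mu_hat for y in counts)
    return pearson_chi2 / (n - 1)


rng = random.Random(0)
mean, shape = 4.0, 2.0  # illustrative values only

# Counts that really are Poisson: variance equals the mean.
poisson_counts = [poisson_draw(rng, mean) for _ in range(5000)]

# Gamma-Poisson mixture (negative binomial): variance is mean + mean**2 / shape.
mixed_counts = [
    poisson_draw(rng, rng.gammavariate(shape, mean / shape)) for _ in range(5000)
]

poisson_phi = pearson_dispersion(poisson_counts)
mixed_phi = pearson_dispersion(mixed_counts)
print(f"dispersion, Poisson data: {poisson_phi:.2f}")
print(f"dispersion, mixed data:   {mixed_phi:.2f}")
```

In a real analysis the fitted means would come from the GLM with covariates and the degrees of freedom would be the residual degrees of freedom of that fit; the intercept-only case is used here only to keep the sketch self-contained.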

This situation is pregnant with an obvious objection. Why start with a Poisson GLM at all? One can start directly with the broader distributional forms, which have a (relatively) free variance parameter, and allow the variance parameter to be fit to the data, ignoring over-dispersion tests completely. In other situations when we are doing data analysis we almost always use distributional forms that allow freedom of at least the first two moments, so why make an exception here?

My Question: Is there any good reason to start with a distribution that fixes the variance (e.g., the Poisson distribution) and then perform an over-dispersion test? How does this procedure compare with skipping this exercise completely and going straight to the more general models (e.g., negative-binomial, quasi-Poisson, etc.)? In other words, why not always use a distribution with a free variance parameter?

(2.5) Proper distribution. While the negative binomial regression comes from a valid statistical distribution, it’s my understanding that the Quasi-Poisson does not. That means you can’t really simulate count data if you believe $$Var[y] = \alpha E[y]$$ for $$\alpha \neq 1$$. That might be annoying for some use cases. Likewise, you can’t use probabilities to test for outliers, etc.
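To make the contrast concrete, here is a small sketch of what having a proper distribution buys you under the negative binomial: an explicit probability mass function, and therefore tail probabilities that could be used to flag unusually large counts. It assumes the common mean/shape parameterisation with variance `mean + mean**2 / shape`; the function names `nb_log_pmf` and `nb_upper_tail` and the numeric values are hypothetical choices for illustration. No analogous calculation is available from the quasi-Poisson mean-variance relation alone.

```python
import math


def nb_log_pmf(y, mean, shape):
    """Log-pmf of the negative binomial with Var[y] = mean + mean**2 / shape."""
    return (
        math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
        + shape * math.log(shape / (shape + mean))
        + y * math.log(mean / (shape + mean))
    )


def nb_upper_tail(y_obs, mean, shape):
    """P(Y >= y_obs), computed as one minus the pmf summed below y_obs."""
    below = sum(math.exp(nb_log_pmf(y, mean, shape)) for y in range(y_obs))
    return 1.0 - below


mean, shape = 4.0, 2.0  # illustrative values, not taken from the text

# Sanity check: the pmf should sum to (essentially) one.
total_mass = sum(math.exp(nb_log_pmf(y, mean, shape)) for y in range(2000))

# Tail probability of seeing a count of 25 or more under this model.
tail_at_25 = nb_upper_tail(25, mean, shape)

print(f"total probability mass: {total_mass:.6f}")
print(f"P(Y >= 25): {tail_at_25:.6f}")
```

A very small tail probability for an observed count would mark it as surprising under the fitted negative binomial, which is the kind of outlier check the point above says is unavailable for quasi-Poisson.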