Poisson regression with large data: is it wrong to change the unit of measurement?

Due to the factorial in the Poisson probability mass function, it can become impractical to estimate Poisson models (for example, by maximum likelihood) when the observed counts are large. So, for example, if I am trying to estimate a model to explain the number of suicides in a given year (only annual data are available), and say there are thousands of suicides every year, is it wrong to express suicides in hundreds, so that 2998 becomes 29.98 ≈ 30? In other words, is it wrong to change the unit of measurement to make the data manageable?
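As an aside on the premise of the question: the factorial itself need not be computed, because the log-likelihood only involves log(k!), which `math.lgamma` evaluates directly. A minimal sketch (the λ = 3000 suicide-count scale is taken from the question; the function name is mine):

```python
import math

def poisson_logpmf(k, lam):
    # log P(K = k) = k*log(lam) - lam - log(k!)
    # math.lgamma(k + 1) returns log(k!) without ever forming the
    # astronomically large factorial itself.
    return k * math.log(lam) - lam - math.lgamma(k + 1)

# Evaluating the pmf directly would overflow a float
# (3000**2998 alone is far beyond float range), but the
# log-pmf is a perfectly ordinary number:
logp = poisson_logpmf(2998, 3000.0)
```

So maximum likelihood on the original counts is feasible as long as everything is done on the log scale.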


When you’re dealing with a Poisson distribution with large values of \lambda (its parameter), it is common to use a normal approximation to the Poisson distribution.

As this site mentions, it is generally all right to use the normal approximation once \lambda exceeds about 20, and the approximation improves as \lambda grows.
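To see how good the approximation already is at moderate \lambda, one can compare the exact Poisson CDF with the normal approximation N(\lambda, \lambda), using a continuity correction. A small sketch (the choice \lambda = 100, k = 110 is mine, just for illustration):

```python
import math

def poisson_cdf(k, lam):
    # Exact CDF: sum the pmf terms, each computed on the log scale
    # for numerical stability, then exponentiated.
    total = 0.0
    for i in range(k + 1):
        total += math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
    return total

def normal_approx_cdf(k, lam):
    # Normal approximation with continuity correction:
    # P(X <= k) ~ Phi((k + 0.5 - lam) / sqrt(lam))
    z = (k + 0.5 - lam) / math.sqrt(lam)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

exact = poisson_cdf(110, 100.0)
approx = normal_approx_cdf(110, 100.0)
```

At \lambda = 100 the two CDF values already agree to a few parts in a thousand, and the agreement only improves for the \lambda in the thousands discussed in the question.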

The Poisson distribution is defined only on the non-negative integers, so rescaling and rounding will introduce odd things into your data. In particular, a count divided by 100 can no longer have its variance equal to its mean, as a Poisson variable's must, so the rescaled data are simply not Poisson.
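This can be checked by simulation. The sketch below draws Poisson(3000) counts (built as sums of Poisson(1) draws via Knuth's algorithm, since the standard library has no Poisson sampler), divides them by 100 as the question proposes, and looks at the resulting mean and variance; all names here are mine:

```python
import math
import random

def poisson1(rng):
    # Knuth's multiplication algorithm for a Poisson(1) draw.
    L = math.exp(-1.0)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(42)
# Poisson(3000) as a sum of 3000 independent Poisson(1) draws.
draws = [sum(poisson1(rng) for _ in range(3000)) for _ in range(1000)]
scaled = [x / 100.0 for x in draws]

mean_s = sum(scaled) / len(scaled)
var_s = sum((x - mean_s) ** 2 for x in scaled) / (len(scaled) - 1)
# mean_s is near 30, but var_s is near 0.3 = 3000/100**2:
# the rescaled counts badly violate the Poisson mean-variance equality.
```

A Poisson model fit to the rescaled data would therefore wildly overstate the variability, which is exactly why rescaling is the wrong fix here.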

Using the normal approximation for large Poisson statistics is VERY common.

Source: Link, Question Author: Vivi, Answer Author: Baltimark
