How is it possible that Poisson GLM accepts non-integer numbers?

I am really stunned by the fact that the Poisson GLM accepts non-integer numbers! Look:

Data (contents of data.txt):

1   2001    0.25  1
1   2002    0.5   1
1   2003    1     1
2   2001    0.25  1
2   2002    0.5   1
2   2003    1     1

R script:

t        <- read.table("data.txt")
names(t) <- c('site', 'year', 'count', 'weight')
tm       <- glm(count ~ 0 + as.factor(site) + as.factor(year), data = t, 
                family = "quasipoisson")  # also works with family="poisson"
years    <- 2001:2003
plot(years, exp(c(0, tail(coef(tm), length(years)-1))), type = "l")

The resultant year index is as “expected”, i.e., 1-2-4 in years 2001-2003.

But how is it possible that Poisson GLM takes non-integer numbers? The Poisson distribution has always been integer-only!

Answer

Of course you are correct that the Poisson distribution is technically defined only for integers. However, statistical modeling is the art of good approximations (“all models are wrong”), and there are times when it makes sense to treat non-integer data as though it were [approximately] Poisson.

For example, if you send out two observers to record the same count data, it may happen that the two observers do not always agree on the count: one might say that something happened 3 times while the other says it happened 4 times. It is nice then to have the option to use 3.5 when fitting your Poisson coefficients, instead of having to choose between 3 and 4.
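As a small illustration of the two-observer case, here is a minimal sketch in R. The counts in `obs_a` and `obs_b` are made up for this example (they are not from the question), and averaging the two observers is just one simple way to combine them.

```r
# Hypothetical counts from two observers at the same six visits
obs_a <- c(3, 5, 2, 8, 6, 10)
obs_b <- c(4, 5, 3, 7, 6, 11)

d <- data.frame(
  year  = factor(rep(2001:2003, each = 2)),
  count = (obs_a + obs_b) / 2   # averages such as 3.5 are not integers
)

# quasipoisson fits the same mean model as poisson but does not
# warn about non-integer responses
fit <- glm(count ~ year, data = d, family = quasipoisson)

# With a single factor and a log link, the fitted means are the
# per-year averages of the (non-integer) response
cbind(d, fitted = fitted(fit))
```

The fitted values are simply the yearly means of the averaged counts, so the half-counts enter the fit without any special handling.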

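Computationally, the factorial in the Poisson could make it seem difficult to work with non-integers, but a continuous generalization of the factorial exists: the gamma function, with y! = Γ(y + 1). Moreover, performing maximum likelihood estimation for the Poisson does not even involve the factorial function. The log-likelihood is the sum of y·log(μ) − μ − log(y!), and the log(y!) term does not depend on the coefficients, so it drops out when you maximize.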
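To make this concrete, here is a minimal sketch in R using the data from the question (built inline rather than read from data.txt). It maximizes the Poisson log-likelihood with the factorial term left out and compares the result with `glm`. The helper names `negloglik_kernel` and `neg_gradient` and the use of `optim` are my own illustration; `glm` itself uses iteratively reweighted least squares, not this optimizer.

```r
# Data from the question
d <- data.frame(
  site  = factor(rep(1:2, each = 3)),
  year  = factor(rep(2001:2003, times = 2)),
  count = rep(c(0.25, 0.5, 1), times = 2)
)

# Design matrix matching count ~ 0 + site + year
X <- model.matrix(~ 0 + site + year, data = d)

# Negative Poisson log-likelihood without the log(y!) term,
# which is constant in beta
negloglik_kernel <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  -sum(y * eta - exp(eta))
}

# Its gradient: -X'(y - mu), with mu = exp(eta)
neg_gradient <- function(beta, X, y) {
  mu <- exp(drop(X %*% beta))
  -drop(crossprod(X, y - mu))
}

fit_optim <- optim(rep(0, ncol(X)), negloglik_kernel, neg_gradient,
                   X = X, y = d$count,
                   method = "BFGS", control = list(reltol = 1e-12))

fit_glm <- glm(count ~ 0 + site + year, data = d, family = quasipoisson)

# The two sets of coefficients should agree up to optimizer tolerance
round(cbind(optim = fit_optim$par, glm = unname(coef(fit_glm))), 4)

# R's factorial() is defined through the gamma function,
# so it accepts non-integers
all.equal(factorial(3.5), gamma(4.5))
```

Nothing in `negloglik_kernel` requires `y` to be an integer, which is why the fit goes through for counts like 0.25 and 0.5.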

Attribution
Source: Link, Question Author: Tomas, Answer Author: Community