Count samples do not seem to be Poisson distributed; need a sanity check

I have an exercise where I have to use Poisson one-way classification / regression on some data. The data is a set of 120 samples grouped under the labels A, B, C, D, E, and F.
For each group, there are 20 samples (or 20 repetitions), each with a count value.
That is all good, and from what I can tell count data like this is well suited to the assumption that it follows a Poisson distribution.

However, as I understand it, one of the properties of a random variable $Y$ that follows $Y \sim \mathrm{Po}(\lambda)$
is that $E(Y) = V(Y)$.
But when I calculate the mean (expected value) and the variance of the data for each group, I get:

|          |         A |         B |         C |         D |          E |         F |
|----------+-----------+-----------+-----------+-----------+------------+-----------|
| Mean     |      4.90 |      9.45 |      8.65 |      1.45 |      18.35 |      0.80 |
| Variance | 9.8842105 | 6.4710526 | 7.3973684 | 1.3131579 | 15.6078947 | 0.5894737 |

So from what I can tell $E(Y) \neq V(Y)$, and just to show how I computed this using R:

# Per-group sample mean and variance of the counts (outer parentheses print the result)
( Means <- tapply(D$NumberPGrains, D$Era, mean) )
( Variances <- tapply(D$NumberPGrains, D$Era, var) )

From my understanding, this means that the data is not Poisson distributed.
So my question is: am I wrong? Can this still be Poisson distributed, and is there something I have missed?

And just to clarify, the exercise literally states to follow Poisson one-way classification (the title of the exercise is “Question 3 - Poisson one-way classification model”), but right now I have a hard time seeing the purpose of that.

Answer

A true Poisson distribution has its mean exactly equal to its variance. A sample drawn from a Poisson distribution, however, will show some deviation: with only 20 samples per group, it is unlikely that the sample mean and sample variance would be exactly equal. For the most part, you seem to have a strong correlation between the mean and variance across groups, which is good. You could also compute confidence intervals around your parameter estimates and take a hypothesis-testing approach to determine whether your mean and variance estimates are statistically different from one another. With a very large sample size, the two estimates should be very nearly equal if the data is indeed Poisson distributed, but with smaller samples the estimates are noisier, so some numerical difference between the mean and variance is expected.
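One way to make the hypothesis-testing idea above concrete is the classical index-of-dispersion check, sketched below in R. This is not something the exercise or the original answer prescribes; it is an illustration that assumes the per-group summaries from the question's table and uses the standard approximation that, under a Poisson model, $(n-1)s^2/\bar{y}$ is roughly chi-squared with $n-1$ degrees of freedom. The variable names (`means`, `variances`, `stat`, `p_two_sided`, `sim_ratio`) and the simulation rate `lambda = 5` are chosen here for illustration only.

```r
# Group summaries copied from the table in the question (n = 20 per group).
n <- 20
means <- c(A = 4.90, B = 9.45, C = 8.65, D = 1.45, E = 18.35, F = 0.80)
variances <- c(A = 9.8842105, B = 6.4710526, C = 7.3973684,
               D = 1.3131579, E = 15.6078947, F = 0.5894737)

# Index-of-dispersion statistic: (n - 1) * s^2 / mean.
# Under a Poisson model this is approximately chi-squared with n - 1 df
# (the approximation is rougher when the mean is small, as in groups D and F).
stat <- (n - 1) * variances / means

# Two-sided p-value: small values flag over- or under-dispersion.
lower <- pchisq(stat, df = n - 1)
p_two_sided <- 2 * pmin(lower, 1 - lower)

print(round(cbind(ratio = variances / means, stat = stat, p = p_two_sided), 3))

# Simulation (assumed rate lambda = 5): how much the variance/mean ratio
# of a genuinely Poisson sample of size 20 fluctuates by chance alone.
set.seed(1)
sim_ratio <- replicate(10000, {
  y <- rpois(n, lambda = 5)
  var(y) / mean(y)
})
print(quantile(sim_ratio, c(0.025, 0.975)))
```

With the tabulated values, group A's variance is about twice its mean, giving a statistic of roughly 38.3 on 19 degrees of freedom, which is above the upper 2.5% point of about 32.9. The other five groups give statistics between about 13 and 17, inside the central 95% range. So, under this approximate check, only group A looks noticeably overdispersed, and the simulation gives a feel for how far the ratio can drift from 1 by chance with 20 observations.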

Attribution
Source : Link , Question Author : Lars Nielsen , Answer Author : Nuclear Hoagie
