I do not understand the role of weights in “weighted Poisson regression”. What exactly is being weighted? Is it the contribution of the observation to the log-likelihood of the model, or something else?
In the following two popular threads,
commentators establish the equivalence between Poisson regression with explicit offset $t_i$ (for exposure time, for example) in equation:
and weighted Poisson regression with weights $t_i$ (at least in R):
By "equivalent", I mean that one of the threads demonstrates with an example that the estimated coefficients are the same.
However, I don't understand what the weighting in the second regression means. What are the objective functions being optimised in the two cases? In the first one, is it the usual Poisson log-likelihood, $-\lambda + k\ln(\lambda) - \ln(k!)$?
This also confused me. I thought, "what is the point of explicitly including an offset instead of just treating the response divided by the offset/exposure as the y value?"
You actually get two different loss functions if you do so.
The correct way (use an exposure/offset $s_i$)
Model $\log\lambda_i = \log s_i + \theta^T x_i$ so that $\lambda_i = s_i e^{\theta^T x_i}$. This makes complete sense: the exposure $s_i$ just multiplies the $\hat\lambda_i = e^{\hat\theta^T x_i}$ of a Poisson regression model without different exposures.
We model the response $y_i$ to $x_i$ as a Poisson random variable with parameter $\lambda_i$.
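Then the likelihood for $N$ data points is:
$$L = \prod_{i=1}^{N} \frac{e^{-s_i e^{\theta^T x_i}} \left(s_i e^{\theta^T x_i}\right)^{y_i}}{y_i!}$$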
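The log-likelihood $\ell$, keeping only the terms that depend on $\theta$ since the others drop out after differentiation, is:
$$\ell = \sum_{i=1}^{N} \left( y_i\,\theta^T x_i - s_i e^{\theta^T x_i} \right)$$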
The incorrect way (using $y_i/s_i$ as the y-values)
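Now we still model $\log\lambda_i = \log s_i + \theta^T x_i$, so the rate is $\lambda_i/s_i = e^{\theta^T x_i}$.
The difference is that now we assume $y_i/s_i$ has a Poisson distribution with that rate as its mean. This is essentially what makes the model incorrect: it violates the assumption that $y_i$ has a Poisson distribution, because you are now modeling the rate as Poisson distributed. So the likelihood is now:
$$\hat{L} = \prod_{i=1}^{N} \frac{e^{-e^{\theta^T x_i}} \left(e^{\theta^T x_i}\right)^{y_i/s_i}}{(y_i/s_i)!}$$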
[It is awkward to have $y_i/s_i$ in the factorial term, but it drops out anyway after differentiating the log-likelihood, so let's carry on.]
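The log-likelihood $\hat\ell$, keeping only the terms that depend on $\theta$ since the others drop out after differentiation, is:
$$\hat\ell = \sum_{i=1}^{N} \left( \frac{y_i}{s_i}\,\theta^T x_i - e^{\theta^T x_i} \right)$$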
$\ell$ and $\hat\ell$ look strikingly similar, and you might think they are the same, but they are not (you can't just divide by $s_i$, because it is different for every term!).
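To see the difference concretely, here is a minimal sketch for the simplest case I can think of: an intercept-only model (no covariates), where both objectives can be maximised by hand. The counts and exposures are made-up numbers, and the names `rate_offset` and `rate_naive` are mine. Setting the derivative of $\sum_i (y_i\theta - s_i e^{\theta})$ to zero gives a ratio of sums, while doing the same for $\sum_i ((y_i/s_i)\theta - e^{\theta})$ gives a mean of ratios.

```python
# Hypothetical counts y_i observed over exposures s_i (made-up numbers).
y = [2, 30, 5]
s = [1.0, 10.0, 4.0]

# Offset model, intercept only: maximising sum(y_i*theta - s_i*exp(theta))
# gives exp(theta) = sum(y) / sum(s), a ratio of sums.
rate_offset = sum(y) / sum(s)

# Unweighted fit to y_i/s_i: maximising sum((y_i/s_i)*theta - exp(theta))
# gives exp(theta) = mean(y_i/s_i), a mean of ratios.
rate_naive = sum(yi / si for yi, si in zip(y, s)) / len(y)

print("offset model rate:   ", rate_offset)
print("unweighted rate fit: ", rate_naive)
```

The two estimates only coincide when all the exposures are equal, which is another way of saying you can't divide $s_i$ out of the sum.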
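However, consider a weighted Poisson regression in which we model $y_i/s_i$ as Poisson distributed and give each data point's term in the log-likelihood a weight of $s_i$. Then:
$$\hat\ell_w = \sum_{i=1}^{N} s_i \left( \frac{y_i}{s_i}\,\theta^T x_i - e^{\theta^T x_i} \right) = \sum_{i=1}^{N} \left( y_i\,\theta^T x_i - s_i e^{\theta^T x_i} \right)$$
which is equivalent to $\ell$! So, to answer the question: the weight multiplies each observation's contribution to the log-likelihood.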
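If you want to check this numerically without R, here is a rough sketch in plain Python. It is not how `glm` is implemented; `fit_poisson` is a hypothetical helper I wrote for illustration that maximises $\sum_i w_i (y_i \eta_i - e^{\eta_i})$, with $\eta_i = \text{offset}_i + a + b x_i$, using Newton's method for one covariate plus an intercept. The data are made up.

```python
import math

def fit_poisson(x, y, w, offset, iters=100):
    """Newton's method for max over (a, b) of sum_i w_i*(y_i*eta_i - exp(eta_i)),
    where eta_i = offset_i + a + b*x_i. Illustrative helper, not a library API."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi, oi in zip(x, y, w, offset):
            mu = math.exp(oi + a + b * xi)
            r = wi * (yi - mu)          # weighted score contribution
            g0 += r
            g1 += r * xi
            h00 += wi * mu              # entries of the negative Hessian
            h01 += wi * mu * xi
            h11 += wi * mu * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical data: covariate x, exposure s, count y.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
s = [1.0, 2.0, 5.0, 3.0, 10.0]
y = [1.0, 3.0, 12.0, 9.0, 45.0]

ones = [1.0] * len(y)
zeros = [0.0] * len(y)
rates = [yi / si for yi, si in zip(y, s)]

# Counts with offset log(s_i), unit weights.
offset_fit = fit_poisson(x, y, ones, [math.log(si) for si in s])
# Rates y_i/s_i with weights s_i, no offset.
weighted_fit = fit_poisson(x, rates, s, zeros)
# Rates y_i/s_i with unit weights, no offset (the incorrect way).
naive_fit = fit_poisson(x, rates, ones, zeros)

print("offset:  ", offset_fit)
print("weighted:", weighted_fit)
print("naive:   ", naive_fit)
```

The offset and weighted fits should agree up to floating-point error, while the unweighted fit to the rates lands somewhere else. In R, the corresponding calls would be along the lines of `glm(y ~ x + offset(log(s)), family = poisson)` and `glm(y/s ~ x, weights = s, family = poisson)`; the second typically warns about non-integer responses.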