# How is a Poisson rate regression equal to a Poisson regression with corresponding offset term?

I do not understand the role of weights in “weighted Poisson regression”. What exactly is being weighted? Is it the contribution of the observation to the log-likelihood of the model, or something else?

In the following two popular threads,

Where does the offset go in Poisson/negative binomial regression?

When to use an offset in a Poisson regression?

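commentators establish the equivalence between Poisson regression with an explicit offset $\log t_i$ (where $t_i$ is the exposure time, for example):

$$\log E[y_i] = \log t_i + \theta^T x_i$$

and weighted Poisson regression on the rate $y_i/t_i$ with weights $t_i$ (at least in R):

$$\log E\left[\frac{y_i}{t_i}\right] = \theta^T x_i$$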

By "equivalent" I mean that the estimated coefficients are the same, as one of the threads demonstrates with an example.

However, I don’t understand what the weighting in the second regression means. What are the objective functions being optimised in both cases? In the first one, is it the usual Poisson log-likelihood, $-\lambda + k \ln(\lambda) - \ln(k!)$?

This also confused me. I thought, “what is the point of explicitly including an offset instead of just treating the response divided by the offset / exposure as the $y$ value?”

You actually get two different loss functions if you do so.

# The correct way (use an exposure/offset $s_i$)

Model $\log \lambda_i = \log s_i + \theta^T x_i$ so that $\lambda_i = s_i e^{\theta^T x_i}$. This makes complete sense: the exposure $s_i$ just multiplies the $\hat{\lambda}_i = e^{\hat{\theta}^T x_i}$ of a Poisson regression model without different exposures.

We model the random variable $Y_i$, a response to $x_i$, with a Poisson distribution with parameter $\lambda_i$.

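Then the likelihood for $N$ data points is:

$$L(\theta) = \prod_{i=1}^N \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!} = \prod_{i=1}^N \frac{e^{-s_i e^{\theta^T x_i}} \left( s_i e^{\theta^T x_i} \right)^{y_i}}{y_i!}$$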

The log likelihood $\ell$, keeping only terms that depend on $\theta$ since others will drop out after differentiation:

$$\ell = \sum_{i=1}^N \left( y_i \theta^T x_i - s_i e^{\theta^T x_i} \right)$$
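A small numerical check of the "only terms that depend on $\theta$" step may help. The sketch below assumes a single scalar coefficient `theta` and made-up data (both are illustrative choices of mine, not from the threads). It compares the full Poisson log likelihood for $\lambda_i = s_i e^{\theta x_i}$ with the reduced sum $\sum_i (y_i \theta x_i - s_i e^{\theta x_i})$ and shows that the gap between them is the same at two different values of `theta`.

```python
import math

def full_loglik(theta, x, y, s):
    """Full Poisson log likelihood with lambda_i = s_i * exp(theta * x_i)."""
    total = 0.0
    for xi, yi, si in zip(x, y, s):
        lam = si * math.exp(theta * xi)
        total += -lam + yi * math.log(lam) - math.lgamma(yi + 1)
    return total

def reduced_loglik(theta, x, y, s):
    """Only the theta-dependent terms: sum(y_i*theta*x_i - s_i*exp(theta*x_i))."""
    return sum(yi * theta * xi - si * math.exp(theta * xi)
               for xi, yi, si in zip(x, y, s))

# Made-up data, purely for illustration.
x = [0.5, 1.0, 1.5, 2.0]
y = [2, 5, 4, 12]
s = [3.0, 4.0, 2.0, 5.0]

# The gap is sum(y_i*log(s_i) - log(y_i!)), which does not involve theta.
gap_a = full_loglik(0.2, x, y, s) - reduced_loglik(0.2, x, y, s)
gap_b = full_loglik(0.7, x, y, s) - reduced_loglik(0.7, x, y, s)
print(gap_a, gap_b)
```

Since the gap is constant in `theta`, maximising the reduced sum gives the same estimate as maximising the full log likelihood.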

# The incorrect way (using $y_i/s_i$ as the y-values)

Now we still model:

$$\log \lambda_i = \theta^T x_i$$

The difference is that now we assume $y_i/s_i$ has a Poisson distribution. This is essentially what makes the model incorrect. It violates the assumption that $y_i$ has a Poisson distribution. Now you are modeling the rate as having a Poisson distribution. So the likelihood is now:

$$\hat{L}(\theta) = \prod_{i=1}^N \frac{e^{-e^{\theta^T x_i}} \left( e^{\theta^T x_i} \right)^{y_i/s_i}}{(y_i/s_i)!}$$

[Awkward to have $y_i/s_i$ in the factorial term but it drops out anyway after differentiation of the log likelihood so let’s carry on.]

The log likelihood $\hat{\ell}$, keeping only terms that depend on $\theta$ since others will drop out after differentiation:

$$\hat{\ell} = \sum_{i=1}^N \left( \frac{y_i}{s_i} \theta^T x_i - e^{\theta^T x_i} \right)$$

# Conclusion

$\ell$ and $\hat{\ell}$ look strikingly similar, and you might think they are the same, but they are not (you can’t just divide by $s_i$ because it is different for every term!)
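One way to see the difference concretely is an intercept-only special case (an illustrative simplification of mine, not part of the derivation above): take $\theta^T x_i = \theta_0$ for every $i$. The offset objective becomes $\sum_i \left( y_i \theta_0 - s_i e^{\theta_0} \right)$ and the rate objective becomes $\sum_i \left( \frac{y_i}{s_i} \theta_0 - e^{\theta_0} \right)$. Setting each derivative with respect to $\theta_0$ to zero gives:

$$\sum_{i=1}^N \left( y_i - s_i e^{\theta_0} \right) = 0 \;\Rightarrow\; e^{\theta_0} = \frac{\sum_{i=1}^N y_i}{\sum_{i=1}^N s_i}$$

$$\sum_{i=1}^N \left( \frac{y_i}{s_i} - e^{\theta_0} \right) = 0 \;\Rightarrow\; e^{\theta_0} = \frac{1}{N} \sum_{i=1}^N \frac{y_i}{s_i}$$

The first is the pooled rate (total count over total exposure); the second is the unweighted mean of the per-observation rates. They agree when all the $s_i$ are equal, but in general they do not.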

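However, consider a weighted Poisson regression in which we model $y_i/s_i$ as distributed Poissonian (is that a word?) and each data point's contribution to the log likelihood gets a weight of $s_i$. Then:

$$\sum_{i=1}^N s_i \left( \frac{y_i}{s_i} \theta^T x_i - e^{\theta^T x_i} \right) = \sum_{i=1}^N \left( y_i \theta^T x_i - s_i e^{\theta^T x_i} \right)$$

which is exactly $\ell$!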
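To check this numerically, here is a minimal sketch in plain Python. The helper `fit` is hypothetical (it is not from R or any library): it maximises $\sum_i w_i (r_i \eta_i - e^{\eta_i})$ with $\eta_i = b_0 + b_1 x_i + o_i$ by Newton-Raphson, so the three objectives discussed here are just different choices of response $r_i$, weight $w_i$ and offset $o_i$. The data are made up, with one binary covariate so the answers can be verified by hand.

```python
import math

def fit(x, resp, weights, offsets, n_iter=100, tol=1e-12):
    """Newton-Raphson maximiser of
        sum_i w_i * (resp_i * eta_i - exp(eta_i)),  eta_i = b0 + b1*x_i + offset_i.
    Returns the fitted (b0, b1)."""
    # Start from the intercept-only solution with zero slope.
    num = sum(w * r for w, r in zip(weights, resp))
    den = sum(w * math.exp(o) for w, o in zip(weights, offsets))
    b0, b1 = math.log(num / den), 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri, wi, oi in zip(x, resp, weights, offsets):
            mu = math.exp(b0 + b1 * xi + oi)
            g0 += wi * (ri - mu)        # gradient w.r.t. b0
            g1 += wi * (ri - mu) * xi   # gradient w.r.t. b1
            h00 += wi * mu              # entries of the negative Hessian
            h01 += wi * mu * xi
            h11 += wi * mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up data: covariate, exposures and counts.
x = [0.0, 0.0, 1.0, 1.0]
s = [1.0, 10.0, 2.0, 8.0]
y = [2.0, 5.0, 6.0, 8.0]
rates = [yi / si for yi, si in zip(y, s)]
ones = [1.0] * len(y)
zeros = [0.0] * len(y)

# Counts as response, log(s_i) as offset, no weights.
offset_fit = fit(x, y, ones, [math.log(si) for si in s])
# Rates as response, weights s_i, no offset.
weighted_rate_fit = fit(x, rates, s, zeros)
# Rates as response, no weights, no offset (the "incorrect way").
unweighted_rate_fit = fit(x, rates, ones, zeros)

print("offset:          ", offset_fit)
print("weighted rate:   ", weighted_rate_fit)
print("unweighted rate: ", unweighted_rate_fit)
```

Because $x$ takes only two values here, the model is saturated and the fits can be checked by hand. The offset fit and the weighted rate fit both reproduce the pooled rate $\sum y_i / \sum s_i$ within each group ($7/11$ and $14/10$), while the unweighted rate fit reproduces the plain mean of $y_i/s_i$ within each group ($1.25$ and $2$).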