I was wondering how you would generate data from a Poisson regression equation in R? I’m kind of confused how to approach the problem.
Suppose we have two predictors $X_1$ and $X_2$, each distributed $N(0,1)$, the intercept is 0, and both coefficients equal 1. Then my model is simply:
$$\log(Y) = 0+ 1\cdot X_1 + 1\cdot X_2$$
But once I have calculated $\log(Y)$, how do I generate Poisson counts based on that? What is the rate parameter for the Poisson distribution?
If anyone could write a brief R script that generates Poisson regression samples that would be awesome!
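The Poisson regression model assumes a Poisson distribution for $Y$ and uses the $\log$ link function. So, for a single explanatory variable $x$, it is assumed that $Y \sim P(\mu)$ (so that $E(Y) = V(Y) = \mu$) and that $\log(\mu) = \beta_0 + \beta_1 x$. Note that the linear predictor models $\log(\mu)$, not $\log(Y)$, so the rate parameter is $\mu = \exp(\beta_0 + \beta_1 x)$. Generating data from the model then follows easily: compute $\mu$ for each observation and draw $Y$ from a Poisson distribution with that rate. Here is an example which you can adapt to your own scenario.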
    > #sample size
    > n <- 10
    > #regression coefficients
    > beta0 <- 1
    > beta1 <- 0.2
    > #generate covariate values
    > x <- runif(n=n, min=0, max=1.5)
    > #compute mu's
    > mu <- exp(beta0 + beta1 * x)
    > #generate Y-values
    > y <- rpois(n=n, lambda=mu)
    > #data set
    > data <- data.frame(y=y, x=x)
    > data
       y         x
    1  4 1.2575652
    2  3 0.9213477
    3  3 0.8093336
    4  4 0.6234518
    5  4 0.8801471
    6  8 1.2961688
    7  2 0.1676094
    8  2 1.1278965
    9  1 1.1642033
    10 4 0.2830910
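For the scenario in the question (two $N(0,1)$ predictors, intercept 0, both coefficients 1), the same recipe might look like the sketch below. The sample size, the seed, and the names `x1`, `x2`, `eta`, `sim`, and `fit` are illustrative assumptions, not part of the original example. The final `glm` fit is only a sanity check that the estimated coefficients land near the values used to simulate.

```r
# Minimal sketch, assuming n = 1000 and an arbitrary seed for reproducibility
set.seed(42)
n <- 1000

# Two independent standard normal predictors
x1 <- rnorm(n)
x2 <- rnorm(n)

# Linear predictor on the log scale: intercept 0, both slopes 1
eta <- 0 + 1 * x1 + 1 * x2

# The Poisson rate is the inverse log link applied to the linear predictor
mu <- exp(eta)

# Draw one Poisson count per observation, each with its own rate
y <- rpois(n = n, lambda = mu)

sim <- data.frame(y = y, x1 = x1, x2 = x2)

# Sanity check: refit the model and compare estimates with (0, 1, 1)
fit <- glm(y ~ x1 + x2, family = poisson(link = "log"), data = sim)
coef(fit)
```

With a sample this large the estimates should sit close to 0, 1, and 1. With a small `n`, as in the ten-row example above, they can be noticeably off.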