# How to analyze longitudinal count data: accounting for temporal autocorrelation in GLMM?

Hello statistical gurus and R programming wizards,

I am interested in modeling animal captures as a function of environmental conditions and day of the year. As part of another study, I have counts of captures on ~160 days over three years. On each of these days I have temperature, rainfall, windspeed, relative humidity, etc. Because the data was collected repeatedly from the same 5 plots, I use plot as a random effect.

My understanding is that nlme can easily account for temporal autocorrelation in the residuals but doesn’t handle non-Gaussian link functions like lme4 (which can’t handle the autocorrelation?). Currently, I think it might work to use the nlme package in R on log(count). So my solution right now would be to run something like:

``````m1 <- lme(lcount ~ AirT + I(AirT^2) + RainAmt24 + I(RainAmt24^2) + RHpct + windspeed +
sin(2*pi/360*DOY) + cos(2*pi/360*DOY), random = ~1|plot, correlation =
corARMA(p = 1, q = 1, form = ~DOY|plot), data = Data)
``````

where DOY = Day of the Year. There may be more interactions in the final model, but this is my general idea. I could also potentially try to model the variance structure further with something like

``````weights = v1Pow
``````

I’m not sure if there is a better way to do with with a Poisson mixed model regression or anything? I just found mathematical discussion in Chapter 4 of “Regression Models for Time Series Analysis” by Kedem and Fokianos. It was a bit beyond me at the moment, especially in application (coding it in R). I also saw a MCMC solution in Zuur et al. Mixed Effects Models book (Chp 23) in the BUGS language (using winBUGS or JAG). Is that my best option? Is there an easy MCMC package in R that would handle this? I’m not really familiar with GAMM or GEE techniques but would be willing to explore these possibilities if people thought they’d provide better insight. My main objective is to create a model to predict animal captures given environmental conditions. Secondarily, I would like to explain what the animals a responding to in terms of their activity.

Any thoughts on the best way to proceed (philosophically), how to code this in R, or in BUGS would be appreciated. I’m fairly new to R and BUGS (winBUGS) but am learning. This is also the first time I’ve ever tried to address temporal autocorrelation.

Thanks,
Dan

Log transforming your response is an option although not ideal. A GLM framework is generally preferred. If you are not familiar with GLMs then start by reviewing them prior to looking at mixed model extensions. For count data Poisson or Negative Binomial distributional assumptions will be likely suitable. Negative Binomial is indicated if the variance is higher than the mean indicating over-dispersion (https://en.wikipedia.org/wiki/Overdispersion). The interpretation of the parameter estimates is equivalent for the two.

Several options exist in R with lme4 being most commonly cited in my experience.

``````#glmer
library(lme4)
glmer(count ~ AirT + I(AirT^2) + RainAmt24 + I(RainAmt24^2) + RHpct + windspeed + sin(2*pi/360*DOY) + cos(2*pi/360*DOY) + (1|plot), family=poisson, data = Data)
# use glmer.nb with identical syntax but no family for negative binomial.

glmmadmb(count ~ AirT + I(AirT^2) + RainAmt24 + I(RainAmt24^2) + RHpct + windspeed + sin(2*pi/360*DOY) + cos(2*pi/360*DOY) + (1|plot),
family="nbinom", zeroInflation=FALSE, data=Data)

# glmmPQL, requires an estimate for theta which can be obtained from a
# glm model in which the correlation structure is ignored.
library(MASS)
glmmPQL(count ~ AirT + I(AirT^2) + RainAmt24 + I(RainAmt24^2) + RHpct + windspeed + sin(2*pi/360*DOY) + cos(2*pi/360*DOY) , random = list(~1 | plot), data = Data,family = negative.binomial(theta = 4.22, link = log))
``````

These links may also be of assistance:

Both are by Ben Bolker, author of lme4.

I haven’t tested the examples but they should give you an idea of where to start. Please supply data if you wish to verify their implementation.