Interpreting three forms of a “mixed model”

There’s a distinction that’s tripping me up with mixed models, and I’m wondering if I could get some clarity on it. Let’s assume you’ve got a mixed model of count data. There’s a variable you know you want as a fixed effect (A) and another variable for time (T), grouped by say a “Site” variable.

As I understand it:

glmer(counts ~ A + T, data=data, family=poisson) is a fixed effects model.

glmer(counts ~ (A + T | Site), data=data, family=poisson) is a random effect model.

My question is when you have something like:

glmer(counts ~ A + T + (T | Site), data=data, family=poisson) what is T? Is it a random effect? A fixed effect? What’s actually being accomplished by putting T in both places?

When should something only appear in the random effects section of the model formula?

Answer

This may become clearer by writing out the model formula for each of these three models. Let $Y_{ij}$ be the observation for person $i$ in site $j$ in each model, and define $A_{ij}$ and $T_{ij}$ analogously to refer to the variables in your model.

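glmer(counts ~ A + T, data=data, family=poisson) is the model

$\log(E(Y_{ij})) = \beta_0 + \beta_1 A_{ij} + \beta_2 T_{ij}$

which is just an ordinary Poisson regression model. Note that glmer() refuses to fit a formula with no random-effects term, so in practice you would fit this one with glm(). Also, the family has to be given as poisson (lower case); "Poisson" is not a family R recognises.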
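To make this concrete, here is a minimal sketch of the fixed-effects-only fit. The data are simulated, and the data frame name d, the sample sizes and the "true" coefficients are all assumptions made up for illustration, not anything from your data.

```r
# Simulated data: every name and parameter value here is an illustrative assumption.
set.seed(1)
n_sites <- 20
n_per   <- 30
d <- data.frame(
  Site = factor(rep(seq_len(n_sites), each = n_per)),
  A    = rnorm(n_sites * n_per),
  T    = rep(seq(0, 1, length.out = n_per), times = n_sites)
)

# Generating model: log(E(Y)) = 0.5 + 0.3 * A + 0.8 * T, with no site effects
d$counts <- rpois(nrow(d), lambda = exp(0.5 + 0.3 * d$A + 0.8 * d$T))

# glm() rather than glmer(), because the formula has no random-effects term
m_fixed <- glm(counts ~ A + T, data = d, family = poisson)
coef(m_fixed)  # estimates of beta_0, beta_1, beta_2
```

Site plays no role in this fit at all, which is the sense in which it is a "fixed effects model".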

glmer(counts ~ (A + T|Site), data=data, family=poisson) is the model

$\log(E(Y_{ij})) = \alpha_0 + \eta_{j0} + \eta_{j1} A_{ij} + \eta_{j2} T_{ij}$

where $\eta_j = (\eta_{j0}, \eta_{j1}, \eta_{j2}) \sim N(0, \Sigma)$ are random effects that are shared by each observation made by individuals from site $j$ (so the expectation here is conditional on $\eta_j$). These random effects are allowed to be freely correlated (i.e. no restrictions are made on $\Sigma$) in the model you specified. To impose independence, you have to place them inside different brackets, e.g. (A-1|Site) + (T-1|Site) + (1|Site) would do it for numeric A and T. This model assumes that the baseline is $\alpha_0$ and that neither A nor T has any effect on average across sites, but each site has a random offset ($\eta_{j0}$) and its own random linear relationship with both $A_{ij}$ and $T_{ij}$.
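The difference between the correlated and the independent specification can be seen by fitting both. This is only a sketch on simulated data: the data frame d, the site-level standard deviations and the object names are assumptions for illustration, and on data this small lme4 may print singular-fit or convergence messages.

```r
library(lme4)

# Simulated data: every name and parameter value here is an illustrative assumption.
set.seed(2)
n_sites <- 30
n_per   <- 20
d <- data.frame(
  Site = factor(rep(seq_len(n_sites), each = n_per)),
  A    = rnorm(n_sites * n_per),
  T    = rep(seq(0, 1, length.out = n_per), times = n_sites)
)
j <- as.integer(d$Site)            # site index of each row
eta0 <- rnorm(n_sites, 0, 0.4)     # random offsets
eta1 <- rnorm(n_sites, 0, 0.3)     # random slopes for A
eta2 <- rnorm(n_sites, 0, 0.5)     # random slopes for T
d$counts <- rpois(nrow(d), exp(0.5 + eta0[j] + eta1[j] * d$A + eta2[j] * d$T))

# One bracket: a full 3 x 3 covariance matrix Sigma is estimated
m_corr <- glmer(counts ~ (A + T | Site), data = d, family = poisson)

# Separate brackets: Sigma is constrained to be diagonal
m_indep <- glmer(counts ~ (1 | Site) + (A - 1 | Site) + (T - 1 | Site),
                 data = d, family = poisson)

VarCorr(m_corr)   # variances plus correlations
VarCorr(m_indep)  # variances only
```

The first fit estimates six covariance parameters (three variances and three correlations), the second only three.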

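glmer(counts ~ A + T + (T|Site), data=data, family=poisson) is the model

$\log(E(Y_{ij})) = (\theta_0 + \gamma_{j0}) + \theta_1 A_{ij} + (\theta_2 + \gamma_{j1}) T_{ij}$

So now $\log(E(Y_{ij}))$ has some “average” relationship with $A_{ij}$ and $T_{ij}$, given by the fixed effects $\theta_0, \theta_1, \theta_2$, but the intercept and the slope of T are different for each site, and those differences are captured by the random effects $\gamma_{j0}, \gamma_{j1}$ (again jointly normal with mean zero). That is, the baseline is randomly shifted and the slope of T is randomly shifted, and everyone from the same site shares the same random shifts. The slope of A, $\theta_1$, is the same at every site because A does not appear in the random-effects term.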
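Here is a rough sketch of how the pieces of this third model show up in lme4 output. As before, the data are simulated and the names d, g0, g1 and all numeric values are assumptions chosen for illustration.

```r
library(lme4)

# Simulated data: every name and parameter value here is an illustrative assumption.
set.seed(3)
n_sites <- 30
n_per   <- 20
d <- data.frame(
  Site = factor(rep(seq_len(n_sites), each = n_per)),
  A    = rnorm(n_sites * n_per),
  T    = rep(seq(0, 1, length.out = n_per), times = n_sites)
)
j  <- as.integer(d$Site)         # site index of each row
g0 <- rnorm(n_sites, 0, 0.4)     # gamma_j0: site shifts in the intercept
g1 <- rnorm(n_sites, 0, 0.5)     # gamma_j1: site shifts in the slope of T
d$counts <- rpois(nrow(d),
                  exp((0.5 + g0[j]) + 0.3 * d$A + (0.8 + g1[j]) * d$T))

m_mixed <- glmer(counts ~ A + T + (T | Site), data = d, family = poisson)

fixef(m_mixed)             # theta_0, theta_1, theta_2: the "average" relationship
head(ranef(m_mixed)$Site)  # predicted gamma_j0 and gamma_j1 for each site
head(coef(m_mixed)$Site)   # per-site coefficients: fixed effect plus site shift
```

In the coef() table the A column is constant across sites, while the intercept and T columns vary, which is exactly what putting T (but not A) in both places buys you.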

what is T? Is it a random effect? A fixed effect? What’s actually being accomplished by putting T in both places?

T is one of your covariates. It is not a random effect; Site is the random effect. There is a fixed effect of T, $\theta_2$, and the slope at any particular site differs from it by the random effect conferred by Site, $\gamma_{j1}$ in the model above. Including this random effect allows for heterogeneity between sites in the relationship between T and $\log(E(Y_{ij}))$.

When should something only appear in the random effects section of the model formula?

This is a matter of what makes sense in the context of the application.

Regarding the intercept: you should keep the fixed intercept in there for a lot of reasons (see, e.g., here). As for the random intercept, $\gamma_{j0}$, this primarily acts to induce correlation between observations made at the same site. If it doesn’t make sense for such correlation to exist, then the random effect should be excluded.

Regarding the random slopes, a model with only random slopes and no fixed slopes reflects a belief that, within each site, there is some relationship between $\log(E(Y_{ij}))$ and your covariates, but if you average those effects over all sites, then there is no relationship. For example, if you had a random slope in T but no fixed slope, this would be like saying that time, on average, has no effect (e.g. no secular trends in the data) but each Site is heading in a random direction over time, which could make sense. Again, it depends on the application.
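Written out in the notation of the third model above, the "random slope in T but no fixed slope" case would be something like the following. I am assuming the formula counts ~ A + (T | Site) here, which is my choice of example rather than one of the models you asked about.

```latex
\log(E(Y_{ij})) = (\theta_0 + \gamma_{j0}) + \theta_1 A_{ij} + \gamma_{j1} T_{ij},
\qquad E(\gamma_{j1}) = 0
```

The slope of T at site $j$ is just $\gamma_{j1}$, which has mean zero, so the average slope over sites is forced to be zero even though individual sites can trend up or down.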

Note that you can fit the model with and without random effects to see if this is happening: you should see no effect of T in the fixed-effects-only model but substantial between-site variation in the slope of T in the model with random effects. I must caution you that decisions like this are often better made based on an understanding of the application rather than through model selection.
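A sketch of that comparison, on simulated data in which the average slope of T is zero but sites drift in different directions. The names and numbers are assumptions for illustration, and I am deliberately not quoting any output, since what you see will depend on the data.

```r
library(lme4)

# Simulated data: every name and parameter value here is an illustrative assumption.
set.seed(4)
n_sites <- 30
n_per   <- 20
d <- data.frame(
  Site = factor(rep(seq_len(n_sites), each = n_per)),
  A    = rnorm(n_sites * n_per),
  T    = rep(seq(0, 1, length.out = n_per), times = n_sites)
)
j  <- as.integer(d$Site)         # site index of each row
g0 <- rnorm(n_sites, 0, 0.3)     # site shifts in the intercept
g1 <- rnorm(n_sites, 0, 0.6)     # site-specific slopes of T, mean zero
d$counts <- rpois(nrow(d), exp(0.5 + g0[j] + 0.3 * d$A + g1[j] * d$T))

# Without random effects
m_fixed <- glm(counts ~ A + T, data = d, family = poisson)

# With a random intercept and a random slope of T by site
m_mixed <- glmer(counts ~ A + T + (T | Site), data = d, family = poisson)

coef(summary(m_fixed))["T", ]  # fixed slope of T, ignoring sites
coef(summary(m_mixed))["T", ]  # fixed slope of T, allowing site variation
VarCorr(m_mixed)               # between-site SDs of the intercept and the T slope
```

The thing to look at is whether the fixed slope of T sits near zero while the standard deviation of the T slope across sites in VarCorr() is clearly away from zero.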

Attribution
Source: Link, Question Author: Fomite, Answer Author: Community
