Comparing mixed effect models with the same number of degrees of freedom

I have an experiment that I’ll try to abstract here. Imagine I toss three white stones in front of you and ask you to make a judgment about their position. I record a variety of properties of the stones and your response, and I do this over a number of subjects. I generate two models: one in which the stone nearest to you predicts your response, and one in which the geometric center of the stones predicts your response. Using lmer in R, I could write:

mNear   <- lmer(resp ~ nearest + (1|subject), REML = FALSE)
mCenter <- lmer(resp ~ center  + (1|subject), REML = FALSE)

UPDATE AND CHANGE – more direct version that incorporates several helpful comments

I could try

anova(mNear, mCenter)

This is incorrect, of course, because the models are not nested, so I can’t really compare them that way. I was expecting anova.mer to throw an error, but it didn’t. The nesting that I could construct here isn’t natural, and it still leaves me with somewhat less analytical statements. When models are nested naturally (e.g. quadratic on linear), the test only goes one way. But in this case, what would it mean to have asymmetric findings?

For example, I could make a model three:

mBoth <- lmer(resp ~ center + nearest + (1|subject), REML = FALSE)

Then I can run anova on each nested pair:

anova(mCenter, mBoth)
anova(mNear, mBoth)

This is fair to do, and now I find that center adds to the nearest effect (the second command), but BIC actually goes up when nearest is added to center (the penalty for the lower parsimony). This confirms what was suspected.

But is finding this sufficient? And is this fair when center and nearest are so highly correlated?
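
One way to put a number on that collinearity concern is to look at the raw correlation between the two predictors and at the correlation between their estimated fixed effects in the combined model. The sketch below runs on simulated data, since the real data are not shown here; the data frame `d`, the sample sizes, and the effect sizes are all made up for illustration.

```r
library(lme4)

set.seed(1)
n_subj  <- 20   # hypothetical number of subjects
n_trial <- 30   # hypothetical number of trials per subject

# Simulated stand-in for the real data: center drives resp,
# and nearest is correlated with center (about 0.8 by construction).
d <- data.frame(subject = factor(rep(seq_len(n_subj), each = n_trial)))
d$center  <- rnorm(nrow(d))
d$nearest <- 0.8 * d$center + 0.6 * rnorm(nrow(d))
subj_int  <- rnorm(n_subj, sd = 0.5)
d$resp    <- 1 + 0.7 * d$center + subj_int[as.integer(d$subject)] + rnorm(nrow(d))

mBoth <- lmer(resp ~ center + nearest + (1 | subject), data = d, REML = FALSE)

# Correlation between the two predictors
pred_cor <- cor(d$center, d$nearest)

# Correlation between the two fixed-effect estimates in the combined model
fe_cor <- cov2cor(as.matrix(vcov(mBoth)))["center", "nearest"]

print(c(predictors = pred_cor, estimates = fe_cor))
```

A strongly negative correlation between the two estimates would suggest that the model has trouble separating the contributions of center and nearest, which is worth keeping in mind when reading the two anova results.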

Is there a better way to analytically compare the models when it’s not about adding and subtracting explanatory variables (degrees of freedom)?


Still, you can compute confidence intervals for your fixed effects, and report AIC or BIC (see e.g. Cnaan et al., Stat Med 1997 16: 2349).
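
As a rough sketch of what that could look like with lme4, here is an example on simulated data. The data frame `d` and every number in it are assumptions for illustration only, and Wald intervals are used simply because they are quick; profile or bootstrap intervals are available through the same `confint` call.

```r
library(lme4)

set.seed(1)
n_subj  <- 20   # hypothetical number of subjects
n_trial <- 30   # hypothetical number of trials per subject

# Simulated stand-in for the real data (center is the true predictor here)
d <- data.frame(subject = factor(rep(seq_len(n_subj), each = n_trial)))
d$center  <- rnorm(nrow(d))
d$nearest <- 0.8 * d$center + 0.6 * rnorm(nrow(d))
subj_int  <- rnorm(n_subj, sd = 0.5)
d$resp    <- 1 + 0.7 * d$center + subj_int[as.integer(d$subject)] + rnorm(nrow(d))

mNear   <- lmer(resp ~ nearest + (1 | subject), data = d, REML = FALSE)
mCenter <- lmer(resp ~ center  + (1 | subject), data = d, REML = FALSE)

# Wald 95% confidence intervals, restricted to the fixed effects
ci_near   <- confint(mNear,   parm = "beta_", method = "Wald")
ci_center <- confint(mCenter, parm = "beta_", method = "Wald")
print(ci_near)
print(ci_center)

# Information criteria side by side (lower is better)
ic <- data.frame(
  model = c("mNear", "mCenter"),
  df    = c(attr(logLik(mNear), "df"), attr(logLik(mCenter), "df")),
  AIC   = c(AIC(mNear), AIC(mCenter)),
  BIC   = c(BIC(mNear), BIC(mCenter))
)
print(ic)
```

Because both models have the same number of parameters and are fitted to the same data, the AIC difference and the BIC difference are identical: both reduce to the difference in deviance, so the criteria can rank the two models but the complexity penalty plays no role.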

Now, you may be interested in taking a look at Assessing model mimicry using the parametric bootstrap, by Wagenmakers et al., which seems to more closely resemble your initial question about assessing the quality of two competing models.
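
A simplified sketch in the spirit of that approach (not the exact procedure from the paper) is shown below: simulate responses from each fitted model in turn, refit both models to every simulated data set, and see where the observed difference in log-likelihood falls relative to the two simulated distributions. The data are simulated, and the data frame `d`, the helper `sim_diff`, and the number of replicates are assumptions made for illustration.

```r
library(lme4)

set.seed(1)
n_subj  <- 20   # hypothetical number of subjects
n_trial <- 30   # hypothetical number of trials per subject

# Simulated stand-in for the real data (center is the true predictor here)
d <- data.frame(subject = factor(rep(seq_len(n_subj), each = n_trial)))
d$center  <- rnorm(nrow(d))
d$nearest <- 0.8 * d$center + 0.6 * rnorm(nrow(d))
subj_int  <- rnorm(n_subj, sd = 0.5)
d$resp    <- 1 + 0.7 * d$center + subj_int[as.integer(d$subject)] + rnorm(nrow(d))

mNear   <- lmer(resp ~ nearest + (1 | subject), data = d, REML = FALSE)
mCenter <- lmer(resp ~ center  + (1 | subject), data = d, REML = FALSE)

# Observed difference in log-likelihood (positive values favor mCenter)
obs_diff <- as.numeric(logLik(mCenter) - logLik(mNear))

# Hypothetical helper: simulate n_sim response vectors from the fitted
# model `gen`, refit both models to each, and return the log-likelihood
# differences (mCenter minus mNear).
sim_diff <- function(gen, n_sim = 100) {
  ysim <- simulate(gen, nsim = n_sim)
  vapply(ysim, function(y) {
    as.numeric(logLik(refit(mCenter, y)) - logLik(refit(mNear, y)))
  }, numeric(1))
}

diff_if_near   <- sim_diff(mNear)    # distribution if mNear generated the data
diff_if_center <- sim_diff(mCenter)  # distribution if mCenter generated the data

# Share of simulated differences at least as favorable to mCenter as observed
print(c(observed      = obs_diff,
        under_mNear   = mean(diff_if_near   >= obs_diff),
        under_mCenter = mean(diff_if_center >= obs_diff)))
```

If the observed difference sits comfortably inside one simulated distribution and far in the tail of the other, that supports the corresponding model; if the two distributions overlap heavily, the models mimic each other too well for these data to tell them apart.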

Otherwise, the two papers about measures of explained variance in LMM that come to my mind are:

  • Lloyd J. Edwards, Keith E. Muller, Russell D. Wolfinger, Bahjat F. Qaqish and Oliver Schabenberger (2008). An R2 statistic for fixed effects in the linear mixed model, Statistics in Medicine, 27(29), 6137–6157.
  • Ronghui Xu (2003). Measuring explained variation in linear mixed effects models, Statistics in Medicine, 22(22), 3527–3541.

But maybe there are better options.

Source : Link , Question Author : John , Answer Author : chl
