How do I decide whether to set REML to TRUE or FALSE?

I found a web page that gives the following advice for lmer:

If your random effects are nested, or you have only one random effect, and if your data are balanced (i.e., similar sample sizes in each factor group) set REML to FALSE, because you can use maximum likelihood. If your random effects are crossed, don’t set the REML argument because it defaults to TRUE anyway.

I have two random effects in my lmer model, one of which is nested:

(1|Random1A/Random1B) + (1|Random2)

Should I leave REML at TRUE (the default) or set it to FALSE?

Answer

In my (not entirely uninformed) opinion you’re getting some questionable advice, from the web page and from the comments you received.

  • you can use REML (or ML) whenever you want (regardless of the random effects structure – single vs. multiple, balanced vs. unbalanced, crossed vs. nested)
  • in simple cases (balanced/nested/etc.) REML can be proven to provide unbiased estimates of variance components (but not unbiased estimates of e.g. standard deviation or log standard deviation)
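  • you cannot compare models that differ in fixed effects if they are fitted by REML rather than ML, because the REML criterion depends on the fixed-effects design and so is not comparable across such models; this is why the commenter recommends that you use REML=FALSE if you’re trying to do model selection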
  • however, I wouldn’t recommend you do model selection in the first place, certainly not if you’re going to rely on the conditional confidence intervals and p-values (i.e., analyzing the refitted ‘minimal adequate’ model without accounting for the effects of model selection)
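
To make the second point concrete, here is a worked case under assumptions that are not in the original answer: an ordinary linear model with no random effects at all, which is the simplest setting in which ML and REML differ. The symbols n, p and RSS are introduced only for this illustration.

```latex
% Illustrative setup (an assumption, not from the original answer):
% y = X\beta + \varepsilon, \varepsilon \sim N(0, \sigma^2 I),
% n observations, p \ge 1 fixed-effect coefficients, X of full column rank,
% RSS = residual sum of squares at the least-squares fit.
\begin{align*}
\hat\sigma^2_{\mathrm{ML}} &= \frac{\mathrm{RSS}}{n}, &
E\left[\hat\sigma^2_{\mathrm{ML}}\right] &= \frac{n-p}{n}\,\sigma^2 < \sigma^2, \\
\hat\sigma^2_{\mathrm{REML}} &= \frac{\mathrm{RSS}}{n-p}, &
E\left[\hat\sigma^2_{\mathrm{REML}}\right] &= \sigma^2, \\
& &
E\left[\hat\sigma_{\mathrm{REML}}\right]
  &= E\left[\sqrt{\hat\sigma^2_{\mathrm{REML}}}\right] < \sigma
  \quad \text{(Jensen's inequality)}.
\end{align*}
```

The REML variance estimate is unbiased because it divides by the residual degrees of freedom. Its square root is still biased downward, since the square root is a strictly concave function.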

From my chapter in Fox et al. 2015:

It’s generally good to use REML, if it is available, when you are interested in the magnitude of the random effects variances, but never when you are comparing models with different fixed effects via hypothesis tests or information-theoretic criteria such as AIC.
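
A minimal sketch of that split in lme4 follows. It uses the sleepstudy data that ships with the package as a stand-in, because the asker's data and variable names are not shown; the formulas below are therefore assumptions for illustration. The same REML argument applies unchanged to a model with terms such as (1|Random1A/Random1B) + (1|Random2).

```r
library(lme4)

# sleepstudy ships with lme4; it stands in for the asker's data here.

# REML fit (the lmer default): use this when the size of the
# random-effects variances is what you want to report.
m_reml <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = TRUE)
print(VarCorr(m_reml))

# ML fits: use these when comparing models that differ in fixed effects.
m_full <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)
m_null <- lmer(Reaction ~ 1 + (1 | Subject), data = sleepstudy, REML = FALSE)

# Likelihood ratio test and AIC/BIC, both computed from the ML fits.
print(anova(m_null, m_full))
```

By default, anova() on lmer fits refits REML models with ML before comparing them (its refit argument defaults to TRUE), so setting REML = FALSE explicitly here mainly makes the intent visible.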

Attribution
Source: Link, Question Author: borgs, Answer Author: Ben Bolker
