Standard error of random effects in R (lme4) vs Stata (xtmixed)

Please consider this data:

dt.m <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), occasion = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("g1", "g2"), class = "factor"),     g = c(12, 8, 22, 10, 10, 6, 8, 4, 14, 6, 2, 22, 12, 7, 24, 14, 8, 4, 5, 6, 14, 5, 5, 16)), .Names = c("id", "occasion", "g"), row.names = c(NA, -24L), class = "data.frame")

We fit a simple variance components model. In R we have:

require(lme4) <- lmer( g ~ (1|id), data=dt.m )

Then we produce a caterpillar plot:

rr1 <- ranef(, postVar = TRUE)
dotplot(rr1, scales = list(x = list(relation = 'free')))[["id"]]

Caterpillar plot from R

Now we fit the same model in Stata. First write to Stata format from R:

write.dta(dt.m, "dt.m.dta")

In Stata

use "dt.m.dta"
xtmixed g || id:, reml variance

The output agrees with the R output (neither shown), and we attempt to produce the same caterpillar plot:

predict u_plus_e, residuals
predict u, reffects
gen e = u_plus_e – u
predict u_se, reses

egen tag = tag(id)
sort u
gen u_rank = sum(tag)

serrbar u u_se u_rank if tag==1, scale(1.96) yline(0)

enter image description here

Clearty Stata is using a different standard error to R. In fact Stata is using 2.13 whereas R is using 1.32.

From what I can tell, the 1.32 in R is coming from

> sqrt(attr(ranef(, postVar = TRUE)[[1]], "postVar")[1, , ])
 [1] 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977

though I can’t say I really understand what this is doing. Can someone explain ?

And I have no idea where the 2.13 from Stata is coming from, except that, if I change the estimation method to maximum likelihood:

xtmixed g || id:, ml variance

….then it seems to use 1.32 as the standard error and produce the same results as R….

enter image description here

…. but then the estimate for the random effect variance no longer agrees with R (35.04 vs 31.97).

So it seems to have something to do with ML vs REML: If I run REML in both systems, the model output agrees but the standard errors used in the caterpillar plots don’t agree, whereas if I run REML in R and ML in Stata, the caterpillar plots agree, but the model estimates do not.

Can anyone explain what is going on ?


According to the [XT] manual for Stata 11:

Standard errors for BLUPs are calculated based on the iterative
technique of Bates and Pinheiro (1998, sec. 3.3) for estimating the
BLUPs themselves. If estimation is done by REML, these standard errors
account for uncertainty in the estimate of β, while for ML the
standard errors treat β as known. As such, standard errors of
REML-based BLUPs will usually be larger.

As the Stata ML standard errors match the standard errors from R in your example, it seems R is not accounting for uncertainty in estimating β. Whether it should, I don’t know.

From your question, you have tried REML in both Stata and R, and ML in Stata with REML in R. If you try ML in both, you should get the same results in both.

Source : Link , Question Author : Robert Long , Answer Author : timbp

Leave a Comment