# Standard error of random effects in R (lme4) vs Stata (xtmixed)

dt.m <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), occasion = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("g1", "g2"), class = "factor"),     g = c(12, 8, 22, 10, 10, 6, 8, 4, 14, 6, 2, 22, 12, 7, 24, 14, 8, 4, 5, 6, 14, 5, 5, 16)), .Names = c("id", "occasion", "g"), row.names = c(NA, -24L), class = "data.frame")


We fit a simple variance components model. In R we have:

require(lme4)
fit.vc <- lmer( g ~ (1|id), data=dt.m )


Then we produce a caterpillar plot:

rr1 <- ranef(fit.vc, postVar = TRUE)
dotplot(rr1, scales = list(x = list(relation = 'free')))[["id"]]


Now we fit the same model in Stata. First write to Stata format from R:

require(foreign)
write.dta(dt.m, "dt.m.dta")


In Stata

use "dt.m.dta"
xtmixed g || id:, reml variance


The output agrees with the R output (neither shown), and we attempt to produce the same caterpillar plot:

predict u_plus_e, residuals
predict u, reffects
gen e = u_plus_e – u
predict u_se, reses

egen tag = tag(id)
sort u
gen u_rank = sum(tag)

serrbar u u_se u_rank if tag==1, scale(1.96) yline(0)


Clearty Stata is using a different standard error to R. In fact Stata is using 2.13 whereas R is using 1.32.

From what I can tell, the 1.32 in R is coming from

> sqrt(attr(ranef(fit.vc, postVar = TRUE)[[1]], "postVar")[1, , ])
[1] 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977 1.319977


though I can’t say I really understand what this is doing. Can someone explain ?

And I have no idea where the 2.13 from Stata is coming from, except that, if I change the estimation method to maximum likelihood:

xtmixed g || id:, ml variance


….then it seems to use 1.32 as the standard error and produce the same results as R….

…. but then the estimate for the random effect variance no longer agrees with R (35.04 vs 31.97).

So it seems to have something to do with ML vs REML: If I run REML in both systems, the model output agrees but the standard errors used in the caterpillar plots don’t agree, whereas if I run REML in R and ML in Stata, the caterpillar plots agree, but the model estimates do not.

Can anyone explain what is going on ?

According to the [XT] manual for Stata 11:

Standard errors for BLUPs are calculated based on the iterative
technique of Bates and Pinheiro (1998, sec. 3.3) for estimating the
BLUPs themselves. If estimation is done by REML, these standard errors
account for uncertainty in the estimate of $\beta$, while for ML the
standard errors treat $\beta$ as known. As such, standard errors of
REML-based BLUPs will usually be larger.

As the Stata ML standard errors match the standard errors from R in your example, it seems R is not accounting for uncertainty in estimating $\beta$. Whether it should, I don’t know.

From your question, you have tried REML in both Stata and R, and ML in Stata with REML in R. If you try ML in both, you should get the same results in both.