# Which multiple comparison method to use for an lmer model: lsmeans or glht?

I’m analyzing a data set using a mixed-effects model with one fixed effect (condition) and two random effects (participant, due to the within-subject design, and pair). The model was fitted with the `lme4` package: `exp.model <- lmer(outcome ~ condition + (1|participant) + (1|pair), data = exp)`.

Next, I performed a likelihood ratio test of this model against the model without the fixed effect (condition) and found a significant difference. There are 3 conditions in my data set, so I want to do a multiple comparison, but I am not sure which method to use. I found a number of similar questions on CrossValidated and other forums, but I am still quite confused.

From what I’ve seen, people have suggested using

1. The `lsmeans` package – `lsmeans(exp.model, pairwise ~ condition)`, which gives me the following output:

``````
condition     lsmean         SE    df  lower.CL  upper.CL
Condition1 0.6538060 0.03272705 47.98 0.5880030 0.7196089
Condition2 0.7027413 0.03272705 47.98 0.6369384 0.7685443
Condition3 0.7580522 0.03272705 47.98 0.6922493 0.8238552

Confidence level used: 0.95

$contrasts
contrast                   estimate         SE    df t.ratio p.value
Condition1 - Condition2 -0.04893538 0.03813262 62.07  -1.283  0.4099
Condition1 - Condition3 -0.10424628 0.03813262 62.07  -2.734  0.0219
Condition2 - Condition3 -0.05531090 0.03813262 62.07  -1.450  0.3217

P value adjustment: tukey method for comparing a family of 3 estimates
``````

2. The `multcomp` package in two different ways – using `mcp`, `glht(exp.model, mcp(condition = "Tukey"))`, resulting in

``````
     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: lmer(formula = outcome ~ condition + (1 | participant) + (1 | pair),
data = exp, REML = FALSE)

Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
Condition2 - Condition1 == 0  0.04894    0.03749   1.305    0.392
Condition3 - Condition1 == 0  0.10425    0.03749   2.781    0.015 *
Condition3 - Condition2 == 0  0.05531    0.03749   1.475    0.303
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
``````

and using `lsm`, `glht(exp.model, lsm(pairwise ~ condition))`, resulting in

``````
Note: df set to 62

Simultaneous Tests for General Linear Hypotheses

Fit: lmer(formula = outcome ~ condition + (1 | participant) + (1 | pair),
data = exp, REML = FALSE)

Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Condition1 - Condition2 == 0 -0.04894    0.03749  -1.305   0.3977
Condition1 - Condition3 == 0 -0.10425    0.03749  -2.781   0.0195 *
Condition2 - Condition3 == 0 -0.05531    0.03749  -1.475   0.3098
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
``````

As you can see, the methods give different results. This is my first time working with R and stats, so something might be going wrong, but I wouldn’t know where. My questions are:

What are the differences between the presented methods? I read in an answer to a related question that it’s about the degrees of freedom (`lsmeans` vs. `glht`).
Are there rules or recommendations for when to use which one, e.g., method 1 is good for this type of data set/model? Which result should I report? Without knowing better, I’d probably just report the highest p-value I got to play it safe, but it would be nice to have a better reason. Thanks!

## Answer

Not a complete answer…

The difference between `glht(myfit, mcp(myfactor="Tukey"))` and the two other methods is that this way uses a “z” statistic (normal distribution), whereas the other ones use a “t” statistic (Student distribution). The “z” statistic is the same as a “t” statistic with infinite degrees of freedom. This method is asymptotic, and it provides smaller p-values and shorter confidence intervals than the other ones. The p-values can be too small and the confidence intervals can be too short if the dataset is small.
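To see the size of this gap, compare the two-sided tail probabilities for the same statistic value, taking the 2.781 and the 62 degrees of freedom from the outputs in the question. This is only an illustration of the z-versus-t difference, not a reproduction of the reported p-values, which also involve the single-step (Tukey) adjustment:

```r
stat <- 2.781  # t/z value for the Condition3 - Condition1 contrast above
df   <- 62     # degrees of freedom reported by the lsm method

2 * pnorm(-stat)      # asymptotic "z" p-value: the smaller of the two
2 * pt(-stat, df)     # "t" p-value with 62 df: slightly larger
```

With larger degrees of freedom the two values converge, which is why the choice matters mainly for small datasets.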

When I run `lsmeans(myfit, pairwise ~ myfactor)`, the following message appears:

``````Loading required namespace: pbkrtest
``````

That means that `lsmeans` (for a `lmer` model) uses the `pbkrtest` package, which implements the Kenward-Roger method for the degrees of freedom of the “t” statistic. This method aims to provide better p-values and confidence intervals than the asymptotic one (there’s no difference when the degrees of freedom are large).
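If you want to see the Kenward-Roger machinery directly, `pbkrtest` also exposes it through `KRmodcomp`. A minimal sketch, assuming the model and variable names from the question (`exp.model`, `outcome`, `condition`, `exp`):

```r
library(lme4)
library(pbkrtest)

# Null model without the fixed effect, as in the likelihood ratio test
# described in the question
null.model <- lmer(outcome ~ 1 + (1 | participant) + (1 | pair), data = exp)

# Kenward-Roger approximate F test of 'condition', using adjusted
# denominator degrees of freedom instead of an asymptotic chi-square
KRmodcomp(exp.model, null.model)
```

The denominator degrees of freedom it reports are the same kind of adjustment that `lsmeans` applies to its “t” tests.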

Now, about the difference between `lsmeans(myfit, pairwise ~ myfactor)$contrasts` and `glht(myfit, lsm(pairwise ~ myfactor))`, I have just done some tests and my observations are the following:

• `lsm` is an interface between the `lsmeans` package and the `multcomp` package (see `?lsm`)

• for a balanced design, there’s no difference between the results

• for an unbalanced design, I observed small differences between the results (in the standard errors and the t ratios)

Unfortunately, I do not know the cause of these differences. It looks like `lsm` calls `lsmeans` only to get the linear hypotheses matrix and the degrees of freedom, but `lsmeans` uses a different way to calculate the standard errors.

Attribution
Source : Link , Question Author : schvaba986 , Answer Author : Stéphane Laurent