I have a data set where I measured the number of molecules (M) present in cells as a function of drug (with or without) and days of treatment (5 timepoints). I repeated the experiment 3 times, with cells from a separate donor each time. I am currently trying to compare the means between groups. However, the data are not normal and heteroskedastic and I’m in the process of figuring out how to best deal with this.

Transforming the data makes the dataset normal, but the heteroskedacity remains. I’m a stats novice, but my reading over the past several days suggests that a linear mixed model should be able to deal with this. Based primarily on the Pinhiero/Bates book, I have cobbled together the following models using

`lme`

in`R`

; they are the same except for the`varPower`

statement.`model1 <- lme(sqrt(M) ~ drug + Days + drug*Days, random = ~ 1+drug+Days+drug*Days|Donor, data=D) model2 <- lme(sqrt(M) ~ drug + Days + drug*Days, random = ~ 1+drug+Days+drug*Days|Donor, data=D, varPower(form = ~fitted(.)) )`

When I compare these two models using

`anova()`

,`model2`

has a significantly increased log likelihood. However, when I examine the standardized residuals plotted against either fitted values or the independent variables, the graphs for the two models have identical shape. Note that the magnitude of the residuals is slightly greater with`varPower()`

included:`plot(model1, resid(., type="p") ~ fitted(.), abline=0) plot(model1, resid(., type="p") ~ Days|drug, abline=0)`

Does the similarity between these plots mean that I have failed to sufficiently account for the unequal variances between groups?

If so, what approaches might yield sufficient correction? Additionally, if you have any general comments about the suitability of

`lme`

here, or the structure of the model, those would be welcome as well!

**Answer**

A couple of points

- The use of
`varPower()`

seems incorrect. You need to pass it to the`weights`

argument of`lme()`

. - The shape of the residuals suggests that you have a bounded outcome, with many values at or near the boundary. You could instead consider a Beta mixed effects model for a bounded outcome or mixed model for semi-continuous data.

**Attribution***Source : Link , Question Author : happenstance1 , Answer Author : Dimitris Rizopoulos*