How should one control for group and individual differences in pre-treatment scores in a randomised controlled trial?

Andrew Gelman, in the book he wrote with Jennifer Hill, states in Chapter 9 (Section 9.3), on page 177:

It is only appropriate to control for pre-treatment predictors, or, more generally, predictors that would not be affected by the treatment (such as race or age). This point will be illustrated more concretely in Section 9.7…

And there (Section 9.7 is entitled “Do not control for post-treatment variables”) he discusses the problem of measuring mediating variables rather than the pre-post change problem directly.

It is important to state here that I think Gelman/Hill is a brilliant text… And I’m thoroughly enjoying understanding it. However, this bit piqued my interest, as it brings to mind Everitt & Pickles’s approach to the same problem.

Everitt and Pickles are of the opinion that using a change score (Score B – Score A) will tend to bias your findings in favour of the treatment, whereas including baseline scores in the model is more conservative. They back this up with a simulation, and it’s pretty persuasive.

My understanding up to here has been that what you are controlling for is group differences in baseline scores that might cause the apparent treatment effect to be greater than it is, or to exist, when it does not. It is also my understanding that this is because regression to the mean is at work, so that higher baseline scores will be associated with greater decreases and vice versa, independent of the treatment effect.

Everitt is strenuously against “change scores”, and Gelman seems to be advising against including baseline scores in the model.

However, over the next 2-3 pages Gelman himself works through an example that includes pre-test scores as a predictor. He gives the caveat that you then get a range of plausible treatment effects that are conditional on the pre-test score, not a range of treatment effects representing merely uncertainty in the effects.

My opinion is that using “change scores” seems not to really be doing anything about regression to the mean, whereas including the baseline score as a predictor allows baseline group differences to cancel out, essentially introducing a covariance structure.
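
To make the comparison concrete, here is a minimal simulation sketch. It is not Everitt & Pickles’s simulation; the data-generating process (standard normal baselines, a follow-up score that depends on baseline with slope 0.5, no true treatment effect, 50 patients per arm) and all function names are assumptions made for illustration. It randomises many trials and computes the treatment estimate two ways: as a difference in mean change scores, and as the baseline-adjusted (ANCOVA) difference, which is what an ordinary regression of follow-up score on group and baseline gives for the group coefficient.

```python
import random
from statistics import fmean, pvariance


def simulate_trial(rng, n_per_arm=50, true_effect=0.0, slope=0.5):
    """Return {"control": (pre, post), "treated": (pre, post)} for one trial."""
    arms = {}
    for arm, effect in (("control", 0.0), ("treated", true_effect)):
        pre = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        # Assumed model: a slope below 1 builds in regression to the mean.
        post = [slope * x + effect + rng.gauss(0.0, 1.0) for x in pre]
        arms[arm] = (pre, post)
    return arms


def baseline_imbalance(arms):
    """Chance difference in mean baseline score, treated minus control."""
    return fmean(arms["treated"][0]) - fmean(arms["control"][0])


def change_score_estimate(arms):
    """Difference between arms in mean (post - pre)."""
    post_diff = fmean(arms["treated"][1]) - fmean(arms["control"][1])
    return post_diff - baseline_imbalance(arms)


def ancova_estimate(arms):
    """Group coefficient from regressing post on group and pre (common slope)."""
    sxy = sxx = 0.0
    for pre, post in arms.values():
        mx, my = fmean(pre), fmean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    pooled_slope = sxy / sxx  # pooled within-arm slope of post on pre
    post_diff = fmean(arms["treated"][1]) - fmean(arms["control"][1])
    return post_diff - pooled_slope * baseline_imbalance(arms)


def ols_slope(xs, ys):
    """Slope of a simple least-squares line of ys on xs."""
    mx, my = fmean(xs), fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


def run_simulation(n_trials=2000, seed=1):
    rng = random.Random(seed)
    imbalance, change, ancova = [], [], []
    for _ in range(n_trials):
        arms = simulate_trial(rng)
        imbalance.append(baseline_imbalance(arms))
        change.append(change_score_estimate(arms))
        ancova.append(ancova_estimate(arms))
    return {
        "change_mean": fmean(change),
        "ancova_mean": fmean(ancova),
        "change_var": pvariance(change),
        "ancova_var": pvariance(ancova),
        # How strongly each estimate tracks the chance baseline imbalance.
        "change_vs_imbalance": ols_slope(imbalance, change),
        "ancova_vs_imbalance": ols_slope(imbalance, ancova),
    }


if __name__ == "__main__":
    for name, value in run_simulation().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the two estimators differ only in the multiplier applied to the baseline imbalance: the change score fixes it at 1, while the adjusted analysis estimates it from the data. With a true slope of 0.5, the change-score estimate should move with the chance baseline imbalance at a rate of roughly 0.5 - 1 = -0.5, while the adjusted estimate should be close to unrelated to it and less variable across trials. How far that carries over to a real trial depends on how well the assumed linear model fits.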

I’m a doctor and I have to make real decisions about which treatments work. So what should I do? Include each person’s baseline scores or use “change scores”?


{I’m cheating, adding a comment too long for the comment box.} Thanks for your explication. Sounds as if you’ve found some great sources, and done a lot to extract good lessons from them. There are other sources worth reading, e.g., a chapter in Cook and Campbell’s Quasi-Experimentation; a section in Geoffrey Keppel’s Design and Analysis; and I think at least one article by Donald Rubin. I’ll also offer a lesson I’ve gleaned (paraphrased) from Damian Betebenner’s work on student test scores:

Is it reasonable to expect that no improvement would occur absent a certain intervention? If so, it makes sense to analyze gain scores, as with analysis of variance. Is it instead reasonable to think that all students would improve to some degree even without the intervention, and that their posttest score could be predicted as a linear function of their pretest score? If so, analysis of covariance would make sense.

from ANOVA/ANCOVA Flow Chart

Also, perhaps you know this, but Lord’s Paradox, referred to by Betebenner, involves the possibility of obtaining, with the same data, a result of zero mean difference using one of these two methods but a significant difference using the other.
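
A small sketch of how that can happen, using made-up data rather than anything from Betebenner or Lord: two pre-existing (not randomised) groups start with different mean scores, neither group changes on average, and within each group the follow-up score regresses toward that group’s own mean. The group means (0 and 2), the within-group correlation (0.5) and the function names are all assumptions for illustration.

```python
import random
from statistics import fmean


def make_group(rng, n, mean, rho=0.5):
    """Pre/post scores with the same mean and unit variance at both times."""
    pre = [rng.gauss(mean, 1.0) for _ in range(n)]
    noise_sd = (1.0 - rho ** 2) ** 0.5  # keeps the post-score variance at 1
    post = [mean + rho * (x - mean) + rng.gauss(0.0, noise_sd) for x in pre]
    return pre, post


def gain_difference(group_a, group_b):
    """Difference in mean gain (post - pre), group_b minus group_a."""
    gain_a = fmean(group_a[1]) - fmean(group_a[0])
    gain_b = fmean(group_b[1]) - fmean(group_b[0])
    return gain_b - gain_a


def ancova_difference(group_a, group_b):
    """Group difference in post scores after adjusting for pre (common slope)."""
    sxy = sxx = 0.0
    for pre, post in (group_a, group_b):
        mx, my = fmean(pre), fmean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    pooled_slope = sxy / sxx
    post_diff = fmean(group_b[1]) - fmean(group_a[1])
    pre_diff = fmean(group_b[0]) - fmean(group_a[0])
    return post_diff - pooled_slope * pre_diff


def lords_paradox_demo(n=5000, seed=7):
    rng = random.Random(seed)
    low = make_group(rng, n, mean=0.0)
    high = make_group(rng, n, mean=2.0)
    return {
        "gain": gain_difference(low, high),
        "ancova": ancova_difference(low, high),
    }


if __name__ == "__main__":
    for name, value in lords_paradox_demo().items():
        print(f"{name}: {value:.3f}")
```

With these settings the gain-score difference should sit near zero, while the adjusted difference should sit near (1 - 0.5) × 2 = 1, even though both are computed from the same data. Neither number is a computational error; they answer different questions, which is why the choice matters most when the groups were not formed by randomisation.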

My take, based on readings perhaps more limited than yours, is that both methods have a place and that Everitt and perhaps also Gelman, great as they are, are in this case taking too hard a line.

Source : Link , Question Author : Community , Answer Author : rolando2
