In the Wikipedia article on ANOVA, it says
In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups.
My understanding of this is that ANOVA is the same as t-test when it comes to a two-group comparison.
However, in my simple example below (in R), ANOVA and t-test give similar but slightly different p-values. Can anyone explain why?
x1 = rnorm(100, mean = 0, sd = 1)
x2 = rnorm(100, mean = 0.5, sd = 1)
y1 = rnorm(100, mean = 0, sd = 10)
y2 = rnorm(100, mean = 0.5, sd = 10)

t.test(x1, x2)$p.value  # 0.0002695961
t.test(y1, y2)$p.value  # 0.8190363

df1 = as.data.frame(rbind(cbind(x = x1, type = 1), cbind(x = x2, type = 2)))
df2 = as.data.frame(rbind(cbind(x = y1, type = 1), cbind(x = y2, type = 2)))

anova(lm(x ~ type, df1))$`Pr(>F)`  # 0.0002695578
anova(lm(x ~ type, df2))$`Pr(>F)`  # 0.8190279
The difference comes from the variance assumption. In lm(), the residuals are assumed to have constant variance, so the ANOVA uses a pooled variance estimate. By default, however, t.test() performs Welch's t-test, which does not assume equal variances. Thus, by setting var.equal = TRUE in t.test(), you should get the same p-value as the ANOVA.
From ?t.test, var.equal indicates whether to treat the two variances as being equal. If TRUE, then the pooled variance is used to estimate the variance; otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
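The equivalence rests on a standard identity: for exactly two groups, the pooled-variance t statistic satisfies t² = F, where F is the one-way ANOVA F statistic. A minimal Python sketch (hypothetical data, not the R session above; only the identity itself is standard theory) checks this numerically:

```python
import math
import random

random.seed(0)
a = [random.gauss(0.0, 1.0) for _ in range(100)]
b = [random.gauss(0.5, 1.0) for _ in range(100)]

def mean(v):
    return sum(v) / len(v)

def pooled_t(a, b):
    # Two-sample t statistic with pooled variance (the var.equal = TRUE case).
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)  # pooled variance estimate
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

def anova_f(a, b):
    # One-way ANOVA F statistic for two groups: between / within mean squares.
    n = len(a) + len(b)
    gm = mean(a + b)  # grand mean
    ss_between = len(a) * (mean(a) - gm) ** 2 + len(b) * (mean(b) - gm) ** 2
    ss_within = (sum((x - mean(a)) ** 2 for x in a)
                 + sum((x - mean(b)) ** 2 for x in b))
    return (ss_between / 1) / (ss_within / (n - 2))  # df = 1 and n - 2

t = pooled_t(a, b)
F = anova_f(a, b)
print(abs(t ** 2 - F))  # zero up to floating-point error
```

Since t² = F and the degrees of freedom agree (n - 2 in both cases), the two tests give identical p-values once the pooled variance is used; Welch's test instead adjusts the degrees of freedom, which is why the p-values in the question differ slightly.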