Explanation for non-integer degrees of freedom in t test with unequal variances

The SPSS t-Test procedure reports 2 analyses when comparing 2 independent means, one analysis with equal variances assumed and one with equal variances not assumed. The degrees of freedom (df) when equal variances are assumed are always integer values (and equal n-2). The df when equal variances are not assumed are non-integer (e.g., 11.467) and nowhere near n-2. I am seeking an explanation of the logic and method used to calculate these non-integer df’s.

The Welch-Satterthwaite d.f. can be shown to be a scaled weighted harmonic mean of the two degrees of freedom, with weights in proportion to the squared estimated variances of the corresponding sample means.

$$\nu_{_W} = \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2\nu_1}+\frac{s_2^4}{n_2^2\nu_2}}$$
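As a rough illustration, the formula can be evaluated directly. The sketch below assumes $$\nu_i = n_i - 1$$ (the usual d.f. of each sample variance); the function name and the example numbers are made up for illustration and are not SPSS's own code.

```python
def welch_satterthwaite_df(s1, n1, s2, n2):
    """Approximate Welch d.f. from sample standard deviations and sample sizes."""
    r1 = s1**2 / n1          # estimated variance of the first sample mean
    r2 = s2**2 / n2          # estimated variance of the second sample mean
    nu1, nu2 = n1 - 1, n2 - 1  # assumed d.f. of each sample variance
    return (r1 + r2)**2 / (r1**2 / nu1 + r2**2 / nu2)

# Hypothetical data: s1 = 2 with n1 = 5, s2 = 3 with n2 = 10
print(f"{welch_satterthwaite_df(2.0, 5, 3.0, 10):.2f}")  # 11.56
```

The result is non-integer and lies between $$\min(\nu_1,\nu_2)=4$$ and the pooled value $$\nu_1+\nu_2=13$$.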

Note that $$r_i=s_i^2/n_i$$ is the estimated variance of the $$i^\text{th}$$ sample mean or the square of the $$i$$-th standard error of the mean. Let $$r=r_1/r_2$$ (the ratio of the estimated variances of the sample means), so

\begin{align} \nu_{_W} &= \frac{\left(r_1+r_2\right)^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}} \newline \newline &=\frac{\left(r_1+r_2\right)^2}{r_1^2+r_2^2}\frac{r_1^2+r_2^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}} \newline \newline &=\frac{\left(r+1\right)^2}{r^2+1}\frac{r_1^2+r_2^2}{\frac{r_1^2}{\nu_1}+\frac{r_2^2}{\nu_2}} \end{align}

The first factor is $$1+\text{sech}(\log(r))$$, which increases from $$1$$ at $$r=0$$ to $$2$$ at $$r=1$$ and then decreases to $$1$$ at $$r=\infty$$; it’s symmetric in $$\log r$$.
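In case the hyperbolic-secant form isn't obvious, here is one way to see it, using only $$\text{sech}(x)=2/(e^{x}+e^{-x})$$:

$$\frac{(r+1)^2}{r^2+1} = \frac{r^2+1+2r}{r^2+1} = 1+\frac{2r}{r^2+1} = 1+\frac{2}{r+r^{-1}} = 1+\frac{2}{e^{\log r}+e^{-\log r}} = 1+\text{sech}(\log r)$$

Since $$\text{sech}$$ is an even function with maximum $$1$$ at $$0$$, the symmetry in $$\log r$$ and the peak value of $$2$$ at $$r=1$$ follow directly.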

The second factor is a weighted harmonic mean of the d.f.:

$$H(\underline{x})=\frac{\sum_{i=1}^n w_i }{ \sum_{i=1}^n \frac{w_i}{x_i}}\,,$$

where the $$x_i=\nu_i$$ are the two d.f. and $$w_i=r_i^2$$ are the relative weights given to them.

Which is to say, when $$r_1/r_2$$ is very large, $$\nu_{_W}$$ converges to $$\nu_1$$. When $$r_1/r_2$$ is very close to $$0$$ it converges to $$\nu_2$$. When $$r_1=r_2$$ you get twice the harmonic mean of the d.f., and when $$r_1/r_2=\nu_1/\nu_2$$ (which, for equal sample sizes, means $$s_1^2=s_2^2$$) you get the usual equal-variance t-test d.f., $$\nu_1+\nu_2$$, which is also the maximum possible value for $$\nu_{_W}$$ (given the sample sizes).
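These limiting cases are easy to check numerically. The following is a minimal sketch written in terms of $$r_1, r_2, \nu_1, \nu_2$$; the helper name `welch_df` and the particular values of the d.f. are arbitrary choices for illustration.

```python
def welch_df(r1, r2, nu1, nu2):
    """Welch-Satterthwaite d.f. in terms of the estimated variances of the means."""
    return (r1 + r2)**2 / (r1**2 / nu1 + r2**2 / nu2)

nu1, nu2 = 4, 9  # illustrative d.f. (n1 = 5, n2 = 10)
harmonic_mean = 2 / (1 / nu1 + 1 / nu2)

print(welch_df(1e6, 1.0, nu1, nu2))   # r1/r2 very large: close to nu1
print(welch_df(1.0, 1e6, nu1, nu2))   # r1/r2 near 0: close to nu2
print(welch_df(1.0, 1.0, nu1, nu2), 2 * harmonic_mean)  # r1 = r2: the two agree
print(welch_df(1.0, 1.0, 7, 7))       # equal n and equal variances: nu1 + nu2
```

In the last line both samples have the same size and the same variance, so the result matches the pooled d.f. of the equal-variance test (up to floating-point rounding).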

With an equal-variance t-test, if the assumptions hold, the square of the denominator is a constant times a chi-square random variate.

The square of the denominator of the Welch t-test isn’t (a constant times) a chi-square; however, it’s often not too bad an approximation. A relevant discussion can be found here.

A more textbook-style derivation can be found here.