Would a $(1-\alpha)100\%$ confidence interval for the variance be narrower if we knew the mean a priori?

Let’s say we know the mean of a given distribution. Does this affect the interval estimate of the variance of a random variable (which is otherwise computed using the sample variance)? That is, can we obtain a shorter interval for the same confidence level?

Answer

I am not entirely sure my answer is correct, but I would argue there is no general relationship. Here is my point:

Let us study the case where the confidence interval for the variance is well understood, viz. sampling from a normal distribution (as you indicate in the question's tags, though not in the question itself). See the discussion here and here.

A confidence interval for $\sigma^2$ follows from the pivot $T=n\hat{\sigma}^2/\sigma^2\sim\chi^2_{n-1}$, where $\hat{\sigma}^2=\frac{1}{n}\sum_i(X_i-\bar{X})^2$. (This is just another way of writing the possibly more familiar expression $T=(n-1)s^2/\sigma^2\sim\chi^2_{n-1}$, where $s^2=\frac{1}{n-1}\sum_i(X_i-\bar{X})^2$.)
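As a quick sanity check of this equivalence in R (a throwaway snippet with an arbitrary sample, not part of the original derivation):

    # n*sigmahat^2 and (n-1)*s^2 are the same sum of squared deviations
    x <- rnorm(8)                                # arbitrary sample
    all.equal(length(x) * mean((x - mean(x))^2),
              (length(x) - 1) * var(x))          # TRUE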

We thus have
\begin{align*}
1-\alpha&=\Pr\{c_l^{n-1}<T<c_u^{n-1}\}\\
&=\Pr\left\{\frac{c_l^{n-1}}{n\hat{\sigma}^2}<\frac{1}{\sigma^2}<\frac{c_u^{n-1}}{n\hat{\sigma}^2}\right\}\\
&=\Pr\left\{\frac{n\hat{\sigma}^2}{c_u^{n-1}}<\sigma^2<\frac{n\hat{\sigma}^2}{c_l^{n-1}}\right\}
\end{align*}
Hence, a confidence interval is $(n\hat{\sigma}^2/c_u^{n-1},n\hat{\sigma}^2/c_l^{n-1})$. We may choose $c_l^{n-1}$ and $c_u^{n-1}$ as the quantiles $c_u^{n-1}=\chi^2_{n-1,1-\alpha/2}$ and $c_l^{n-1}=\chi^2_{n-1,\alpha/2}$.
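To make this concrete, here is a minimal R sketch of the equal-tailed interval; the seed, sample size, mean, and variance are arbitrary choices for illustration:

    set.seed(123)                           # arbitrary seed for reproducibility
    n <- 10; mu <- 1; alpha <- 0.05
    x <- rnorm(n, mean = mu, sd = 2)        # true sigma^2 = 4
    s2hat <- mean((x - mean(x))^2)          # sigmahat^2, divisor n
    c(lower = n * s2hat / qchisq(1 - alpha/2, df = n - 1),
      upper = n * s2hat / qchisq(alpha/2,     df = n - 1))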

(Notice in passing that, for either variance estimate, the equal-tailed quantiles yield a c.i. with the right coverage probability but not an optimal one, i.e. not the shortest possible, because the $\chi^2$-distribution is skewed. For a confidence interval to be as short as possible, we require the density to be identical at the lower and upper end of the c.i., given some additional conditions like unimodality. I do not know whether using that optimal c.i. would change anything in this answer.)
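If one did want the shortest interval of the form $(n\hat{\sigma}^2/c_u,n\hat{\sigma}^2/c_l)$, it can be found numerically; here is a sketch (my own, with a hypothetical helper name) that minimizes the length factor $1/c_l-1/c_u$ subject to the coverage constraint:

    # Minimize 1/c_l - 1/c_u subject to P(c_l < T < c_u) = 1 - alpha, T ~ chi^2_df.
    # Parametrize by the lower-tail probability p; the upper cutoff then follows.
    shortest_ci_constants <- function(df, alpha = 0.05) {
      len <- function(p) 1 / qchisq(p, df) - 1 / qchisq(p + 1 - alpha, df)
      p_opt <- optimize(len, interval = c(1e-6, alpha - 1e-6))$minimum
      c(cl = qchisq(p_opt, df), cu = qchisq(p_opt + 1 - alpha, df))
    }

    shortest_ci_constants(df = 9)        # shortest-length constants
    qchisq(c(0.025, 0.975), df = 9)      # equal-tailed constants, for comparison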

As explained in the links, $T'=ns_0^2/\sigma^2\sim\chi^2_n$, where $s_0^2=\frac{1}{n}\sum_i(X_i-\mu)^2$ uses the known mean. Hence, we get another valid confidence interval
\begin{align*}
1-\alpha&=\Pr\{c_l^{n}<T'<c_u^{n}\}\\
&=\Pr\left\{\frac{ns_0^2}{c_u^{n}}<\sigma^2<\frac{ns_0^2}{c_l^{n}}\right\}
\end{align*}
Here, $c_l^{n}$ and $c_u^{n}$ will thus be quantiles from the $\chi^2_n$-distribution.
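Continuing the sketch from above, the corresponding interval based on the known mean would be:

    s02 <- mean((x - mu)^2)                 # s_0^2 uses the known mean mu
    c(lower = n * s02 / qchisq(1 - alpha/2, df = n),
      upper = n * s02 / qchisq(alpha/2,     df = n))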

The widths of the confidence intervals are
$$
w_T=\frac{n\hat{\sigma}^2(c_u^{n-1}-c_l^{n-1})}{c_l^{n-1}c_u^{n-1}}
$$
and
$$
w_{T'}=\frac{ns_0^2(c_u^{n}-c_l^{n})}{c_l^{n}c_u^{n}}
$$
The relative width is
$$
\frac{w_T}{w_{T'}}=\frac{\hat{\sigma}^2}{s_0^2}\cdot\frac{c_u^{n-1}-c_l^{n-1}}{c_u^{n}-c_l^{n}}\cdot\frac{c_l^{n}c_u^{n}}{c_l^{n-1}c_u^{n-1}}
$$
We know that $\hat{\sigma}^2/s_0^2\leq1$, since the sample mean minimizes the sum of squared deviations; indeed, $s_0^2=\hat{\sigma}^2+(\bar{X}-\mu)^2$. Beyond that, I see few general results regarding the width of the interval, as I am not aware of clear-cut results on how differences and products of upper and lower $\chi^2$ quantiles behave as we increase the degrees of freedom by one (but see the figure below).

For example, letting
$$
r_n:=\frac{c_u^{n-1}-c_l^{n-1}}{c_u^{n}-c_l^{n}}\cdot\frac{c_l^{n}c_u^{n}}{c_l^{n-1}c_u^{n-1}},
$$
we have $r_{10}\approx1.226$ for $\alpha=0.05$, meaning that the c.i. based on $\hat{\sigma}^2$ will be shorter if
$$
\hat{\sigma}^2\leq\frac{s_0^2}{1.226}.
$$
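This value is easy to verify numerically (the helper name `r_n` is my own):

    # r_n from the chi^2 quantiles entering the two intervals
    r_n <- function(n, alpha = 0.05) {
      cl_a <- qchisq(alpha/2, df = n - 1); cu_a <- qchisq(1 - alpha/2, df = n - 1)
      cl_b <- qchisq(alpha/2, df = n);     cu_b <- qchisq(1 - alpha/2, df = n)
      (cu_a - cl_a) / (cu_b - cl_b) * (cl_b * cu_b) / (cl_a * cu_a)
    }
    r_n(10)   # approx. 1.226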

Using the code below, I ran a little simulation study suggesting that the interval based on $s_0^2$ will win most of the time. (See the link posted in Aksakal’s answer for a large-sample rationalization of this result.)

The probability seems to stabilize in $n$, but I am not aware of an analytical finite-sample explanation:

[Figure: simulated probability that the interval based on $s_0^2$ is the shorter one, plotted against $n$]

    # Fraction of replications in which the c.i. based on s_0^2 (known mean)
    # is shorter than the c.i. based on sigmahat^2 (estimated mean)
    IntervalLengthsSigma2 <- function(n, alpha = 0.05, reps = 100000, mu = 1) {
      # chi^2 quantiles: "_a" uses n-1 df (estimated mean), "_b" uses n df (known mean)
      cl_a <- qchisq(alpha/2,     df = n - 1)
      cu_a <- qchisq(1 - alpha/2, df = n - 1)
      cl_b <- qchisq(alpha/2,     df = n)
      cu_b <- qchisq(1 - alpha/2, df = n)

      winners02 <- rep(NA, reps)

      for (i in 1:reps) {
        x   <- rnorm(n, mean = mu)
        s2  <- sum((x - mean(x))^2) / n    # sigmahat^2
        s02 <- sum((x - mu)^2) / n         # s_0^2

        ci_a <- c(n * s2  / cu_a, n * s2  / cl_a)
        ci_b <- c(n * s02 / cu_b, n * s02 / cl_b)

        # 1 if the known-mean interval is the shorter one
        winners02[i] <- as.numeric(ci_a[2] - ci_a[1] > ci_b[2] - ci_b[1])
      }
      mean(winners02)
    }

    nvalues <- seq(5, 200, by = 10)
    plot(nvalues, sapply(nvalues, IntervalLengthsSigma2),
         pch = 19, col = "lightblue", type = "b")

The next figure plots $r_n$ against $n$, revealing (as intuition would suggest) that the ratio tends to 1. Moreover, since $\bar{X}\to_p\mu$ as $n$ grows, the difference between the widths of the two c.i.s vanishes as $n\to\infty$. (See again the link posted in Aksakal’s answer for a large-sample rationalization of this result.)

[Figure: $r_n$ plotted against $n$; the ratio tends to 1]
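A sketch that would reproduce such a plot, reusing the `r_n()` helper defined above:

    ns <- 2:200
    plot(ns, sapply(ns, r_n), type = "l", xlab = "n", ylab = expression(r[n]))
    abline(h = 1, lty = 2)    # the ratio tends to 1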

Attribution
Source: Link, Question Author: martianwars, Answer Author: Community
