Reading about the true meaning of a 95% confidence ellipse, I tend to come across two explanations:

- The ellipse that contains 95% of the data
- Not the above, but the ellipse that explains the variance of the data. I am not sure I understand this correctly, but it seems to mean that if a new data point comes in, there is a 95% chance that the new variance will stay in the ellipse.

Can you shed some light?

**Answer**

Actually, neither explanation is correct.

A confidence ellipse has to do with *unobserved population parameters*, like the true population mean of your bivariate distribution. A 95% confidence ellipse for this mean is really an algorithm with the following property: if you were to replicate your sampling from the underlying distribution many times and each time calculate a confidence ellipse, then 95% of the ellipses so constructed would contain the underlying mean. (Note that each sample would of course yield a different ellipse.)
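The replication property described above can be checked with a small simulation. This is only a sketch under stated assumptions: the data are taken to be bivariate normal, the ellipse is the usual Hotelling T² region for the mean, and the function names, the true mean `mu`, the correlation `rho`, and the sample size are illustrative choices, not anything from the question. Because the region is two-dimensional, the F quantile has a closed form, so the standard library suffices.

```python
import math
import random

def t2_critical(n, alpha=0.05):
    # Hotelling T^2 cutoff in 2 dimensions: (2(n-1)/(n-2)) * F quantile with (2, n-2) df.
    # With 2 numerator df the F quantile is closed-form, which simplifies to:
    return (n - 1) * (alpha ** (-2.0 / (n - 2)) - 1.0)

def draw_sample(rng, n, mu, rho):
    # Bivariate normal with unit variances and correlation rho (assumed for illustration).
    pts = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        pts.append((mu[0] + z1, mu[1] + rho * z1 + math.sqrt(1 - rho ** 2) * z2))
    return pts

def mean_and_cov(pts):
    # Sample mean and unbiased sample covariance, returned as (sxx, sxy, syy).
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, center, cov):
    # Squared Mahalanobis distance using the explicit inverse of a 2x2 matrix.
    sxx, sxy, syy = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def coverage(n=30, reps=4000, mu=(1.0, -2.0), rho=0.6, seed=0):
    # Fraction of replications whose 95% ellipse contains the true mean.
    rng = random.Random(seed)
    crit = t2_critical(n)
    hits = 0
    for _ in range(reps):
        center, cov = mean_and_cov(draw_sample(rng, n, mu, rho))
        if n * mahalanobis_sq(mu, center, cov) <= crit:
            hits += 1
    return hits / reps

print(coverage())  # should land close to 0.95
```

Each replication produces a different ellipse; what stays fixed is the long-run share of ellipses that cover the true mean.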

Thus, a confidence ellipse will usually *not* contain 95% of the observations. In fact, as the number of observations increases, the mean will usually be better and better estimated, leading to smaller and smaller confidence ellipses, which in turn contain a smaller and smaller proportion of the actual data. (Unfortunately, some people calculate the smallest ellipse that contains 95% of their data, reminiscent of a quantile, which by itself is quite OK… but then go on to call this “quantile ellipse” a “confidence ellipse”, which, as you see, leads to confusion.)
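The contrast between the two kinds of ellipse can be illustrated numerically. The sketch below again assumes bivariate normal data (here independent standard normal coordinates, purely for convenience) and uses hypothetical helper names. For each sample size it reports the average share of observations that fall inside the 95% confidence ellipse for the mean and inside a "quantile ellipse" built from the same sample mean and covariance.

```python
import math
import random

def t2_critical(n, alpha=0.05):
    # Closed-form Hotelling T^2 cutoff for a 2-dimensional mean.
    return (n - 1) * (alpha ** (-2.0 / (n - 2)) - 1.0)

def mean_and_cov(pts):
    # Sample mean and unbiased sample covariance, returned as (sxx, sxy, syy).
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, center, cov):
    # Squared Mahalanobis distance using the explicit inverse of a 2x2 matrix.
    sxx, sxy, syy = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def average_fractions(n, reps=100, seed=0):
    # Average share of observations inside (confidence ellipse, quantile ellipse).
    rng = random.Random(seed)
    conf_cut = t2_critical(n)           # applies to n * d^2 (ellipse for the mean)
    data_cut = -2.0 * math.log(0.05)    # 95% chi-square(2) quantile, applies to d^2
    conf_total = data_total = 0.0
    for _ in range(reps):
        pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        center, cov = mean_and_cov(pts)
        d2 = [mahalanobis_sq(p, center, cov) for p in pts]
        conf_total += sum(n * d <= conf_cut for d in d2) / n
        data_total += sum(d <= data_cut for d in d2) / n
    return conf_total / reps, data_total / reps

for n in (20, 100, 500):
    conf_frac, data_frac = average_fractions(n)
    print(n, round(conf_frac, 3), round(data_frac, 3))
```

The first fraction should fall steadily as `n` grows, while the second should stay near 0.95, which is why the two ellipses should not share a name.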

The variance of the underlying population does affect the confidence ellipse, though. High variance means that the data are all over the place, so the mean is estimated less precisely, and the confidence ellipse will be larger than it would be with a smaller variance at the same sample size.
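To make the role of the variance concrete, here is one standard form of the region, assuming bivariate normal data and the Hotelling $T^2$ construction. The symbols $\bar{x}$ (sample mean), $S$ (sample covariance matrix), $n$ (sample size) and $c_n$ (the 95% cutoff) are notation introduced here for illustration:

$$
\left\{ \mu : n\,(\bar{x}-\mu)^\top S^{-1} (\bar{x}-\mu) \le c_n \right\},
\qquad
\text{area} = \pi\,\frac{c_n}{n}\,\sqrt{\det S},
\qquad
c_n = \frac{2(n-1)}{n-2}\,F_{2,\,n-2;\,0.95}.
$$

A more spread-out sample inflates $\sqrt{\det S}$ and hence the area, while a larger $n$ shrinks it, since $c_n$ settles near 5.99 as $n$ grows.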

Of course, we can also calculate confidence ellipses for any other population parameter we may wish to estimate. Or we could look at confidence regions other than ellipses, especially if we don’t know the estimated parameter to be (asymptotically) normally distributed.

The one-dimensional analogue of the confidence ellipse is the confidence interval, and browsing through previous questions in the confidence-interval tag is helpful. Our current top-voted question in this tag is particularly nice: “Why does a 95% CI not imply a 95% chance of containing the mean?” Most of the discussion there holds just as well for the higher-dimensional analogues of the one-dimensional confidence interval.

**Attribution**
*Source: Link, Question Author: Kenny, Answer Author: Community*