Why do we stabilize variance?

I came across variance-stabilizing transformations while reading about the Kaggle essay-evaluation method. They apply a variance-stabilizing transformation to kappa values before taking their mean, and then transform the result back. Even after reading the Wikipedia article on variance-stabilizing transforms, I can't understand why we actually stabilize variances. What benefit do we gain from this?
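For context, kappa and other correlation-like statistics bounded in (-1, 1) are often averaged via Fisher's z-transform (arctanh), which is approximately variance-stabilizing for them. Here is a sketch of the transform, average, back-transform step the question describes, using hypothetical per-fold kappa values (the numbers are illustrative, not from the original post):

```python
import numpy as np

def fisher_z(r):
    # Fisher's z-transform: approximately variance-stabilizing for
    # correlation-like statistics in (-1, 1); Var(z) ≈ 1/(n - 3),
    # roughly independent of the true value of r.
    return np.arctanh(r)

def inverse_fisher_z(z):
    return np.tanh(z)

# Hypothetical per-fold kappa scores
kappas = np.array([0.60, 0.72, 0.55, 0.68])

# Average in the stabilized scale, then map back to the kappa scale
mean_kappa = inverse_fisher_z(fisher_z(kappas).mean())
print(round(float(mean_kappa), 4))  # → 0.6423
```

Because arctanh is convex on (0, 1), this back-transformed mean sits slightly above the plain arithmetic mean of the kappas; the point of working in the z-scale is that the averaged quantities have (approximately) equal variance there.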


Here’s one answer: statistical inference is usually most efficient when your data are i.i.d. If they are not, you are extracting different amounts of information from different observations, which is less efficient. Another way to view this is that adding extra information to your inference (here, the functional form of the variance, via the variance-stabilizing transformation) generally improves the accuracy of your estimates, at least asymptotically. In very small samples, bothering to model the variance may increase your small-sample bias. This is a GMM-type econometric argument: adding moment conditions cannot increase your asymptotic variance, but your finite-sample bias grows with the number of overidentifying restrictions.
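The efficiency point can be illustrated with a small simulation (my own illustration, not from the original answer): when observations of a common mean carry known but unequal variances, the plain mean wastes information relative to an inverse-variance-weighted (GLS-style) mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: many replications of n observations of a common
# mean mu, each observation with a different (known) standard deviation.
mu, reps, n = 1.0, 2000, 20
sigmas = np.linspace(0.5, 3.0, n)
w = 1.0 / sigmas**2  # inverse-variance weights

naive, weighted = [], []
for _ in range(reps):
    x = rng.normal(mu, sigmas)
    naive.append(x.mean())                     # ignores the variance structure
    weighted.append(np.average(x, weights=w))  # exploits it

# The weighted estimator has visibly smaller sampling variance.
print(np.var(naive), np.var(weighted))
```

The weighted estimator down-weights the noisy observations, which is exactly the sense in which non-identically-distributed data handled naively "lose" efficiency.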

Another answer was given by cardinal: if an unknown variance appears in your asymptotic variance expression, convergence to the asymptotic distribution will be slower, and you would have to estimate that variance somehow. Pre-pivoting your data or your statistics usually improves the accuracy of asymptotic approximations.
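The point about an unknown variance in the asymptotic expression is easiest to see with a classic textbook example (not from the original thread): Poisson counts have Var(X) = E[X], so the variance depends on the unknown mean, while the square-root transform is approximately variance-stabilizing with Var(√X) ≈ 1/4 regardless of the rate. A minimal simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# For Poisson data the raw variance tracks the (unknown) mean,
# but the variance of sqrt(X) is close to 1/4 for any large rate.
stabilized = {}
for lam in (5, 20, 100):
    x = rng.poisson(lam, size=200_000)
    stabilized[lam] = np.sqrt(x).var()
    print(lam, round(x.var(), 2), round(stabilized[lam], 4))
```

After the transform, no nuisance variance needs to be estimated before pivoting a statistic, which is exactly what makes the asymptotic approximation kick in faster.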

Source: Link, Question Author: Pushpendre, Answer Author: pyrrhic
