Does standardising independent variables reduce collinearity?

I’ve come across a very good text on Bayes/MCMC. It suggests that standardising your independent variables will make an MCMC (Metropolis) algorithm more efficient, but also that it may reduce (multi)collinearity. Can that be true? Is this something I should be doing as standard? (Sorry.)

Kruschke, J. K. (2011). Doing Bayesian Data Analysis. Academic Press.

Edit: for example,

     > data(longley)
     > cor.test(longley$Unemployed, longley$Armed.Forces)

             Pearson's product-moment correlation

     data:  longley$Unemployed and longley$Armed.Forces
     t = -0.6745, df = 14, p-value = 0.5109
     alternative hypothesis: true correlation is not equal to 0
     95 percent confidence interval:
      -0.6187113  0.3489766
     sample estimates:
            cor
     -0.1774206

     > standardise <- function(x) {(x - mean(x)) / sd(x)}
     > cor.test(standardise(longley$Unemployed), standardise(longley$Armed.Forces))

             Pearson's product-moment correlation

     data:  standardise(longley$Unemployed) and standardise(longley$Armed.Forces)
     t = -0.6745, df = 14, p-value = 0.5109
     alternative hypothesis: true correlation is not equal to 0
     95 percent confidence interval:
      -0.6187113  0.3489766
     sample estimates:
            cor
     -0.1774206

This hasn’t reduced the correlation, nor therefore the (albeit limited) linear dependence between the vectors.

What’s going on?

R

Answer

Centring doesn’t change the collinearity between the main effects at all, and neither does scaling; no linear transformation will. What it does change is the correlation between the main effects and their interactions. Even if A and B are independent, with a correlation of 0, the correlation between A and A:B will depend on the scale factors.

Try the following in an R console. Note that rnorm just generates random samples (here, 50 of them) from a normal distribution with the population mean and SD you set. The scale function standardizes the sample to a mean of 0 and SD of 1.

set.seed(1) # the samples will be controlled by setting the seed - you can try others
a <- rnorm(50, mean = 0, sd = 1)
b <- rnorm(50, mean = 0, sd = 1)
mean(a); mean(b)
# [1] 0.1004483 # not the population mean, just a sample
# [1] 0.1173265
cor(a ,b)
# [1] -0.03908718

The incidental correlation is near 0 for these independent samples. Now standardize both to a mean of 0 and SD of 1.

a <- scale( a )
b <- scale( b )
cor(a, b)
# [1,] -0.03908718

Again, this is the exact same value even though the mean is 0 and SD = 1 for both a and b.

cor(a, a*b)
# [1,] -0.01038144

This is also very near 0. (a*b can be considered the interaction term)

However, the SDs and means of predictors usually differ quite a bit, so let’s change b. Instead of taking a new sample, I’ll rescale the original b to have a mean of 5 and SD of 2.

b <- b * 2 + 5
cor(a, b)
# [1] -0.03908718

Again, that familiar correlation we’ve seen all along: the rescaling has no impact on the correlation between a and b. But!!

cor(a, a*b)
# [1,] 0.9290406

Now there is a substantial correlation, which you can make go away by centring and/or standardizing. I generally go with just the centring.
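
To see that, centre b and check the interaction correlation again (this snippet re-creates the example above so it runs on its own):

```r
# Recreate the example: standardised a, and b rescaled to mean 5, SD 2
set.seed(1)
a <- scale(rnorm(50, mean = 0, sd = 1))
b <- scale(rnorm(50, mean = 0, sd = 1)) * 2 + 5

b_c <- b - mean(b)   # centre b; its SD is still 2
cor(a, a * b_c)
# [1,] -0.01038144
```

The correlation between a and the interaction is back to the near-zero value from before the rescaling, because correlation is unaffected by multiplying b_c by a positive constant.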

EDIT: @Tim has an answer here that’s a bit more directly on topic; I didn’t have Kruschke at the time. The correlation between intercept and slope is similar to the issue of correlation with interactions, though: both are about conditional relationships. The intercept is conditional on the slope, but unlike an interaction it’s one-way, because the slope is not conditional on the intercept. Regardless, if the slope varies, so will the intercept, unless the mean of the predictor is 0. Standardizing or centring the predictor variables minimizes the effect of the intercept changing with the slope, because the mean will be at 0: the regression line then pivots at the y-axis, and its slope has no effect on the intercept.
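
A quick sketch of that last point (my own illustration, not from Kruschke): in a simple regression, the sampling covariance between the intercept and slope estimates is proportional to minus the mean of the predictor, so centring the predictor drives it to zero.

```r
set.seed(2)
x <- rnorm(50, mean = 5, sd = 2)   # predictor with a non-zero mean
y <- 1 + 0.5 * x + rnorm(50)       # arbitrary true intercept and slope
xc <- x - mean(x)                  # centred predictor

vcov(lm(y ~ x))[1, 2]    # covariance of intercept and slope: clearly negative
vcov(lm(y ~ xc))[1, 2]   # essentially zero after centring
```

With the raw x, a steeper estimated slope forces a lower estimated intercept (and vice versa); with the centred xc, the two estimates are uncorrelated.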

Attribution
Source: Link, Question Author: Community, Answer Author: John
