# What is an example of perfect multicollinearity?

What is an example of perfect collinearity in terms of the design matrix $X$?

I would like an example where $\hat \beta = (X'X)^{-1}X'Y$ can’t be estimated because $(X'X)$ is not invertible.

Here is an example with 3 variables, $y$, $x_1$ and $x_2$, related by the equation

$$y = x_1 + x_2 + \varepsilon$$

where $\varepsilon \sim N(0,1)$.

The particular data are

         y x1 x2
1 4.520866  1  2
2 6.849811  2  4
3 6.539804  3  6


Here $x_2 = 2x_1$ for every observation, so $x_2$ is an exact multiple of $x_1$ and we have perfect collinearity.

We can write the model as

$$Y = X \beta + \varepsilon$$

where:

$$Y = \begin{bmatrix}4.52 \\6.85 \\6.54\end{bmatrix}$$

$$X = \begin{bmatrix}1 & 1 & 2\\1 & 2 & 4 \\1 & 3 & 6\end{bmatrix}$$
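The dependence can also be read directly off the columns of $X$. As a short sketch (the vector $v$ is notation introduced here for illustration), the third column is twice the second, so a nonzero $v$ is mapped to zero by $X$, and therefore by $X'X$ as well:

$$v = \begin{bmatrix}0 \\ 2 \\ -1\end{bmatrix}, \qquad Xv = 2\begin{bmatrix}1 \\ 2 \\ 3\end{bmatrix} - \begin{bmatrix}2 \\ 4 \\ 6\end{bmatrix} = \begin{bmatrix}0 \\ 0 \\ 0\end{bmatrix} \quad\Longrightarrow\quad X'Xv = X'(Xv) = 0$$

A matrix that sends a nonzero vector to zero has no inverse, which is why $(X'X)^{-1}$ does not exist for these data.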

So we have

$$XX' = \begin{bmatrix}1 & 1 & 2\\1 & 2 & 4 \\1 & 3 & 6\end{bmatrix} \begin{bmatrix}1 & 1 & 1\\1 & 2 & 3 \\2 & 4 & 6\end{bmatrix} = \begin{bmatrix}6 & 11 & 16\\11 & 21 & 31 \\16 & 31 & 46\end{bmatrix}$$

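Now we calculate the determinant of $XX'$. The OLS formula involves $X'X$ rather than $XX'$, but $X$ is square here, so $\det(XX') = \det(X)\det(X') = \det(X'X)$ and the two products are singular together: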

$$\det XX' = 6\begin{vmatrix}21 & 31 \\31 & 46\end{vmatrix} - 11 \begin{vmatrix}11 & 31 \\16 & 46\end{vmatrix} + 16\begin{vmatrix}11 & 21 \\16 & 31\end{vmatrix}= 0$$
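Expanding the three $2 \times 2$ minors makes the arithmetic explicit (a worked check of the cofactor expansion, using only the entries of the matrix above):

$$\begin{aligned}\det XX' &= 6(21\cdot 46 - 31\cdot 31) - 11(11\cdot 46 - 31\cdot 16) + 16(11\cdot 31 - 21\cdot 16)\\ &= 6(5) - 11(10) + 16(5)\\ &= 30 - 110 + 80 = 0\end{aligned}$$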

In R we can show this as follows:

> x1 <- c(1,2,3)


Create x2, a multiple of x1:

> x2 <- x1*2


Create y, a linear combination of x1 and x2 plus some random noise:

> y <- x1 + x2 + rnorm(3,0,1)


Observe that

> summary(m0 <- lm(y~x1+x2))


fails to estimate a value for the x2 coefficient (the figures below come from one random draw of rnorm, so a rerun will give different numbers):

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.9512     1.6457   2.401    0.251
x1            1.0095     0.7618   1.325    0.412
x2                NA         NA      NA       NA

Residual standard error: 0.02583 on 1 degrees of freedom
Multiple R-squared:      1,     Adjusted R-squared:  0.9999
F-statistic: 2.981e+04 on 1 and 1 DF,  p-value: 0.003687


The model matrix $X$ is:

> (X <- model.matrix(m0))

  (Intercept) x1 x2
1           1  1  2
2           1  2  4
3           1  3  6


So $XX'$ is

> (XXdash <- X %*% t(X))
   1  2  3
1  6 11 16
2 11 21 31
3 16 31 46


which is not invertible, as shown by

> solve(XXdash)
Error in solve.default(XXdash) :
Lapack routine dgesv: system is exactly singular: U[3,3] = 0


Or:

> det(XXdash)
[1] 0
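The question asks about $X'X$ specifically, so it may help to run the same check on that product. The following is a minimal sketch using base R only; the names XtX, rank_XtX and inv are illustrative and not part of the session above, and the design matrix is rebuilt by hand so the block does not depend on the random y.

```r
# Rebuild the design matrix from the example: intercept, x1, and x2 = 2 * x1
x1 <- c(1, 2, 3)
x2 <- 2 * x1
X  <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# X'X, the matrix that appears in the OLS formula
XtX <- crossprod(X)   # equivalent to t(X) %*% X
print(XtX)

# A rank below ncol(X) means X'X cannot be inverted
rank_XtX <- qr(XtX)$rank
cat("rank of X'X:", rank_XtX, "with", ncol(X), "columns\n")

# solve() is expected to raise an error; tryCatch turns that into NULL
inv <- tryCatch(solve(XtX), error = function(e) NULL)
cat("solve() failed:", is.null(inv), "\n")
```

Checking the rank rather than testing det() against zero is a deliberate choice: a determinant computed in floating point can come out as a tiny nonzero number for larger or less tidy matrices, whereas a rank deficiency is reported with a tolerance.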