Why ANOVA/Regression results change when controlling for another variable

This question might be very basic, but somehow I don’t understand this point.

Suppose initially I used a univariate regression equation such as


I’ll get some coefficient values (say 0.5). Now, I’m using the same structure of the regression model, but added another independent variable. So, the new equation is


Then the new coefficients value will be b=0.3 & c=0.4.

My question is why coefficient’s value changes when we add another independent variable?

Hope I can put my question clearly.


Linear regression can be illustrated geometrically in terms of an orthogonal projection of the predicted variable vector $\boldsymbol{y}$ onto the space defined by the predictor vectors $\boldsymbol{x}_{i}$. This approach is nicely explained in Wicken’s book “The Geometry of Multivariate Statistics” (1994). Without loss of generality, assume centered variables. In the following diagrams, the length of a vector equals its standard deviation, and the cosine of the angle between two vectors equals their correlation (see here). The simple linear regression from $\boldsymbol{y}$ onto $\boldsymbol{x}$ then looks like this:

enter image description here

$\hat{\boldsymbol{y}} = b \cdot \boldsymbol{x}$ is the prediction that results from the orthogonal projection of $\boldsymbol{y}$ onto the subspace defined by $\boldsymbol{x}$. $b$ is the projection of $\boldsymbol{y}$ in subspace coordinates (basis vector $\boldsymbol{x}$). This prediction minimizes the error $\boldsymbol{e} = \boldsymbol{y} – \hat{\boldsymbol{y}}$, i.e., it finds the closest point to $\boldsymbol{y}$ in the subspace defined by $\boldsymbol{x}$ (recall that minimizing the error sum of squares means minimizing the variance of the error, i.e., its squared length). With two correlated predictors $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$, the situation looks like this:

enter image description here

$\boldsymbol{y}$ is projected orthogonally onto $U$, the subspace (plane) spanned by $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$. The prediction $\hat{\boldsymbol{y}} = b_{1} \cdot \boldsymbol{x}_{1} + b_{2} \cdot \boldsymbol{x}_{2}$ is this projection. $b_{1}$ and $b_{2}$ are thus the ends of the dotted lines, i.e. the coordinates of $\hat{\boldsymbol{y}}$ in subspace coordinates (basis vectors $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$).

The next thing to realize is that the orthogonal projections of $\hat{\boldsymbol{y}}$ onto $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ are the same as the orthogonal projections of $\boldsymbol{y}$ itself onto $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$.

enter image description here

This allows us to directly compare the regression weights from each simple regression with the regression weights from the multiple regression:

enter image description here

$\hat{\boldsymbol{y}}_{1}$ and $\hat{\boldsymbol{y}}_{2}$ are the predictions from the simple regressions $\boldsymbol{y}$ onto $\boldsymbol{x}_{1}$, and $\boldsymbol{y}$ onto $\boldsymbol{x}_{2}$. Their endpoints give the individual regression weights $b^{1} = \rho_{x_{1} y} \cdot \sigma_{y}$ and $b^{2} = \rho_{x_{2} y} \cdot \sigma_{y}$, where $\rho_{x_{1} y}$ is the correlation between $\boldsymbol{x}_{1}$ and $\boldsymbol{y}$, and $\sigma_{y}$ is the standard deviation of $\boldsymbol{y}$. In contrast, the endpoints of the dotted lines give the regression weights from the multiple regression of $\boldsymbol{y}$ onto $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$: $b_{1} = \beta_{1} \sigma_{y}$, where $\beta_{1}$ is the standardized regression coefficient.

Now it is easy to see that $b^{1}$ and $b^{2}$ will coincide exactly with $b_{1}$ and $b_{2}$ only if $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ are orthogonal (or if $\boldsymbol{y}$ is orthogonal to the plane spanned by $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$). It is also easy to geometrically construct cases that sometimes seem puzzling, e.g., when the regression weight has the opposite sign as the bivariate correlation between a predictor and the predicted variable:

enter image description here

Here, $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ are highly correlated. Now the sign of the correlation between $\boldsymbol{y}$ and $\boldsymbol{x}_{1}$ is positive (red line: orthogonal projection of $\boldsymbol{y}$ onto $\boldsymbol{x}_{1}$), but the regression weight from the multiple regression is negative (end of green line onto subspace defined by $\boldsymbol{x}_{1}$.

Source : Link , Question Author : Beta , Answer Author : caracal

Leave a Comment