Which OLS assumptions are colliders violating?

The following webpage says that:

We should not control for a collider variable!

Which OLS assumptions are colliders violating?


I will assume models without intercepts to have shorter notation. Say the structural causal model is
\begin{align}
Y&=\beta_1X+u, \\
Z&=\gamma_1X+\gamma_2Y+v, \\
X&=w
\end{align}

with $u,v,w$ being mutually independent zero-mean exogenous structural errors so that $Z$ is a collider: $X\rightarrow Z\leftarrow Y$.

Let us specify a linear regression as
$$Y=\alpha_1X+\alpha_2Z+\varepsilon$$
and get ready to estimate it with OLS. We would wish for $\hat\alpha_1^{OLS}\rightarrow\beta_1$ as $n\rightarrow\infty$. This would be the case if the following two conditions held simultaneously:

  1. $\alpha_1=\beta_1$ and
  2. the relevant OLS assumptions were satisfied.

However, this is not the case. Suppose $\alpha_1=\beta_1$. Then from the structural causal model and the specified regression we get
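\begin{align}
\varepsilon&=-\alpha_2Z+u \\
&=-\alpha_2(\gamma_1X+\gamma_2Y+v)+u \\
&=-\alpha_2\bigl(\gamma_1X+\gamma_2(\beta_1X+u)+v\bigr)+u \\
&=-\alpha_2(\gamma_1+\gamma_2\beta_1)X+(1-\alpha_2\gamma_2)u-\alpha_2v.
\end{align}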

Thus $\varepsilon$ is a linear function of $X$. This violates the assumption $\mathbb{E}(\varepsilon|X)=0$, which Wooldridge calls Assumption MLR.4 (Zero Conditional Mean) in “Introductory Econometrics: A Modern Approach”. Note that this assumption is specific to the desired causal interpretation of the regression parameters; noncausal interpretations (such as regression as a model of the conditional expectation function of $Y|X,Z$) do not require it. Since it is violated here, the two conditions above cannot hold simultaneously. Therefore, $\beta_1$ cannot be the target to which the OLS estimator of $\alpha_1$ converges.
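
The argument can be checked numerically. The following is a minimal simulation sketch, not part of the original answer. It assumes standard normal errors, takes $X=w$, and uses illustrative parameter values ($\beta_1=1$, $\gamma_1=0.5$, $\gamma_2=0.8$); the function names are hypothetical. It fits the no-intercept regression of $Y$ on $X$ and $Z$ by solving the 2×2 normal equations, and compares the estimate with the population limit $\beta_1-\alpha_2(\gamma_1+\gamma_2\beta_1)$, where $\alpha_2=\gamma_2\sigma_u^2/(\gamma_2^2\sigma_u^2+\sigma_v^2)$. That limit is my own derivation under these assumptions (partialling $X$ out of $Y$ and $Z$), so treat it as a cross-check rather than a quoted result.

```python
import random


def simulate_collider_ols(beta1=1.0, gamma1=0.5, gamma2=0.8, n=100_000, seed=0):
    """Draw n samples from the structural model and regress Y on X and Z
    (no intercept) by solving the 2x2 normal equations directly."""
    rng = random.Random(seed)
    sxx = sxz = szz = sxy = szy = 0.0
    for _ in range(n):
        # Assumption: all structural errors are independent standard normals.
        w = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        x = w                                # X is exogenous
        y = beta1 * x + u                    # X -> Y
        z = gamma1 * x + gamma2 * y + v      # X -> Z <- Y (collider)
        sxx += x * x
        sxz += x * z
        szz += z * z
        sxy += x * y
        szy += z * y
    det = sxx * szz - sxz * sxz
    alpha1_hat = (szz * sxy - sxz * szy) / det
    alpha2_hat = (sxx * szy - sxz * sxy) / det
    return alpha1_hat, alpha2_hat


def population_limit(beta1, gamma1, gamma2, var_u=1.0, var_v=1.0):
    """Large-sample limits of the OLS coefficients under the assumed model."""
    alpha2 = gamma2 * var_u / (gamma2 ** 2 * var_u + var_v)
    alpha1 = beta1 - alpha2 * (gamma1 + gamma2 * beta1)
    return alpha1, alpha2


if __name__ == "__main__":
    a1_hat, a2_hat = simulate_collider_ols()
    a1_lim, a2_lim = population_limit(1.0, 0.5, 0.8)
    print("OLS estimate of alpha_1:", a1_hat)
    print("population limit       :", a1_lim)
    print("causal effect beta_1   :", 1.0)
```

With these illustrative values the analytic limit of $\hat\alpha_1^{OLS}$ is about 0.366, well below $\beta_1=1$, so the estimate should settle near the former rather than the latter. Setting `gamma2=0.0` (so that $Y$ no longer causes $Z$) makes $\alpha_2=0$ in the formula, and the limit returns to $\beta_1$.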

Source: Link, Question Author: robertspierre, Answer Author: Richard Hardy
