Which OLS assumptions are colliders violating?

The following webpage says that:

We should not control for a collider variable!

Which OLS assumptions are colliders violating?

Answer

To keep the notation short, I will assume models without intercepts. Say the structural causal model is
$$
\begin{aligned}
Y&=\beta_1X+u, \\
Z&=\gamma_1X+\gamma_2Y+v, \\
X&=w
\end{aligned}
$$

with $u,v,w$ being mutually independent zero-mean exogenous structural errors so that $Z$ is a collider: $X\rightarrow Z\leftarrow Y$.

Let us specify a linear regression as
$$
Y=\alpha_1X+\alpha_2Z+\varepsilon
$$

and get ready to estimate it with OLS. We would like $\hat\alpha_1^{OLS}\rightarrow\beta_1$ as $n\rightarrow\infty$. This would be the case if the following two conditions held simultaneously:

  1. $\alpha_1=\beta_1$ and
  2. the relevant OLS assumptions were satisfied.

However, this is not the case. Suppose $\alpha_1=\beta_1$. Then from the structural causal model and the specified regression we get
$$
\begin{aligned}
\varepsilon&=-\alpha_2Z+u \\
&=-\alpha_2(\gamma_1X+\gamma_2Y+v)+u.
\end{aligned}
$$
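
As a worked step that uses only the model above, substituting the structural equation $Y=\beta_1X+u$ into this expression makes the dependence on $X$ explicit:

$$
\begin{aligned}
\varepsilon&=-\alpha_2\bigl(\gamma_1X+\gamma_2(\beta_1X+u)+v\bigr)+u \\
&=-\alpha_2(\gamma_1+\gamma_2\beta_1)X+(1-\alpha_2\gamma_2)u-\alpha_2v,
\end{aligned}
$$

so that, because $u$ and $v$ are zero-mean and independent of $X=w$,

$$
\mathbb{E}(\varepsilon|X)=-\alpha_2(\gamma_1+\gamma_2\beta_1)X,
$$

which is nonzero unless $\alpha_2=0$ or $\gamma_1+\gamma_2\beta_1=0$.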

Thus, whenever $\alpha_2\neq 0$, $\varepsilon$ is in general a linear function of $X$, which violates the assumption $\mathbb{E}(\varepsilon|X)=0$. (If instead $\alpha_2=0$, then $\varepsilon=u$, which is correlated with the regressor $Z$ through $\gamma_2Y$ whenever $\gamma_2\neq 0$, so the error is still not mean-independent of the regressors.) The zero conditional mean assumption is what Wooldridge calls Assumption MLR.4 (Zero Conditional Mean) in “Introductory Econometrics: A Modern Approach”. Note that it is specific to the desired causal interpretation of the regression parameters; noncausal interpretations (such as regression as a model of the conditional expectation function of $Y$ given $X$ and $Z$) do not require it. Since it is violated, the two conditions above cannot hold simultaneously. Therefore, $\beta_1$ cannot be the target to which the OLS estimator of $\alpha_1$ converges.
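
The argument can be checked numerically with a small simulation. The following is a minimal sketch, not part of the original answer: the parameter values ($\beta_1=\gamma_1=\gamma_2=1$), the standard normal errors, the sample size, and the helper name `ols` are all illustrative assumptions.

```python
import numpy as np

# Illustrative assumptions: unit-variance normal errors and all
# structural coefficients equal to 1. None of these values come
# from the original answer.
rng = np.random.default_rng(0)
n = 200_000
beta1, gamma1, gamma2 = 1.0, 1.0, 1.0

# Mutually independent zero-mean structural errors.
w = rng.standard_normal(n)
u = rng.standard_normal(n)
v = rng.standard_normal(n)

# Structural causal model with Z as a collider: X -> Z <- Y.
X = w
Y = beta1 * X + u
Z = gamma1 * X + gamma2 * Y + v


def ols(y, *regressors):
    """OLS without intercept: least-squares coefficients of y on the regressors."""
    A = np.column_stack(regressors)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


without_collider = ols(Y, X)      # regress Y on X only
with_collider = ols(Y, X, Z)      # regress Y on X and the collider Z

print("Y ~ X:      alpha_1 =", without_collider[0])
print("Y ~ X + Z:  alpha_1 =", with_collider[0], " alpha_2 =", with_collider[1])
```

Under these assumed values, the regression of $Y$ on $X$ alone should give an estimate close to $\beta_1=1$, whereas adding the collider $Z$ pulls the coefficient on $X$ toward $0$ and gives a coefficient on $Z$ near $0.5$. The OLS estimator of $\alpha_1$ still converges, just not to $\beta_1$.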

Attribution
Source : Link , Question Author : robertspierre , Answer Author : Richard Hardy
