# Which OLS assumptions are colliders violating?

The following webpage says that:

We should not control for a collider variable!

Which OLS assumptions are colliders violating?

To keep the notation short, I will assume models without intercepts. Say the structural causal model is
$$\begin{aligned} Y&=\beta_1X+u, \\ Z&=\gamma_1X+\gamma_2Y+v, \\ X&=w \end{aligned}$$
where $$u,v,w$$ are mutually independent, zero-mean exogenous structural errors. Since both $$X$$ and $$Y$$ cause $$Z$$, $$Z$$ is a collider: $$X\rightarrow Z\leftarrow Y$$.

Let us specify a linear regression as
$$Y=\alpha_1X+\alpha_2Z+\varepsilon$$
and get ready to estimate it with OLS. We would wish for $$\hat\alpha_1^{OLS}\rightarrow\beta_1$$ as $$n\rightarrow\infty$$. This would be the case if the following two conditions held simultaneously:

1. $$\alpha_1=\beta_1$$ and
2. the relevant OLS assumptions were satisfied.

However, the two conditions cannot hold simultaneously. To see this, suppose $$\alpha_1=\beta_1$$. Then, from the structural causal model and the specified regression, we get
$$\begin{aligned} \varepsilon&=-\alpha_2Z+u \\ &=-\alpha_2(\gamma_1X+\gamma_2Y+v)+u. \end{aligned}$$
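Substituting the structural equation for $$Y$$ makes the dependence on $$X$$ explicit. As a worked expansion that uses only the equations above and the independence of $$u,v,w$$:

$$\begin{aligned} \varepsilon&=-\alpha_2\bigl(\gamma_1X+\gamma_2(\beta_1X+u)+v\bigr)+u \\ &=-\alpha_2(\gamma_1+\gamma_2\beta_1)X+(1-\alpha_2\gamma_2)u-\alpha_2v, \\ \mathbb{E}(\varepsilon|X)&=-\alpha_2(\gamma_1+\gamma_2\beta_1)X. \end{aligned}$$

The last line is nonzero whenever $$\alpha_2\neq 0$$ and $$\gamma_1+\gamma_2\beta_1\neq 0$$.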
Thus $$\varepsilon$$ is a linear function of $$X$$ (both directly and through $$Y$$), so in general $$\mathbb{E}(\varepsilon|X)\neq 0$$. This violates the assumption $$\mathbb{E}(\varepsilon|X)=0$$, which Wooldridge calls Assumption MLR.4 (Zero Conditional Mean) in “Introductory Econometrics: A Modern Approach”. Note that this assumption is specific to the desired causal interpretation of the regression parameters; noncausal interpretations (such as regression as a model of the conditional expectation function of $$Y|X,Z$$) do not require it. Since it is violated, the two conditions above cannot hold simultaneously. Therefore, $$\beta_1$$ cannot be the target to which the OLS estimator of $$\alpha_1$$ converges.
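The argument can also be checked numerically. The following is a minimal simulation sketch, not part of the original derivation: the function name `simulate_collider`, the parameter values $$\beta_1=\gamma_1=\gamma_2=1$$, the standard normal errors, and the sample size are all illustrative assumptions. It draws data from the structural causal model and fits the no-intercept regression with and without the collider $$Z$$.

```python
import numpy as np


def simulate_collider(n=200_000, beta1=1.0, gamma1=1.0, gamma2=1.0, seed=0):
    """Simulate the SCM and fit no-intercept OLS with and without Z.

    All parameter values and the standard normal errors are illustrative
    assumptions, chosen only to make the bias easy to see.
    """
    rng = np.random.default_rng(seed)

    # Mutually independent, zero-mean structural errors.
    w = rng.standard_normal(n)
    u = rng.standard_normal(n)
    v = rng.standard_normal(n)

    # Structural equations: X -> Y, and X -> Z <- Y (Z is the collider).
    X = w
    Y = beta1 * X + u
    Z = gamma1 * X + gamma2 * Y + v

    # Regression of Y on X only (no intercept).
    coef_short, *_ = np.linalg.lstsq(X[:, None], Y, rcond=None)

    # Regression of Y on X and the collider Z (no intercept).
    coef_long, *_ = np.linalg.lstsq(np.column_stack([X, Z]), Y, rcond=None)

    return coef_short[0], coef_long[0], coef_long[1]


alpha1_without_z, alpha1_with_z, alpha2 = simulate_collider()
print(f"alpha_1 without Z: {alpha1_without_z:.3f}")  # close to beta_1
print(f"alpha_1 with Z:    {alpha1_with_z:.3f}")     # far from beta_1
print(f"alpha_2:           {alpha2:.3f}")
```

Under these assumed values, the coefficient on $$X$$ should be close to $$\beta_1=1$$ when $$Z$$ is omitted. When $$Z$$ is included, it should instead be close to $$\beta_1-\alpha_2(\gamma_1+\gamma_2\beta_1)=0$$, where $$\alpha_2=\gamma_2\sigma_u^2/(\gamma_2^2\sigma_u^2+\sigma_v^2)=0.5$$ is the population coefficient on $$Z$$.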