The following webpage says that:

We should not control for a collider variable!

Which OLS assumptions are colliders violating?

**Answer**

To keep the notation short, I will assume models without intercepts. Say the structural causal model is

$$
\begin{aligned}
Y&=\beta_1X+u, \\
Z&=\gamma_1X+\gamma_2Y+v, \\
X&=w
\end{aligned}
$$

with $u,v,w$ being mutually independent zero-mean exogenous structural errors so that $Z$ is a collider: $X\rightarrow Z\leftarrow Y$.

Let us specify a linear regression as

$$

Y=\alpha_1X+\alpha_2Z+\varepsilon

$$

and estimate it with OLS. We would like $\hat\alpha_1^{OLS}\rightarrow\beta_1$ as $n\rightarrow\infty$. This would be the case if the following two conditions held simultaneously:

- $\alpha_1=\beta_1$ and
- the relevant OLS assumptions were satisfied.

However, this is not the case. Suppose $\alpha_1=\beta_1$. Then from the structural causal model and the specified regression we get

$$
\begin{aligned}
\varepsilon&=-\alpha_2Z+u \\
&=-\alpha_2(\gamma_1X+\gamma_2Y+v)+u.
\end{aligned}
$$
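
Substituting the structural equation $Y=\beta_1X+u$ makes the dependence on $X$ explicit. This step is not spelled out in the original answer; it uses only the model above, together with $u$ and $v$ being zero-mean and independent of $X=w$:

$$
\varepsilon=-\alpha_2(\gamma_1+\gamma_2\beta_1)X+(1-\alpha_2\gamma_2)u-\alpha_2v,
\qquad\text{so}\qquad
\mathbb{E}(\varepsilon|X)=-\alpha_2(\gamma_1+\gamma_2\beta_1)X,
$$

which is nonzero whenever $\alpha_2\neq0$ and $\gamma_1+\gamma_2\beta_1\neq0$.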

Thus $\varepsilon$ is a linear function of $X$. This violates the assumption $\mathbb{E}(\varepsilon|X)=0$, which Wooldridge calls Assumption MLR.4 (Zero Conditional Mean) in "Introductory Econometrics: A Modern Approach". Note that this assumption is specific to the desired causal interpretation of the regression parameters; noncausal interpretations (such as regression as a model of the conditional expectation function of $Y|X,Z$) do not require it. Since it is violated, the two conditions above cannot hold simultaneously. Therefore, $\beta_1$ cannot be the target to which the OLS estimator of $\alpha_1$ converges.
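
The argument can be checked numerically with a small simulation. The following is a minimal sketch, not part of the original answer: the parameter values $\beta_1=\gamma_1=\gamma_2=1$, the standard normal errors, the sample size, and the function name `simulate` are all assumptions chosen for illustration. Under these assumed values, working out the population regression gives a coefficient of $0.5$ on $Z$ and $0$ on $X$, so controlling for the collider should push the estimate of $\alpha_1$ away from $\beta_1=1$.

```python
import numpy as np


def simulate(n=200_000, beta1=1.0, gamma1=1.0, gamma2=1.0, seed=0):
    """Draw from the structural model and fit two no-intercept OLS regressions.

    All defaults are illustrative assumptions, not values from the original post.
    """
    rng = np.random.default_rng(seed)
    # Mutually independent, zero-mean structural errors (assumed standard normal).
    w, u, v = rng.standard_normal((3, n))

    X = w
    Y = beta1 * X + u
    Z = gamma1 * X + gamma2 * Y + v  # collider: X -> Z <- Y

    # Regression of Y on X alone (no intercept).
    short, *_ = np.linalg.lstsq(X[:, None], Y, rcond=None)
    # Regression of Y on X and the collider Z (no intercept).
    long, *_ = np.linalg.lstsq(np.column_stack([X, Z]), Y, rcond=None)

    return short[0], long[0], long[1]


b_short, a1_hat, a2_hat = simulate()
print(f"Y ~ X      : coefficient on X = {b_short:.3f}")
print(f"Y ~ X + Z  : coefficient on X = {a1_hat:.3f}, on Z = {a2_hat:.3f}")
```

In this sketch the regression on $X$ alone should recover a value close to $\beta_1$, while the regression that also includes $Z$ should not, which is the inconsistency described above.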

**Attribution** *Source: Link, Question Author: robertspierre, Answer Author: Richard Hardy*