We have a simple linear regression model. Our assumptions are:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $i = 1, \ldots, n$

$\varepsilon_i \sim N(0, \sigma^2)$

$Var(\varepsilon_i|X_i=x)=\sigma^2$

$\varepsilon_1, \ldots, \varepsilon_n$ are mutually independent.


Are these hypotheses enough to claim that $\varepsilon_i|X_i=x \sim N(0, \sigma^2)$?

**Answer**

No. Here’s an interesting counterexample.

Define a density function

$$g(x) = \frac{2}{\sqrt{2\pi}}\exp(-x^2/2)I(-t \le x \le 0 \text{ or } t \le x)$$

for $t = \sqrt{2\log(2)} \approx 1.17741$. ($I$ is the indicator function.)

The plot of $g$ is shown here in blue. If we define $h(x) = g(-x)$, its plot appears in red.

Direct calculation shows that any variable $Y$ with density $g$ has zero mean and unit variance. By construction, an equal mixture of $Y$ with $-Y$ (whose PDF is $h$) has a density function proportional to $\exp(-x^2/2)$: that is, it is standard Normal (with zero mean and unit variance).
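A sketch of that direct calculation, writing $\phi(x) = \exp(-x^2/2)/\sqrt{2\pi}$ for the standard Normal density (a symbol introduced here for brevity, not used in the original answer), so that $g = 2\phi$ on $[-t, 0] \cup [t, \infty)$ and $g = 0$ elsewhere. Because $\phi'(x) = -x\phi(x)$,

$$E[Y] = 2\int_{-t}^{0} x\phi(x)\,dx + 2\int_{t}^{\infty} x\phi(x)\,dx = 2\left(\phi(t) - \phi(0)\right) + 2\phi(t) = 4\phi(t) - 2\phi(0),$$

which vanishes exactly when $\exp(-t^2/2) = 1/2$, that is, when $t = \sqrt{2\log(2)}$. Because $x^2\phi(x)$ is an even function, the integral over $[-t, 0]$ equals the integral over $[0, t]$, so

$$E[Y^2] = 2\int_{0}^{t} x^2\phi(x)\,dx + 2\int_{t}^{\infty} x^2\phi(x)\,dx = 2\int_{0}^{\infty} x^2\phi(x)\,dx = 1.$$

The same symmetry argument with $1$ in place of $x^2$ shows that $g$ integrates to $1$. Finally, the supports of $g$ and $h$ overlap only at the points $0$ and $\pm t$ and together cover the real line, so $\frac{1}{2}\left(g(x) + h(x)\right) = \phi(x)$.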

Let $X_i$ have a Bernoulli$(1/2)$ distribution. Suppose $\varepsilon_i|X_i=0$ has density $g$ and $\varepsilon_i|X_i=1$ has density $h$, with all the $(X_i, \varepsilon_i)$ independent. The assumption about $Y_i$ is irrelevant (or true by definition of $Y_i$) and all the other assumptions hold by construction with $\sigma^2 = 1$, yet *none* of the conditional distributions $\varepsilon_i | X_i = x$ is Normal for *any* value of $x$.

*These plots show a dataset of $300$ samples from a bivariate distribution where $E[Y|X]=5 + X.$ The $x$ values in the scatterplot at the left have been horizontally jittered (displaced randomly) to resolve overlaps. The dotted red line is the least squares fit to these data. The three histograms show the conditional residuals, which are expected to follow $g$ and $h$ closely, and then the combined residuals, which are expected to be approximately Normal.*
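The construction can be checked with a minimal simulation sketch in Python, using only the standard library. This is not the code behind the plots: the names `sample_g`, `simulate`, and `mean_and_variance`, the seed, and the sample size of 100,000 (rather than $300$, so that the moment estimates are stable) are assumptions made for illustration. It draws from $g$ by reflecting a standard Normal draw into the support of $g$, which works because that support and its mirror image together cover the real line. It looks at the true errors $\varepsilon_i$ rather than least squares residuals.

```python
import math
import random

T = math.sqrt(2 * math.log(2))  # the threshold t, about 1.17741


def in_support_g(x):
    """Return True when x lies in the support of g: [-T, 0] or [T, inf)."""
    return (-T <= x <= 0.0) or (x >= T)


def sample_g(rng):
    """Draw one value with density g.

    A standard Normal draw z is kept if it falls in the support of g and
    is replaced by -z otherwise, which doubles the Normal density there.
    """
    z = rng.gauss(0.0, 1.0)
    return z if in_support_g(z) else -z


def simulate(n, rng, beta0=5.0, beta1=1.0):
    """Return n triples (x, y, eps) with X ~ Bernoulli(1/2), E[Y|X] = 5 + X."""
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        eps = sample_g(rng)  # density g when x == 0
        if x == 1:
            eps = -eps       # density h(x) = g(-x) when x == 1
        rows.append((x, beta0 + beta1 * x + eps, eps))
    return rows


def mean_and_variance(values):
    """Return the sample mean and the (population-style) sample variance."""
    m = sum(values) / len(values)
    v = sum((u - m) ** 2 for u in values) / len(values)
    return m, v


rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
rows = simulate(100_000, rng)
eps0 = [e for x, _, e in rows if x == 0]
eps1 = [e for x, _, e in rows if x == 1]
eps_all = eps0 + eps1

for label, eps in (("X=0", eps0), ("X=1", eps1), ("all", eps_all)):
    m, v = mean_and_variance(eps)
    print(f"{label}: mean={m:+.3f} variance={v:.3f}")

# A Normal law would put roughly 38% of its mass in (0, t); given X=0, g puts none.
share_gap = sum(0.0 < e < T for e in eps0) / len(eps0)
print(f"share of eps | X=0 in (0, t): {share_gap:.3f}")
```

All three groups should show a mean near $0$ and a variance near $1$, while the empty interval $(0, t)$ in the $X=0$ group makes the non-Normality of the conditional errors visible without a plot.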

**Attribution**

*Source: Link, Question Author: DGRasines, Answer Author: whuber*