In linear regression, we make the following assumptions:

- The mean of the response, E(Y_i), at each set of values of the predictors, (x_{1i}, x_{2i}, …), is a Linear function of the predictors.
- The errors, ε_i, are Independent.
- The errors, ε_i, at each set of values of the predictors, (x_{1i}, x_{2i}, …), are Normally distributed.
- The errors, ε_i, at each set of values of the predictors, (x_{1i}, x_{2i}, …), have Equal variances (denoted σ^2).

One of the ways we can solve linear regression is through the normal equations, which we can write as

\theta = (X^TX)^{-1}X^TY

From a mathematical standpoint, the above equation only needs X^TX to be invertible. So, why do we need these assumptions? I asked a few colleagues and they mentioned that it is to get good results and normal equations are an algorithm to achieve that. But in that case, how do these assumptions help? How does upholding them help in getting a better model?
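To make the point concrete, here is a minimal sketch, assuming NumPy and made-up data (the sample size, coefficients, and error distribution are illustrative choices, not part of the question). The errors are deliberately skewed and have unequal variances, yet the normal equations still return an estimate, because all they need is an invertible X^TX.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 observations, an intercept column plus one predictor.
n = 50
x = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])

# Errors that deliberately break the assumptions: skewed (centered exponential)
# with a spread that grows with x (unequal variances).
errors = (rng.exponential(scale=1.0, size=n) - 1.0) * (0.2 + 0.3 * x)
Y = 2.0 + 0.5 * x + errors

# Normal equations: solve (X^T X) theta = X^T Y.
# np.linalg.solve avoids forming the explicit inverse.
theta = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's general least-squares routine.
theta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print("normal equations:", theta)
print("lstsq           :", theta_lstsq)
```

Both calls compute the same least-squares fit; nothing in the computation checks normality, independence, or equal variances.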

**Answer**

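You are correct: you do not need to satisfy these assumptions to fit a least squares line to the points. You need these assumptions to interpret the results, that is, to make inferential statements about the fit. For example, assuming there was no relationship between an input X_1 and Y, what is the probability of getting a coefficient \beta_1 at least as large as the one we saw from the regression? Answering that requires knowing how the estimated coefficient is distributed, and that distribution is what the assumptions provide.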
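The question in that example can be sketched in code. The snippet below is an illustration under stated assumptions, not the answer author's method: it uses NumPy, made-up data, and hypothetical helper names (`slope_t_stat`, `simulated_p_value`). It simulates the null distribution of the slope's t-statistic by drawing independent, normal, equal-variance errors with no dependence on x, then reports how often the simulated statistic is at least as extreme (two-sided) as the observed one. The simulation is only valid to the extent those assumptions hold for the real data.

```python
import numpy as np


def slope_t_stat(X, y):
    """t-statistic of the slope (column 1 of X) from an ordinary least-squares fit."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2_hat = resid @ resid / (n - p)        # estimate of the error variance
    se_slope = np.sqrt(sigma2_hat * XtX_inv[1, 1])
    return beta[1] / se_slope


def simulated_p_value(x, y, n_sim=5000, seed=0):
    """Two-sided Monte Carlo p-value for the slope under 'no relationship'."""
    rng = np.random.default_rng(seed)
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    t_obs = slope_t_stat(X, y)
    # Under the null with i.i.d. normal errors, the t-statistic does not depend
    # on the intercept or the error variance, so standard normal noise suffices.
    t_null = np.array(
        [slope_t_stat(X, rng.standard_normal(n)) for _ in range(n_sim)]
    )
    return float(np.mean(np.abs(t_null) >= abs(t_obs)))


# Hypothetical example: a response that clearly depends on x.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
y_strong = 1.0 + 2.0 * x + rng.normal(size=x.size)
p_strong = simulated_p_value(x, y_strong)
print("p-value with a real relationship:", p_strong)
```

If the errors were, say, strongly dependent or heteroscedastic, the simulated null distribution would no longer match the real one, and the resulting probability would be misleading even though the fitted coefficient itself is unchanged.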

**Attribution**

*Source: Link, Question Author: Clock Slave, Answer Author: rinspy*