As far as I understand, $R^2$ describes how well the model predicts the observations. Adjusted $R^2$ is the one that also takes the number of observations (or degrees of freedom) into account. So adjusted $R^2$ should describe the model better? Then why is it less than $R^2$? It seems it should often be larger.

**Answer**

$R^2$ measures how much of the variation in the dependent variable is captured by its linear relationship with the independent variables. It is defined as $1-\frac{SSE}{SSTO}$, that is, one minus the ratio of the sum of squared errors to the total sum of squares. For a least-squares fit with an intercept, $SSTO = SSE + SSR$, where $SSE$ is the error sum of squares and $SSR$ is the regression sum of squares. As independent variables are added, $SSR$ can never fall, and since $SSTO$ is fixed, $SSE$ can never rise. So $R^2$ never decreases (and in practice almost always creeps up), irrespective of how valuable the variables you added are.
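
The effect is easy to see numerically. The following is a minimal sketch, assuming NumPy and simulated data (the helper `r_squared`, the sample size, and the coefficients are made up for illustration): one real predictor is fit first, then pure-noise columns are appended one at a time.

```python
import numpy as np

def r_squared(X, y):
    """R^2 = 1 - SSE/SSTO for an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    sse = np.sum((y - A @ beta) ** 2)              # sum of squared errors
    ssto = np.sum((y - y.mean()) ** 2)             # total sum of squares
    return 1.0 - sse / ssto

# Simulated data (an assumption for illustration): y depends on x only.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=(n, 1))
y = 2.0 * x[:, 0] + rng.normal(size=n)
noise = rng.normal(size=(n, 10))                   # predictors unrelated to y

# Fit with x plus the first k noise columns, for k = 0..10.
r2_values = [r_squared(np.hstack([x, noise[:, :k]]), y) for k in range(11)]
for k, r2 in enumerate(r2_values):
    print(f"{k:2d} noise predictors: R^2 = {r2:.4f}")
```

Each printed value should be at least as large as the one before it, even though none of the added columns has anything to do with `y`.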

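Adjusted $R^2$ attempts to account for statistical shrinkage. Models with tons of predictors tend to perform better in sample than when tested out of sample. The adjusted $R^2$ “penalizes” you for adding extra predictor variables that don’t improve the existing model, which can be helpful in model selection. Under the usual definition, $R^2_{adj} = 1 - (1 - R^2)\frac{n-1}{n-p-1}$ with $n$ observations and $p$ predictors, the factor $\frac{n-1}{n-p-1}$ exceeds 1 as soon as $p \ge 1$. So adjusted $R^2$ is smaller than $R^2$ even with a single predictor (unless the fit is perfect), and each added variable that does not reduce $SSE$ enough pushes it lower still.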
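
To compare the two quantities side by side, here is a small sketch assuming NumPy, simulated data, and the common definition of adjusted $R^2$ as $1 - (1 - R^2)\frac{n-1}{n-p-1}$ for $n$ observations and $p$ predictors. The function name `r2_and_adjusted` and the data are hypothetical.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with an intercept.

    Adjusted R^2 uses the common definition
    1 - (1 - R^2) * (n - 1) / (n - p - 1), with p the number of predictors.
    """
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])           # intercept plus predictors
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = np.sum((y - A @ beta) ** 2)
    ssto = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - sse / ssto
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

# Simulated data (an assumption for illustration): one useful predictor,
# followed by columns of pure noise.
rng = np.random.default_rng(1)
n = 40
x = rng.normal(size=(n, 1))
y = 1.5 * x[:, 0] + rng.normal(size=n)
noise = rng.normal(size=(n, 8))

results = []
for k in range(9):
    X = np.hstack([x, noise[:, :k]])               # p = 1 + k predictors
    r2, adj = r2_and_adjusted(X, y)
    results.append((X.shape[1], r2, adj))
    print(f"p = {X.shape[1]}: R^2 = {r2:.4f}, adjusted R^2 = {adj:.4f}")
```

Adjusted $R^2$ sits below $R^2$ in every row, including the single-predictor one. Whether it rises or falls from one row to the next depends on whether the added column reduces $SSE$ by enough to offset the lost degree of freedom.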

**Attribution**

*Source: Link, Question Author: user59756, Answer Author: Sycorax*