# If p > n, the lasso selects at most n variables

One of the motivations for the elastic net was the following limitation of LASSO:

In the $p > n$ case, the lasso selects at most n variables before it
saturates, because of the nature of the convex optimization problem.
This seems to be a limiting feature for a variable selection method.
Moreover, the lasso is not well defined unless the bound on the
L1-norm of the coefficients is smaller than a certain value.

I understand that LASSO is a quadratic programming problem but also can be solved via LARS or element-wise gradient descent. But I do not understand where in these algorithms I encounter a problem if $p > n$ where $p$ is the number of predictors and $n$ is the sample size. And why is this problem solved using elastic net where I augment the problem to $p+n$ variables which clearly exceeds $p$.

As said, this is not a property of an algorithm but of the optimization problem. The KKT conditions basically give that for coefficient $\beta_j$ to be non-zero it has to correspond to a fixed correlation with the residual $|X_j^t(y-X\beta)| = \lambda$ ($\lambda$ is the regularization parameter).
After resolving the various complications with absolute value etc, you are left with a linear equation for each non-zero coefficient. Since the rank of the matrix $X$ is at most $n$ when $p>n$, this is the number of equations that can be solved, and therefore there are at most n non-zeros (unless there are redundancies).
By the way, this is true for any loss function, not only the standard lasso with $L_2$ loss. So it is in fact a property of the lasso penalty. There are many papers that show this KKT view and the resulting conclusions, I can point to our paper: