Is there a clear set of conditions under which lasso, ridge, or elastic net solution paths are monotone?

The question "What to conclude from this lasso plot (glmnet)" shows solution paths for the lasso estimator that are not monotonic. That is, some of the coefficients grow in absolute value before they shrink.

I’ve applied these models to several different kinds of data sets and never seen this behavior “in the wild,” and until today had assumed that they were always monotonic.

Is there a clear set of conditions under which the solution paths are guaranteed to be monotone? Does it affect the interpretation of the results if the paths change direction?

Answer

I can give you a sufficient condition for the path to be monotonic: an orthonormal design matrix X.

Suppose the design matrix is orthonormal, that is, with p variables in X, we have \frac{X'X}{n} = I_p. Then (X'X)^{-1} = \frac{1}{n} I_p, so the OLS regression coefficients are simply \hat{\beta}^{ols} = (X'X)^{-1}X'y = \frac{X'y}{n}.
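To make the orthonormal case concrete, here is a small numeric check in plain Python. The 4x2 design and the response y are made up for illustration (they are not from the question); the point is only that when X'X/n is the identity, solving the normal equations gives back X'y/n.

```python
# Hypothetical 4x2 design whose columns satisfy X'X / n = I_2 (illustrative data only).
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.0, 1.0, -2.0, -4.0]
n, p = len(X), len(X[0])

# Gram matrix X'X / n and the vector X'y / n.
gram = [[sum(X[i][j] * X[i][k] for i in range(n)) / n for k in range(p)]
        for j in range(p)]
xty = [sum(X[i][j] * y[i] for i in range(n)) / n for j in range(p)]

# General OLS for p = 2: solve (X'X / n) beta = X'y / n by Cramer's rule.
(a, b), (c, d) = gram
det = a * d - b * c
beta_ols = [(d * xty[0] - b * xty[1]) / det,
            (a * xty[1] - c * xty[0]) / det]

print("X'X / n  =", gram)
print("X'y / n  =", xty)
print("beta_ols =", beta_ols)
```

With a correlated design the Gram matrix would have non-zero off-diagonal entries, and beta_ols would no longer coincide with X'y/n.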

The Karush-Kuhn-Tucker conditions for the LASSO (taking the objective to be \frac{1}{2n}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1) thus simplify to:

\frac{X'y}{n} = \hat{\beta}^{lasso} + \lambda s \implies \hat{\beta}^{ols} = \hat{\beta}^{lasso} + \lambda s

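where s is a subgradient of the \ell_1 norm at \hat{\beta}^{lasso}: s_j = sign(\hat{\beta}_j^{lasso}) when \hat{\beta}_j^{lasso} \neq 0, and s_j \in [-1, 1] otherwise. Hence, for each j \in \{1, \dots, p\} we have that \hat{\beta}_j^{ols} = \hat{\beta}_j^{lasso} + \lambda s_j, which gives a closed-form solution for the lasso estimates: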


\hat{\beta}_j^{lasso} = \operatorname{sign}\left(\hat{\beta}_j^{ols}\right)\left(|\hat{\beta}_j^{ols}| - \lambda \right)_{+}

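This is monotonic in \lambda: each coefficient shrinks in absolute value as \lambda grows until it reaches zero, and then stays there. While orthonormality is not a necessary condition, we see that any non-monotonicity must come from correlation among the covariates in X.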
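A minimal sketch of the soft-thresholding formula in plain Python, assuming an orthonormal design so that each lasso coefficient depends only on its own OLS coefficient. The OLS values and the \lambda grid are made up for illustration, and the helper names soft_threshold and is_shrinking are my own, not from any library.

```python
def soft_threshold(beta_ols, lam):
    """Closed-form lasso coefficient under an orthonormal design:
    sign(beta_ols) * max(|beta_ols| - lam, 0)."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    return magnitude if beta_ols >= 0 else -magnitude


def is_shrinking(coefs):
    """True if |coefficient| never increases along the given sequence."""
    return all(abs(later) <= abs(earlier)
               for earlier, later in zip(coefs, coefs[1:]))


# Hypothetical OLS coefficients and an increasing grid of penalties.
beta_ols = [2.5, -1.0, 0.4]
lambdas = [0.0, 0.25, 0.5, 1.0, 2.0, 3.0]

# path[j] is the solution path of coefficient j over the lambda grid.
path = [[soft_threshold(b, lam) for lam in lambdas] for b in beta_ols]

for b, coefs in zip(beta_ols, path):
    print(b, coefs, "monotone:", is_shrinking(coefs))
```

Every path here is non-increasing in absolute value, as the formula requires. This only illustrates the orthonormal case; it says nothing about what a solver such as glmnet does with correlated predictors.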

Attribution
Source: Link, Question Author: shadowtalker, Answer Author: Carlos Cinelli
