I have been wondering: why are LASSO and LARS model selection methods so popular even though they are essentially variations of step-wise forward selection (and thus suffer from path dependency)?
Similarly, why are General to Specific (GETS) methods for model selection mostly ignored, even though they do better than LARS/LASSO because they don’t suffer from the step-wise regression problem?
(basic reference for GETS: http://www.federalreserve.gov/pubs/ifdp/2005/838/ifdp838.pdf – the newest algorithm in this paper starts with a broad model and uses a tree search that avoids path dependency, and has been shown to often do better than LASSO/LARS).
It just seems strange that LARS/LASSO get so much more exposure and so many more citations than General to Specific (GETS). Does anyone have any thoughts?
I am not trying to start a heated debate; I am looking for a rational explanation for why the literature focuses on LASSO/LARS rather than GETS, and why so few people actually point out the shortcomings of LASSO/LARS.
Disclaimer: I am only remotely familiar with the work on model selection by David F. Hendry, among others. I know, however, from respected colleagues that Hendry has made very interesting progress on model selection problems within econometrics. Judging whether the statistical literature is paying enough attention to his work on model selection would require a lot more work on my part.
It is, however, interesting to try to understand why one method or idea generates much more activity than another. No doubt there are aspects of fashion in science too. As I see it, lasso (and friends) has the major advantage of being the solution to a very easily expressed optimization problem. This is key to the detailed theoretical understanding of the solution and to the efficient algorithms that have been developed. The recent book, Statistics for High-Dimensional Data by Bühlmann and van de Geer, illustrates how much is already known about lasso.
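To make the "easily expressed optimization problem" point concrete: the entire lasso is one convex criterion, min_b ½‖y − Xb‖² + λ‖b‖₁, and a few lines of cyclic coordinate descent already solve it and produce exact zeros. The sketch below (plain NumPy; the simulated data and the choice of λ are my own, purely for illustration) is not a production solver, just a demonstration of how little machinery the problem needs:

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * |.| : shrink z toward zero by t, clipping at 0.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for min_b 0.5*||y - X b||^2 + lam*||b||_1.
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)  # per-coordinate curvature ||x_j||^2
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every feature's fit except feature j.
            r = y - X @ b + X[:, j] * b[j]
            # Exact one-dimensional minimizer is a soft-thresholded projection.
            b[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
    return b

# Toy example: 10 candidate predictors, only the first 3 truly matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
beta_true = np.zeros(10)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.1 * rng.normal(size=100)

b_hat = lasso_cd(X, y, lam=5.0)
print(np.round(b_hat, 3))
```

Note how the ℓ1 penalty sets the irrelevant coefficients to exactly zero rather than merely shrinking them, which is why estimation and model selection are fused into a single convex problem that theory can attack directly.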
You can run endless simulation studies, and you can of course apply whichever methods you find most relevant and suitable for a particular application, but for parts of the statistical literature, substantial theoretical results must also be obtained. That lasso has generated so much activity reflects the fact that there are theoretical questions which can actually be approached and which have interesting answers.
Another point is that lasso and its variations do perform well in many cases. I am simply not convinced that lasso is as easily outperformed by other methods as the OP suggests. Perhaps it is in terms of (artificial) model selection, but not in terms of predictive performance. None of the references mentioned seems to really compare GETS and lasso either.