Can anybody point me to a survey paper on “Large p, Small n” results? I am interested in how this problem manifests itself in different research contexts, e.g. regression, classification, Hotelling’s test, etc.
I don’t know of a single paper, but I think the book with the best survey of methods applicable to p≫n is still Hastie, Tibshirani, and Friedman’s The Elements of Statistical Learning. It is very partial to shrinkage and the lasso (I know from a common acquaintance that Vapnik was upset at the first edition of the book), but it covers almost all common shrinkage methods and shows their connection to boosting. Speaking of boosting, the survey by Bühlmann & Hothorn also shows the connection to shrinkage.
My impression is that, while classification and regression can be analyzed using the same theoretical framework, testing for high-dimensional data is different: it is not used in conjunction with model selection procedures, but rather focuses on controlling family-wise error rates. I’m not so sure about the best surveys there. Brad Efron has a ton of papers, surveys, and a book on his page. Read them all and let me know the one I should really read…
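
To make the family-wise error rate point a bit more concrete, here is a minimal sketch of two standard corrections, Bonferroni and Holm, applied to a vector of p-values. This is only an illustration, not something taken from any of the references above: the function names `bonferroni` and `holm` and the example p-values are made up, and it assumes NumPy is available.

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha / m; controls the family-wise error rate."""
    pvals = np.asarray(pvals, dtype=float)
    return pvals <= alpha / pvals.size

def holm(pvals, alpha=0.05):
    """Holm's step-down procedure; also controls the family-wise error rate."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)                 # indices of p-values, smallest first
    thresholds = alpha / (m - np.arange(m))   # alpha/m, alpha/(m-1), ..., alpha/1
    passed = pvals[order] <= thresholds
    # Stop at the first sorted p-value that misses its threshold.
    k = m if passed.all() else int(np.argmin(passed))
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

# Hypothetical p-values from five simultaneous tests.
example_pvals = [0.001, 0.012, 0.013, 0.04, 0.3]
print(bonferroni(example_pvals))
print(holm(example_pvals))
```

On these made-up numbers Holm rejects the three smallest p-values while Bonferroni rejects only the smallest, even though both control the same error rate. With thousands of tests, as in typical p≫n settings, that kind of gap in power is one reason the testing literature is organized around the choice of error rate and correction rather than around model selection.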