Cross validation with two parameters: elastic net case

I want to know the cross-validation procedure for finding the two parameters of the elastic net presented by Zou and Hastie, using the prostate dataset as an example.
With k-fold cross-validation I can’t improve on the lasso’s error rate when I use the elastic net.


The method to use in this case is exactly the same, though e.g. the glmnet package doesn’t provide it out of the box.

Instead of working over one discrete set of parameter values (lambda), you now cross-validate over a grid of parameter values (lambda and alpha). You then pick the best pair (lambda.min and alpha.min), and finally the lambda and alpha such that lambda is the biggest possible while its predictive measure is still within 1 SE of that of lambda.min and alpha.min.

If you use R, you can probably do something like:

alphasOfInterest <- seq(0, 1, by = 0.1) # or something similar
# step 1: do all crossvalidations for each alpha
cvs <- lapply(alphasOfInterest, function(curAlpha){
  cv.glmnet(myX, myY, alpha = curAlpha) # ...plus some more parameters
})
# step 2: collect the optimum lambda for each alpha
optimumPerAlpha <- sapply(seq_along(alphasOfInterest), function(curi){
  curcvs <- cvs[[curi]]
  curAlpha <- alphasOfInterest[curi]
  indOfMin <- match(curcvs$lambda.min, curcvs$lambda)
  c(lam = curcvs$lambda.min, alph = curAlpha,
    cvm = curcvs$cvm[indOfMin], cvup = curcvs$cvup[indOfMin])
})
# step 3: find the overall optimum
posOfOptimum <- which.min(optimumPerAlpha["cvm", ])
overall.lambda.min <- optimumPerAlpha["lam", posOfOptimum]
overall.alpha.min <- optimumPerAlpha["alph", posOfOptimum]
overall.criterionthreshold <- optimumPerAlpha["cvup", posOfOptimum]
# step 4: now check for each alpha which lambda is the best within the threshold
corrected1se <- sapply(seq_along(alphasOfInterest), function(curi){
  curcvs <- cvs[[curi]]
  lams <- curcvs$lambda
  lams[curcvs$cvm > overall.criterionthreshold] <- NA
  lam1se <- max(lams, na.rm = TRUE)
  c(lam = lam1se, alph = alphasOfInterest[curi])
})
# step 5: find the best (largest) of these lambdas
overall.lambda.1se <- max(corrected1se["lam", ])
pos <- match(overall.lambda.1se, corrected1se["lam", ])
overall.alpha.1se <- corrected1se["alph", pos]
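The selection logic in steps 2–5 does not depend on glmnet itself; here is the same idea sketched in Python on made-up cross-validation results (the dict keys mirror cv.glmnet's lambda/cvm/cvup components, but everything here is illustrative, not an API):

```python
def pick_two_param_1se(cv_results):
    """cv_results: list of dicts with keys 'alpha', 'lambdas', 'cvm', 'cvup'.
    Returns ((lambda.min, alpha.min), (lambda.1se, alpha.1se))."""
    # steps 2+3: the overall optimum across all (alpha, lambda) pairs
    best = min(
        ((r["alpha"], lam, m, u)
         for r in cv_results
         for lam, m, u in zip(r["lambdas"], r["cvm"], r["cvup"])),
        key=lambda t: t[2])
    alpha_min, lambda_min, _, threshold = best  # threshold = cvup at the optimum
    # step 4: per alpha, the largest lambda whose mean CV error stays under threshold
    candidates = []
    for r in cv_results:
        ok = [lam for lam, m in zip(r["lambdas"], r["cvm"]) if m <= threshold]
        if ok:
            candidates.append((max(ok), r["alpha"]))
    # step 5: the largest such lambda overall
    lambda_1se, alpha_1se = max(candidates)
    return (lambda_min, alpha_min), (lambda_1se, alpha_1se)
```

Ties between alphas with the same lambda are broken arbitrarily here; a real implementation would need a deliberate rule for that.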

All this code is untested and needs attention if you use AUC as your criterion (because then you need to look for the maximum of the criterion, and some other details change), but the ideas are there.

Note: in the last step, you could, instead of going for the highest lambda, find the one that gives the most parsimonious model (because a higher lambda does not guarantee more parsimony across different alphas).
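That parsimony variant can be sketched the same way: among all (lambda, alpha) pairs whose mean CV error stays within the 1-SE threshold, pick the one whose fitted model has the fewest nonzero coefficients (cv.glmnet reports these counts in its nzero component; here they are just plain lists in a hypothetical data layout):

```python
def pick_most_parsimonious(cv_results, threshold):
    """cv_results: list of dicts with keys 'alpha', 'lambdas', 'cvm', 'nzero'.
    Returns (lambda, alpha, nonzero_count) of the sparsest acceptable model."""
    candidates = [
        (nz, lam, r["alpha"])
        for r in cv_results
        for lam, m, nz in zip(r["lambdas"], r["cvm"], r["nzero"])
        if m <= threshold  # only models within the 1-SE threshold qualify
    ]
    nz, lam, alpha = min(candidates)  # fewest nonzero coefficients wins
    return lam, alpha, nz
```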

You may also want to collect all lambdas up front, and pass the collection of all those to every crossvalidation, so that you can ensure that each crossvalidation uses the same set of lambdas. This is easy to do but requires some extra steps. I’m not certain whether it is necessary…
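Building that shared grid up front might look like the following sketch: a geometric (log-spaced) sequence from some largest lambda of interest down to a small fraction of it, which you would then pass to every cv.glmnet call via its lambda= argument. The constants here are illustrative, not glmnet's exact defaults:

```python
import math

def shared_lambda_grid(lambda_max, ratio=1e-3, n_lambda=100):
    """Geometric sequence of n_lambda values from lambda_max
    down to lambda_max * ratio (decreasing, log-spaced)."""
    step = math.log(ratio) / (n_lambda - 1)
    return [lambda_max * math.exp(step * i) for i in range(n_lambda)]
```

Using one such sequence everywhere guarantees every per-alpha cross-validation is evaluated on the same lambdas, so the step-4 comparison across alphas is apples to apples.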

Source: Link, Question Author: grant, Answer Author: Nick Sabbe
