How to split a data set to do 10-fold cross validation

I have an R data frame (training). Can anyone tell me how to randomly split this data set to do 10-fold cross-validation?


caret has a function for this:

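library(caret)

# y is the outcome vector (e.g. a column of your data frame); the folds are balanced on it
flds <- createFolds(y, k = 10, list = TRUE, returnTrain = FALSE)
# optional: rename the first fold so it can be accessed as flds$train
names(flds)[1] <- "train"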

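Each element of flds is then an integer vector of row indices for one fold. If your data set is called dat, then dat[flds$train, ] gets you the rows of the first fold (the one renamed "train" above), dat[flds[[2]], ] gets you the second fold, and so on. Note that with returnTrain = FALSE each fold holds only about a tenth of the rows, i.e. the held-out part. To fit a model in iteration i of the cross-validation, use the remaining rows: dat[-flds[[i]], ].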
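
To show how the folds might be used end to end, here is a minimal sketch of a 10-fold loop. It is only an illustration under some assumptions: the built-in iris data stands in for your data frame, and the linear model and the RMSE metric are arbitrary choices, not part of the original answer.

```r
library(caret)

set.seed(42)                      # make the random split reproducible
dat <- iris                       # stand-in data set, for illustration only
flds <- createFolds(dat$Sepal.Length, k = 10, list = TRUE, returnTrain = FALSE)

# One pass per fold: hold out fold i, fit on the remaining rows
rmse <- sapply(flds, function(test_idx) {
  train_set <- dat[-test_idx, ]   # all rows except the held-out fold
  test_set  <- dat[test_idx, ]    # the held-out fold
  fit  <- lm(Sepal.Length ~ Petal.Length, data = train_set)
  pred <- predict(fit, newdata = test_set)
  sqrt(mean((test_set$Sepal.Length - pred)^2))
})

mean(rmse)                        # cross-validated RMSE estimate
```

Every row lands in exactly one held-out fold, so each observation is used for testing once and for training nine times. If you would rather get the training indices directly, createFolds also accepts returnTrain = TRUE.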

Source: Link, Question Author: user22062, Answer Author: Ari B. Friedman
