Now I have a
R
data frame (training), can anyone tell me how to randomly split this data set to do 10-fold cross validation?
Answer
caret
has a function for this:
require(caret)
flds <- createFolds(y, k = 10, list = TRUE, returnTrain = FALSE)
names(flds)[1] <- "train"
Then each element of flds
is a list of indexes for each dataset. If your dataset is called dat
, then dat[flds$train,]
gets you the training set, dat[ flds[[2]], ]
gets you the second fold set, etc.
Attribution
Source : Link , Question Author : user22062 , Answer Author : Ari B. Friedman