
Cross Validation


Hi David and Carpentry Instructors,

I was going over the material again and a question came up. When we use k-fold cross-validation, is there ever a point where too many folds becomes a bad thing? Beyond the computational cost, can it ever lead to decreased accuracy or precision? Is there a balance to strike between having enough observations per fold and having many folds with fewer observations in each?
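To make the balance I mean concrete, here is a tiny base R sketch (the row count is made up purely for illustration): as the number of folds k grows, each held-out fold shrinks.

```r
n <- 1000  # hypothetical number of rows in a training set
for (k in c(5, 10, 50, 100)) {
  cat("k =", k, "-> roughly", round(n / k), "observations per held-out fold\n")
}
```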

A quick follow-up to confirm my understanding: when we use ten folds (nfold = 10), does it create 9 random training sets and 1 random test set from the original training set? Then, after all the model tuning with the randomly folded data, we would use the original test data frame, correct?
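In case it helps to see how I'm picturing the mechanics, here is a rough base R sketch of what I imagine nfold = 10 doing under the hood (this is my own guess at the splitting, not the workshop's actual code, and the size n is made up):

```r
set.seed(42)
n     <- 100                                 # stand-in size for the original training set
folds <- sample(rep(1:10, length.out = n))   # randomly assign each row to one of 10 folds

for (k in 1:10) {
  train_rows <- which(folds != k)   # the other 9 folds are used for fitting
  valid_rows <- which(folds == k)   # this one fold is held out for validation
  # fit and score the model here; each row is held out exactly once across the 10 rounds
}
# The original test data frame is never touched during these rounds (if I have it right).
```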

Thanks for the solid course!
-Jake