Add random train-test splitting
abbradar commented
It can be implemented via the `sample` family of functions from `StatsBase`. An example implementation with an `sklearn`-like interface is here. If it's okay, I can make a PR; what holds me back is that I'm a newcomer and may have simply missed an existing, obvious way to do it.
EDIT: supporting several arrays simultaneously would also be a nice addition; I'll work on this if it's considered useful.
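To make the proposal concrete, here is a minimal sketch of a random split over several arrays at once. It is not the linked implementation: the name `train_test_split`, the `test_size` keyword, and the choice to split along the first dimension are all assumptions made for illustration. It uses `randperm` from the `Random` standard library instead of `StatsBase`'s `sample`, so that it runs without extra packages.

```julia
using Random

# Hypothetical helper (not part of any package): split every array along its
# first dimension using one shared random permutation, so rows stay aligned.
function train_test_split(rng::AbstractRNG, arrays::AbstractArray...; test_size::Real = 0.25)
    isempty(arrays) && throw(ArgumentError("at least one array is required"))
    0 < test_size < 1 || throw(ArgumentError("test_size must be in (0, 1)"))
    n = size(first(arrays), 1)
    all(a -> size(a, 1) == n, arrays) ||
        throw(DimensionMismatch("all arrays must have the same first dimension"))

    n_test = ceil(Int, test_size * n)
    perm = randperm(rng, n)            # random ordering of 1:n
    test_idx = perm[1:n_test]
    train_idx = perm[n_test+1:end]

    # One (train, test) tuple per input array, in input order.
    return map(arrays) do a
        (collect(selectdim(a, 1, train_idx)), collect(selectdim(a, 1, test_idx)))
    end
end

# Convenience method using the default global RNG.
train_test_split(arrays::AbstractArray...; kwargs...) =
    train_test_split(Random.default_rng(), arrays...; kwargs...)

# Usage: 8 observations, 2 features, plus a matching label vector.
X = reshape(1:16, 8, 2)
y = collect(1:8)
(X_train, X_test), (y_train, y_test) =
    train_test_split(MersenneTwister(42), X, y; test_size = 0.25)
```

Because all arrays are indexed with the same permutation, each row of `X_train` still corresponds to the same position in `y_train`. Passing an explicit RNG keeps the split reproducible.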
bobbywlindsey commented
I think the `sample` function in `StatsBase` doesn't let the user specify which dimension to sample along, so in practice it's only useful for 1-dimensional arrays.
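One possible workaround for the limitation described above is to sample indices rather than elements, then use those indices to select slices along the chosen dimension. The sketch below assumes a hypothetical helper named `sample_along` with a `dims` keyword; neither exists in `StatsBase`. It draws indices with `randperm` from the standard library; with `StatsBase` loaded, `sample(rng, 1:n, k; replace = false)` could be used for the same step.

```julia
using Random

# Hypothetical helper: draw k distinct slices of A along dimension `dims`
# by sampling indices instead of elements.
function sample_along(rng::AbstractRNG, A::AbstractArray, k::Integer; dims::Integer = 1)
    n = size(A, dims)
    0 <= k <= n || throw(ArgumentError("k must be between 0 and size(A, dims)"))
    idx = randperm(rng, n)[1:k]        # k distinct indices, in random order
    # selectdim returns a view; collect copies it into a plain Array.
    return collect(selectdim(A, dims, idx)), idx
end

# Usage: pick 2 of the 4 columns of a 3x4 matrix.
A = reshape(1:12, 3, 4)
cols, idx = sample_along(MersenneTwister(1), A, 2; dims = 2)
```

Returning the sampled indices alongside the data makes it possible to apply the same selection to other arrays.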