JuliaStats/MLBase.jl

Add random train-test splitting

Opened this issue · 1 comment

This can be implemented with the sample family of functions from StatsBase. An example implementation with an sklearn-like interface is here. If that's okay, I can make a PR; what holds me back is that I'm a newcomer and may simply have missed an existing, obvious way to do it.

EDIT: Another nice addition would be support for splitting several arrays simultaneously; I'll work on this if it's considered useful.
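
To make the proposal concrete, here is a minimal sketch of what such a split over several arrays could look like. The name train_test_split and the test_size and rng keywords are hypothetical, borrowed from the sklearn convention, and are not part of MLBase or StatsBase. The sketch draws one shared permutation with randperm from the Random standard library so it runs without extra packages; sampling indices without replacement via StatsBase's sample should serve the same purpose.

```julia
using Random

# Hypothetical helper, not an existing MLBase/StatsBase function.
# Splits several equally long vectors with ONE shared random permutation,
# so corresponding elements stay aligned across the arrays.
function train_test_split(arrays::AbstractVector...; test_size::Real = 0.25,
                          rng::AbstractRNG = Random.default_rng())
    n = length(first(arrays))
    all(a -> length(a) == n, arrays) ||
        throw(ArgumentError("all arrays must have the same length"))
    n_test = round(Int, test_size * n)   # number of held-out observations
    perm = randperm(rng, n)              # random ordering of 1:n
    test_idx = perm[1:n_test]
    train_idx = perm[n_test+1:end]
    # One (train, test) pair per input array, in input order.
    return map(a -> (a[train_idx], a[test_idx]), arrays)
end

X = collect(1:10)
y = collect(11:20)
(X_train, X_test), (y_train, y_test) =
    train_test_split(X, y; test_size = 0.3, rng = MersenneTwister(42))
```

Because both arrays are indexed with the same permutation, X_train[i] and y_train[i] always come from the same original position.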

I think the sample function in StatsBase doesn't let the user specify the dimension along which to take the sample. So in practice, it's only useful for 1-dimensional arrays.
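
One possible way around this, sketched here under assumptions, is to sample indices rather than elements and then index the array along the chosen dimension. The helper name split_along and its keywords are hypothetical; the sketch uses only randperm from the Random standard library and Base's selectdim.

```julia
using Random

# Hypothetical helper: split `A` along dimension `dim` by shuffling the
# indices of that dimension instead of sampling the elements directly.
function split_along(A::AbstractArray, dim::Integer; test_size::Real = 0.25,
                     rng::AbstractRNG = Random.default_rng())
    n = size(A, dim)
    n_test = round(Int, test_size * n)
    perm = randperm(rng, n)
    # selectdim returns views; copy materializes them as independent arrays.
    test = copy(selectdim(A, dim, perm[1:n_test]))
    train = copy(selectdim(A, dim, perm[n_test+1:end]))
    return train, test
end

X = reshape(collect(1:20), 4, 5)   # 4 features, 5 observations stored as columns
X_train, X_test = split_along(X, 2; test_size = 0.4, rng = MersenneTwister(1))
```

With observations stored as columns, dim = 2 keeps each observation's features together; passing dim = 1 would split by rows instead.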