consider providing training/test split info
Closed this issue · 0 comments
mcfrank commented
if we want to incentivize appropriate validation steps, we could consider adding a get_train_test_split method that takes:
- a type argument: "corpus", "child", or "token"
- a proportion argument for how much goes in test (e.g., default 10%)

and returns a random filter for the training and test split that can be passed to the various other get_ methods.
can't tell if this is a good idea, but it might make it easier to do safe exploration + cross-validation...
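
For discussion, here is a rough Python sketch of what such a helper might look like. Everything beyond the get_train_test_split name and its type and proportion arguments is an assumption made for illustration: the record layout (one dict per token), the id column names in ID_COLUMNS, the seed argument, and the apply_split helper are all hypothetical and do not reflect the package's actual API.

```python
import random

# Hypothetical id column for each split level; not taken from any real API.
ID_COLUMNS = {"corpus": "corpus_id", "child": "target_child_id", "token": "id"}


def get_train_test_split(records, type="child", proportion=0.1, seed=None):
    """Randomly assign the units at the chosen level to train or test.

    records: iterable of dicts, one per token (assumed layout).
    Returns a dict holding the split level plus "train" and "test" id sets.
    """
    if type not in ID_COLUMNS:
        raise ValueError(f"type must be one of {sorted(ID_COLUMNS)}")
    if not 0 <= proportion <= 1:
        raise ValueError("proportion must be between 0 and 1")
    column = ID_COLUMNS[type]
    # Sort the unique ids so the same seed always produces the same split.
    units = sorted({record[column] for record in records})
    rng = random.Random(seed)
    rng.shuffle(units)
    n_test = round(len(units) * proportion)
    return {"type": type, "test": set(units[:n_test]), "train": set(units[n_test:])}


def apply_split(records, split, part):
    """Keep only the records whose unit id falls in the requested part."""
    column = ID_COLUMNS[split["type"]]
    return [record for record in records if record[column] in split[part]]
```

Splitting on unit ids rather than on rows means that with type="child" all of a child's tokens land on the same side, which is what keeps the held-out set clean. The returned id sets play the role of the "filter": a caller would pass them to apply_split here, or to whatever filtering the get_ methods end up supporting.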