juaml/julearn

[BUG]: StratifiedBootstrap can put the same sample in both the train and test sets

Opened this issue · 0 comments

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

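In the code below, the indices are first drawn at random with replacement, and only then is the drawn array split into train and test. Because the split happens after the draw, the same sample can end up in both sets.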

bs_inds = np.random.choice(t_inds, len(t_inds), replace=True)
train.extend(bs_inds[:n_train])
test.extend(bs_inds[n_train:])
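A minimal sketch that reproduces the overlap outside julearn, assuming a single stratum of 20 indices and a hypothetical `n_train` of 14. The helper name `split_overlap` and the seeded generator are illustrative and not part of julearn.

```python
import numpy as np


def split_overlap(seed, n_samples=20, n_train=14):
    """Return the indices that appear in both train and test for one draw."""
    rng = np.random.default_rng(seed)
    t_inds = np.arange(n_samples)  # indices of one stratum (assumed)

    # Same logic as the snippet above: draw with replacement, then slice.
    bs_inds = rng.choice(t_inds, len(t_inds), replace=True)
    train = bs_inds[:n_train]
    test = bs_inds[n_train:]

    # Any index present in both slices is leaked from train into test.
    return np.intersect1d(train, test)


# Count how many of 50 seeded draws leak at least one sample.
n_leaky = sum(split_overlap(seed).size > 0 for seed in range(50))
print(f"{n_leaky} of 50 draws share samples between train and test")
```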

Expected Behavior

Whatever is chosen as test should not also be in train.

The current behavior does not match the out-of-bag bootstrap definition.

We should resample with replacement to build the train set, and every sample that was not drawn into the train set should form the test set.
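The expected behavior could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed julearn patch: the function name `stratified_oob_split`, its signature, and the use of `np.random.default_rng` are hypothetical.

```python
import numpy as np


def stratified_oob_split(y, rng):
    """One out-of-bag bootstrap split, resampled within each class of y.

    Hypothetical helper for illustration; not julearn's API.
    """
    y = np.asarray(y)
    train, test = [], []
    for label in np.unique(y):
        t_inds = np.flatnonzero(y == label)  # indices of this stratum

        # Train: resample the stratum with replacement, keeping its size.
        bs_inds = rng.choice(t_inds, len(t_inds), replace=True)
        train.extend(bs_inds)

        # Test: the out-of-bag samples, i.e. those never drawn for train.
        test.extend(np.setdiff1d(t_inds, bs_inds))
    return np.array(train, dtype=int), np.array(test, dtype=int)


y = np.array([0] * 30 + [1] * 10)
train, test = stratified_oob_split(y, np.random.default_rng(0))
print(np.intersect1d(train, test).size)  # 0: train and test are disjoint
```

Note that with this scheme the test set size varies from draw to draw, and a very small stratum can occasionally contribute no test samples at all.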

Having a proper out-of-bag test set would also allow us to implement the .632 and .632+ scoring correction methods.
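For reference, the plain .632 correction is a fixed weighted average of the score on the training (resubstitution) data and the score on the out-of-bag data. A minimal sketch, where the function name `score_632` is hypothetical; the .632+ variant additionally needs a no-information rate and is not shown here.

```python
def score_632(train_score, oob_score):
    """Combine resubstitution and out-of-bag scores with the .632 weights."""
    # 0.632 is approximately 1 - 1/e, the expected fraction of distinct
    # samples that end up in a bootstrap resample.
    return 0.368 * train_score + 0.632 * oob_score


# Example with made-up scores: perfect on train, 0.8 on out-of-bag data.
print(score_632(1.0, 0.8))
```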

Steps To Reproduce

latest julearn

Environment

not relevant

Relevant log output

No response

Anything else?

No response