LalehSeyyed/CheXclusion

data split

Closed this issue · 7 comments

Hi,

Thanks for the code, it's very helpful.
I was wondering in which script you split the data into train, validation, and test sets. I noticed that MIMIC-CXR-JPG already has an official CSV called data_split, but all the data are labeled "train" in that CSV.

Thank you!

Best,
Zixiao

Hi Laleh,

Thank you very much for the quick reply. Yes, I agree, and I noticed the same problem: the training set is more than 97% of the data.
In which script did you put the data-splitting code? I couldn't find it.
Thank you!

Best,
Zixiao

That's very helpful, thank you Laleh!

Hi Laleh,

I was wondering what the input of the random_split() function is. Is it a ratio (such as 0.8 if the train set is 80% of the data)?
Thank you!
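
For anyone reading along: this thread does not show how random_split() is defined in the repo, so the snippet below is only a minimal sketch of a ratio-based split, not the author's code. The function name split_by_ratio, the default ratios, and the fake patient IDs are all hypothetical. It splits on unique IDs so that all images from one patient would land in the same subset, which is an assumption about what is wanted here.

```python
import random


def split_by_ratio(ids, train_ratio=0.8, val_ratio=0.1, seed=0):
    """Shuffle unique IDs and cut them into train/val/test lists by ratio.

    Hypothetical helper for illustration only; whatever is left after the
    train and validation cuts becomes the test set.
    """
    unique_ids = sorted(set(ids))      # de-duplicate, fixed starting order
    rng = random.Random(seed)          # local RNG so the split is reproducible
    rng.shuffle(unique_ids)

    n_train = int(len(unique_ids) * train_ratio)
    n_val = int(len(unique_ids) * val_ratio)

    train = unique_ids[:n_train]
    val = unique_ids[n_train:n_train + n_val]
    test = unique_ids[n_train + n_val:]
    return train, val, test


# Made-up patient IDs standing in for the real subject identifiers.
patient_ids = [f"p{i:03d}" for i in range(100)]
train_ids, val_ids, test_ids = split_by_ratio(patient_ids)
print(len(train_ids), len(val_ids), len(test_ids))
```

Each image row would then be assigned to a subset by looking up its patient ID in the three lists. If the function in question is instead PyTorch's torch.utils.data.random_split, note that it takes a list of subset lengths, and recent versions also accept fractions that sum to 1.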

Thank you! That's very helpful