Trying to split the dataset myself
aguang1201 opened this issue · 3 comments
Version 0.3.0 is very good. Congratulations!
I want to share one of the problems I encountered.
I set the config as:
I increased the number of training samples. I expected this to improve the mean AUC in the result, but it did not.
The result is mean auroc: 0.7680469487585274, which is lower than the result with the default settings.
I do not understand why. Could you tell me why the default split worked better?
And how should I set train_patient_count to improve the AUC?
Thank you for your nice work.
@aguang1201 The use_default_split option is deprecated in 0.3.0. Please specify your own dataset split using the new option dataset_csv_dir, and check sample.config.ini for details. I made this decision because many people found the old option confusing. I will update the source code to warn people who use these deprecated options. Thank you.
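For anyone who needs to build their own split before pointing dataset_csv_dir at it, here is a minimal sketch of a patient-level split. It is not this project's code: the column name "Patient ID", the output file names train.csv, dev.csv and test.csv, and the helper names split_by_patient and write_splits are all assumptions for illustration, so check sample.config.ini for the layout the project actually expects. The idea is only that all images of one patient land in the same split, so the dev and test sets do not leak patients from training.

```python
import csv
import random
from pathlib import Path


def split_by_patient(rows, patient_key, train_patient_count, dev_patient_count, seed=0):
    """Split rows into train/dev/test so that no patient appears in two splits.

    rows is a list of dicts (e.g. from csv.DictReader); patient_key is the
    column holding the patient identifier (hypothetical name, adjust to your CSV).
    """
    # Sort first so that the shuffle is reproducible for a given seed.
    patients = sorted({row[patient_key] for row in rows})
    random.Random(seed).shuffle(patients)

    train_ids = set(patients[:train_patient_count])
    dev_ids = set(patients[train_patient_count:train_patient_count + dev_patient_count])

    splits = {"train": [], "dev": [], "test": []}
    for row in rows:
        pid = row[patient_key]
        if pid in train_ids:
            splits["train"].append(row)
        elif pid in dev_ids:
            splits["dev"].append(row)
        else:
            # Every remaining patient goes to the test split.
            splits["test"].append(row)
    return splits


def write_splits(splits, fieldnames, output_dir):
    """Write one CSV per split (assumed names: train.csv, dev.csv, test.csv)."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, split_rows in splits.items():
        with open(out / f"{name}.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(split_rows)
```

Keeping the seed and the dev/test patient counts fixed while changing only train_patient_count makes AUC numbers from different runs easier to compare, since the evaluation patients stay the same.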
Thank you for your reply.
Unfortunately, it was removed. I think the dataset splitting feature is very useful.
But it does not matter; I can add it in my own code.
I just want to look into why the AUC got lower when I increased the training data.
Is the amount of training data enough, or is the dev data not enough?
This is really hard to figure out.
In addition: have you tried NASNet?
@aguang1201 I couldn't understand you. Since this issue is resolved, I will close it. You can send me an email if you have ad hoc questions.