How to prevent overfitting?

Question

xuzhang5788 opened this issue 4 years ago · 4 comments

I see that you set epochs=1000 and did not use early stopping. How do you prevent overfitting with so many epochs? Thanks.

In addition, I don't think you should use your test dataset to guide training (training.py). Normally, the test dataset must not be touched while training the model. Otherwise the model is effectively fitted to the test data, and the reported results say little about how well it generalizes. Your training_validation.py is probably the right way to get a generalizable model.

In your paper, your results are better than those of DeepDTA and WideDTA, but I am not sure they would hold on a completely new test dataset.
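To make the concern concrete, here is a minimal sketch of the protocol I have in mind: split the data into train/validation/test, stop training when the validation loss stops improving, and evaluate on the test split only once at the end. This is not GraphDTA code. The toy linear model, the synthetic data, and names such as `train_with_early_stopping`, `patience`, and `lr` are all assumptions made up for illustration.

```python
import random


def mse(w, b, data):
    """Mean squared error of the line y = w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, val, max_epochs=1000, patience=20, lr=0.05):
    """Full-batch gradient descent on `train`, monitored on `val` only.

    Returns (best_val_loss, w, b, best_epoch) for the best validation epoch.
    The test split is never seen here.
    """
    w, b = 0.0, 0.0
    best = (float("inf"), w, b, 0)
    for epoch in range(1, max_epochs + 1):
        # Gradients are computed from the training split only.
        gw = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        gb = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * gw
        b -= lr * gb

        val_loss = mse(w, b, val)
        if val_loss < best[0]:
            # Keep the parameters from the best validation epoch so far.
            best = (val_loss, w, b, epoch)
        elif epoch - best[3] >= patience:
            # No improvement for `patience` epochs: stop early.
            break
    return best


# Hypothetical synthetic data: y = 3x + 1 plus Gaussian noise.
rng = random.Random(0)
points = []
for _ in range(300):
    x = rng.uniform(-1, 1)
    points.append((x, 3.0 * x + 1.0 + rng.gauss(0, 0.1)))
rng.shuffle(points)

# Three disjoint splits: the test split is held out until the very end.
train, val, test = points[:200], points[200:250], points[250:]

best_val, w, b, best_epoch = train_with_early_stopping(train, val)
test_loss = mse(w, b, test)  # evaluated exactly once, after model selection
print(f"best epoch={best_epoch} val MSE={best_val:.4f} test MSE={test_loss:.4f}")
```

The point is that only the validation split drives model selection (which epoch to keep), so the test score stays an unbiased estimate.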

Answer 1 · 2021-02-23T06:07:15.000Z

Yes. training_validation.py can help avoid overfitting.

Answer 2 · 2021-02-23T08:07:57.000Z

I see that training_validation.py also uses the test dataset to guide training, so I think you have data leakage.

Answer 3 · 2021-02-23T08:17:22.000Z

Could you please show me where in https://github.com/thinng/GraphDTA/blob/master/training_validation.py testing is used to guide training? Thanks!

Answer 4 · 2021-02-23T08:21:13.000Z

Sorry, I reread your code and it looks okay.