Question about reproducing the results on Camelyon16
Thanks for your great work!
I followed the instructions in the paper and its appendix to reproduce the results on Camelyon16.
However, I observed that training was slow to converge (the loss only starts to decrease quickly at around the 100th epoch). Possibly as a result, I also found that the bag-level AUC of the teacher was only about 0.6, whereas the instance-level AUC of the student was very high (0.94).
I am not sure whether I used the same hyper-parameters as the paper, so could you provide the full hyper-parameter settings for training on Camelyon16?
Through further experiments, I found that the issues above were largely due to:
- batch size = 1
- a plain SGD optimizer

(both as provided in the source code of this repo)
I changed them as follows:
- batch size = 4, realized by gradient accumulation (see the sketch below)
- the Adam optimizer

With these changes, everything became much more reasonable.
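For anyone else hitting this, here is a minimal sketch of the gradient-accumulation loop I used. It is PyTorch, and `train_one_epoch`, `bag_loader`, `criterion`, and the learning rate are my own placeholders for illustration, not names or values from this repo or the paper:

```python
import torch

def train_one_epoch(model, bag_loader, criterion, optimizer, accum_steps=4):
    """One epoch with gradient accumulation: an effective batch size of
    `accum_steps` while still feeding one bag through the model at a time."""
    model.train()
    optimizer.zero_grad()
    for step, (bag, label) in enumerate(bag_loader):
        loss = criterion(model(bag), label)
        # Scale the loss so the accumulated gradient is the mean over the
        # effective batch rather than the sum.
        (loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

# Adam instead of plain SGD (the lr here is illustrative only):
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

Accumulating over 4 bags before each optimizer step keeps memory usage the same as batch size 1 but gives a less noisy gradient, which seems to be what stabilized training in my runs.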
Thanks for your attention and your contribution!