miccaiif/WENO

Question about reproducing the results on Camelyon16


Thanks for your great work!

I followed the instructions in the paper and its appendix to reproduce the results on Camelyon16.
However, I observed that training was slow to converge (the loss only starts to decrease quickly at around the 100th epoch). Possibly as a consequence, I also found that the bag-level AUC of the teacher was only about 0.6, whereas the instance-level AUC of the student was extremely high (0.94).

I am not sure whether I used the same hyperparameters as the paper, so could you please provide the full hyperparameter settings for training on Camelyon16?

Through experiments, I found that the issues above were largely due to the following settings (as provided in the source code of this repo):

  • batch size = 1
  • a plain SGD optimizer

I changed them as follows:

  • batch size = 4, realized via gradient accumulation
  • the Adam optimizer

With these changes, everything became much more reasonable; a minimal sketch of the modified training loop is below.
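For reference, this is roughly what I mean, assuming a standard PyTorch loop with a per-step batch size of 1. The tiny model, dummy bags, and learning rate here are placeholders of my own, not the repo's actual code or the paper's hyperparameters:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the repo's actual model, data, and loss:
# a tiny bag-level classifier on random features, just to show the loop.
model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 1))
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is a guess

ACCUM_STEPS = 4  # effective batch size = 4 with per-step batch size 1
dummy_bags = [(torch.randn(1, 512), torch.rand(1, 1).round()) for _ in range(16)]

optimizer.zero_grad()
for step, (bag, label) in enumerate(dummy_bags):
    loss = criterion(model(bag), label)
    # Scale the loss so the accumulated gradient equals the mean over 4 bags.
    (loss / ACCUM_STEPS).backward()
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()       # one update per 4 forward/backward passes
        optimizer.zero_grad()
```

Dividing each loss by ACCUM_STEPS makes the accumulated gradient match the mean over an effective batch of 4, so a single optimizer step behaves like a true batch-size-4 update.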

Thanks for your attention and your contribution!