amazon-science/sentence-representations

How to reproduce the results in the paper?

beckybai opened this issue · 1 comment

Hi, I followed the suggested code and used the pre-trained model, but I cannot reproduce the results listed in the paper.

I changed the batch size from 1024 to 256 and lowered lr_scale from 100 to 25. Do you have any suggestions on how to adjust the other hyperparameters?

Thank you!

Please use the default hyperparameters listed in our paper. In particular, lr_scale should be set to 100, which promotes stable training and good performance. As for the batch size, contrastive learning benefits from a large batch size. If you cannot train with a batch size of 1024, try 512 instead. Thanks.
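
To illustrate the batch-size point, here is a minimal sketch, not taken from this repository. It assumes a generic InfoNCE-style objective in which each anchor has one positive and treats every other example in the batch as a negative; the actual loss used in the paper may differ. The function names and the temperature value are hypothetical. Under that assumption, a smaller batch gives each anchor fewer negatives, so the contrastive task becomes easier and provides a weaker training signal.

```python
import math


def info_nce_loss(sim_row, positive_index, temperature=0.05):
    """Cross-entropy of picking the positive among all candidates in one row.

    sim_row holds the similarity of one anchor to every candidate
    (one positive plus the in-batch negatives). Hypothetical helper,
    not the repository's loss function.
    """
    logits = [s / temperature for s in sim_row]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[positive_index]


def negatives_per_anchor(batch_size):
    # Assumption: one positive per anchor, all other candidates are negatives.
    return batch_size - 1


for batch_size in (256, 512, 1024):
    # An encoder that scores every candidate equally can only guess,
    # so its loss equals log(number of candidates).
    uniform_row = [0.0] * batch_size
    loss = info_nce_loss(uniform_row, positive_index=0)
    print(batch_size, negatives_per_anchor(batch_size), round(loss, 3))
```

The loss of a uniform guess grows as log(batch size), which is one way to see that a batch of 1024 poses a harder discrimination problem per anchor than a batch of 256.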