kingsaint/BioMedical-EL

Tuned hyperparameters for end-to-end experiment for the collective dual encoder


I was hoping to train a model that reproduces the results of the end-to-end experiment for the collective dual encoder without having to redo the hyperparameter tuning. Would you mind providing the tuned training hyperparameters where they differ from your CLI defaults (e.g. number of epochs, gradient accumulation steps, stopping criterion), as well as the final inference parameters (e.g. gamma)?

Most of the CLI defaults are the final hyperparameters, except for gamma and num_train_epoch.
num_train_epoch = 10 to 20, depending on the dataset.
We tuned gamma on the dev set: γ = 0.6 for BC5CDR and γ = 0.5 for MedMentions produced the best results for us.
We didn't have enough computing resources to do a full hyperparameter search; we only tried different learning rates and different values of γ. Tuning the other hyperparameters might improve the results.
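
For anyone landing here later, the answer above maps to an invocation along these lines. This is a hedged sketch: the script name `run_e2e.py` and the flag spellings `--num_train_epochs` and `--gamma` are assumptions, not confirmed against this repo's CLI; only the values come from the answer.

```bash
# Hypothetical invocation: script name and flag spellings are assumptions,
# not copied from the repo; only the values come from the maintainer's answer.

# BC5CDR: gamma tuned to 0.6 on the dev set; epochs were 10-20 depending on the dataset
python run_e2e.py --num_train_epochs 15 --gamma 0.6

# MedMentions: gamma tuned to 0.5 on the dev set
python run_e2e.py --num_train_epochs 20 --gamma 0.5

# All other hyperparameters are left at their CLI defaults, per the answer above.
```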