zlsh80826/MSMARCO

Is it possible to share trained model?

abaheti95 opened this issue · 3 comments

Hi,
Since training a new model from scratch will not guarantee the same results (and also takes a lot of time), would it be possible to share your trained model, the one that gives about 38-38 BLEU? It would be very helpful for everyone in the community.

Hi,

Here is the pretrained model, which scores 39.4 on both BLEU-1 and ROUGE-L on the V1 validation dataset: https://drive.google.com/file/d/17vQl-xLbuuq5cbu-K9cHc8OQoQMmdkTG/view?usp=sharing
After downloading it, put elmo_embedding.bin and vocabs.pkl under the "data" directory, and put the "model" directory in your MSMARCO root directory.
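
For anyone who prefers to script the placement step, here is a minimal sketch. It assumes the download has already been extracted so that elmo_embedding.bin, vocabs.pkl and the "model" directory sit directly inside one folder; the function name and both path arguments are hypothetical and not part of the repo.

```python
import shutil
from pathlib import Path


def place_pretrained_files(download_dir, repo_root):
    """Move the downloaded files into the layout described in the reply.

    Assumption: download_dir directly contains elmo_embedding.bin,
    vocabs.pkl and a "model" directory. Both arguments are hypothetical
    paths chosen by the caller.
    """
    download_dir = Path(download_dir)
    repo_root = Path(repo_root)

    # elmo_embedding.bin and vocabs.pkl go under <repo_root>/data
    data_dir = repo_root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    for name in ("elmo_embedding.bin", "vocabs.pkl"):
        shutil.move(str(download_dir / name), str(data_dir / name))

    # The "model" directory goes directly under <repo_root>.
    # Note: if <repo_root>/model already exists, shutil.move would nest
    # the new directory inside it, so remove or rename the old one first.
    shutil.move(str(download_dir / "model"), str(repo_root / "model"))
```

Calling `place_pretrained_files("path/to/extracted", "path/to/MSMARCO")` should leave the two files under data/ and the model directory at the repo root.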

Wow! Thank you so much for sharing this!

Can you also briefly explain how I can test this on another dataset? I am guessing you load pre-trained ELMo embeddings from elmo_embedding.bin. If that is the case, how can I test this model on a different dataset, say SQuAD? I am curious about its performance on SQuAD.

Sorry, this repo is designed for MSMARCO. You may modify the convert_msmarco.py code if you want to use the SQuAD dataset. Once you can convert the SQuAD data to TSV format (each line needs query_id, query_type, passage, question, answer, answer_start, answer_end), I think the following steps should work without problems.
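
As a rough starting point, the conversion could look something like the sketch below. It is not taken from convert_msmarco.py, and several things are assumptions: that the columns appear in the order listed above, that answer_start and answer_end are character offsets into the passage with an exclusive end (the repo may expect token offsets instead, so check convert_msmarco.py), and that a placeholder such as "unknown" is acceptable for query_type, since SQuAD has no such field. The function names are hypothetical.

```python
import json

# Tabs and newlines would break a one-record-per-line TSV file, so each
# one is replaced by a single space. This keeps string lengths unchanged,
# which keeps SQuAD's character offsets valid.
_CLEAN = str.maketrans({"\t": " ", "\n": " ", "\r": " "})


def squad_to_rows(squad, query_type="unknown"):
    """Yield (query_id, query_type, passage, question, answer,
    answer_start, answer_end) tuples from SQuAD v1.1-style JSON.

    Assumptions: offsets are character based with an exclusive end, and
    query_type is a placeholder because SQuAD does not provide one.
    """
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            passage = paragraph["context"].translate(_CLEAN)
            for qa in paragraph["qas"]:
                if not qa.get("answers"):
                    continue  # skip questions without an answer span
                first = qa["answers"][0]  # use only the first reference answer
                start = first["answer_start"]
                end = start + len(first["text"])
                yield (
                    qa["id"],
                    query_type,
                    passage,
                    qa["question"].translate(_CLEAN),
                    first["text"].translate(_CLEAN),
                    start,
                    end,
                )


def write_tsv(rows, path):
    """Write one tab-separated record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write("\t".join(str(field) for field in row) + "\n")


def convert(squad_json_path, tsv_path):
    """Read a SQuAD JSON file and write the TSV described above."""
    with open(squad_json_path, encoding="utf-8") as f:
        write_tsv(squad_to_rows(json.load(f)), tsv_path)
```

With the SQuAD JSON downloaded locally, `convert("train-v1.1.json", "squad_train.tsv")` would produce the TSV; both file names here are only examples.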