tgc1997/RMN

Reproducing results for the MSR-VTT dataset

Closed this issue · 9 comments

Hi, Ganchao!
I am having difficulty reproducing the experimental results for MSR-VTT.
I have run the project on MSR-VTT several times and always get unsatisfactory results.
The CIDEr scores only fluctuate between 45 and 46.5, which is far from the 49.6 reported in the paper.
Would it be convenient for you to share the random seed values you set in your MSR-VTT experiments? (A generic seeding sketch is included below for reference.)
Training on MSR-VTT is very time-consuming, about 6 days on a single GPU.
Looking forward to your help, thanks!
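For reference, here is a minimal sketch of how random seeds might be fixed for a PyTorch run like this; the helper name `set_seed` and the exact set of libraries seeded are assumptions, not the actual code from this repository:

```python
import os
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Seed the common sources of randomness for a PyTorch run (hypothetical helper)."""
    random.seed(seed)                      # Python's built-in RNG
    np.random.seed(seed)                   # NumPy RNG
    torch.manual_seed(seed)                # CPU (and default CUDA) seed
    torch.cuda.manual_seed_all(seed)       # all GPU devices
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Trade some speed for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)  # call once, before building the model and data loaders
```

Even with identical seeds, some CUDA ops are nondeterministic, so small run-to-run differences can remain.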

One of the TensorBoard training logs is as follows:
[Screenshot: TensorBoard log, 2021-05-14 18:33:03]
As illustrated in the picture, the light blue line represents one of the training runs on the MSR-VTT dataset.
The output results:
BEST CIDEr(beam size = 2):
Bleu_1: 78.28
Bleu_2: 64.51
Bleu_3: 51.30
Bleu_4: 39.55
METEOR: 27.52
ROUGE_L: 59.84
CIDEr: 46.22
The dark blue line is another run I started on MSR-VTT yesterday without setting a random seed; the latter may turn out better than the former.

Hi! Has the latter one finished training?

Hi, Ganchao!
Thanks for your attention to the training process.
The training is not over yet; it will take 2 more days to finish all epochs.
So far, its log is as follows:
[Screenshot: TensorBoard log, 2021-05-16 21:32:15]
It seems better than before.
Could I stop training and evaluate it now? Is it appropriate to do so?

Hi, Ganchao!
The results of the latter experiment are as follows:
BEST CIDEr(beam size = 2):
Bleu_1: 79.03
Bleu_2: 65.13
Bleu_3: 51.72
Bleu_4: 40.05
METEOR: 27.88
ROUGE_L: 60.12
CIDEr: 46.59
BEST METEOR(beam size = 2):
Bleu_1: 79.63
Bleu_2: 65.68
Bleu_3: 52.00
Bleu_4: 40.06
METEOR: 28.07
ROUGE_L: 60.50
CIDEr: 47.36
The log screenshot is as follows:
[Screenshot: TensorBoard log, 2021-05-19 08:53:13]
While the CIDEr value is further improved at the best-METEOR epoch, there is still a gap to the reported 49.6.
Looking forward to your insights on the experiment, thanks!

What is the training batch size in your experiments? For MSR-VTT, the best result I got was with the setting: --learning_rate=1e-4 --learning_rate_decay --learning_rate_decay_every=5 --learning_rate_decay_rate=3 --hidden_size=1300 --train_batch_size=48 (a sketch of the step decay these flags imply follows the results below).
The results are as follows (the model is saved 8 times per epoch here):
[Screenshot: training log]
BEST CIDEr(beam size = 2):
Bleu_1: 80.51
Bleu_2: 67.49
Bleu_3: 54.40
Bleu_4: 42.54
METEOR: 28.43
ROUGE_L: 61.62
CIDEr: 49.60
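To be concrete, here is a rough sketch of what this schedule amounts to, assuming --learning_rate_decay_every=5 means the learning rate is decayed every 5 epochs and --learning_rate_decay_rate=3 means it is divided by 3 each time (that reading is an assumption; the exact semantics are defined by the repository's option parsing):

```python
def stepped_lr(epoch, base_lr=1e-4, decay_every=5, decay_rate=3.0):
    """Step decay sketch: divide base_lr by decay_rate once per decay_every epochs.
    One plausible reading of the --learning_rate_decay_* flags, not the repo's exact code."""
    return base_lr / (decay_rate ** (epoch // decay_every))


# With the settings above: epochs 0-4 train at 1e-4, epochs 5-9 at ~3.3e-5,
# epochs 10-14 at ~1.1e-5, and so on.
for epoch in (0, 5, 10, 15):
    print(epoch, stepped_lr(epoch))
```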

Hi, Ganchao!
The training batch size in all of my experiments is 8, due to GPU memory limitations (a gradient-accumulation sketch that might approximate a larger effective batch is included below).
So the key point is that model performance is affected by different batch size settings.
Would it be convenient for you to share the batch size setting for the MSVD dataset?
Thanks for your sincere sharing and generous help!
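One generic way to approximate a larger effective batch on a memory-limited GPU is gradient accumulation. The sketch below is plain PyTorch with toy stand-ins for the model and data (it is not code from this repository), and it does not reproduce every effect of a genuinely larger batch, e.g. in batch-statistics-dependent layers:

```python
import torch
from torch import nn

# Toy stand-ins so the sketch runs as-is; in practice these would be the
# real captioning model and data loader from the project.
model = nn.Linear(16, 4)
criterion = nn.CrossEntropyLoss()
loader = [(torch.randn(8, 16), torch.randint(0, 4, (8,))) for _ in range(12)]

ACCUM_STEPS = 6  # 6 micro-batches of size 8 ~ one effective batch of 48
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

optimizer.zero_grad()
for step, (features, targets) in enumerate(loader):
    loss = criterion(model(features), targets)
    (loss / ACCUM_STEPS).backward()   # scale so the accumulated gradient averages over 48 samples
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()              # one parameter update per 6 micro-batches
        optimizer.zero_grad()
```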

We set the batch size to 32 for MSVD.

Hi, Ganchao!
Got it.
Thanks again!