Question about the scores?
Stephen-Adams opened this issue · 3 comments
I ran your code but got higher scores than I expected, so I suspect there may be a mistake in my settings. Could you help me? Thank you.
For example: with vgg19 + s2vt without attention, I got:
"CIDEr": 0.381709195850067,
"Bleu_4": 0.35092030557193526,
"Bleu_3": 0.46800626456106637,
"Bleu_2": 0.6047642387263332,
"Bleu_1": 0.7574938986755618,
"ROUGE_L": 0.5712265574740849,
"METEOR": 0.25508078041867904
for my best model.
However, I didn't change anything important in your code.
I split the training dataset downloaded from the README into 6513/497/2990 for train/val/test.
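For reference, a minimal sketch of a 6513/497/2990 split. It assumes the videos are numbered 0 to 9999 and are split in index order; the function name `split_indices` is hypothetical and not taken from the repository, so the actual code may split differently.

```python
def split_indices(n_total=10000, n_train=6513, n_val=497):
    """Split video indices into train/val/test by index order.

    Assumption: videos are identified by consecutive integer indices
    and the split is contiguous (train first, then val, then test).
    """
    ids = list(range(n_total))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


train_ids, val_ids, test_ids = split_indices()
print(len(train_ids), len(val_ids), len(test_ids))
```

If the split used for evaluation differs from the one used in the papers (for example, a shuffled split or scoring on the validation part), the resulting scores are not directly comparable.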
The training loss is as follows:
model_0, loss: 57.772758
model_10, loss: 44.913509
model_20, loss: 40.874763
model_30, loss: 40.119427
model_40, loss: 37.268291
model_50, loss: 33.424942
model_60, loss: 35.766853
model_70, loss: 34.876366
model_80, loss: 31.450918
model_90, loss: 29.820242
model_100, loss: 29.936274
model_110, loss: 30.059401
model_120, loss: 30.751385
model_130, loss: 28.711311
model_140, loss: 29.971272
model_150, loss: 30.382835
model_160, loss: 28.844414
model_170, loss: 26.373568
model_180, loss: 28.996819
model_190, loss: 27.722120
model_200, loss: 28.414360
model_210, loss: 25.155075
model_220, loss: 27.731709
model_230, loss: 28.479822
model_240, loss: 26.850664
model_250, loss: 26.169445
model_260, loss: 27.791225
model_270, loss: 25.879797
model_280, loss: 24.860294
model_290, loss: 24.067417
model_300, loss: 23.089293
model_310, loss: 24.369297
model_320, loss: 24.594177
model_330, loss: 24.342461
model_340, loss: 24.752075
model_350, loss: 25.322969
model_360, loss: 25.452364
model_370, loss: 22.378075
model_380, loss: 24.766953
model_390, loss: 22.536497
model_400, loss: 21.342590
I trained the model for only 400 epochs, because I found that the model from around epoch 100 performs better.
With "model_100", I got the best scores, as shown above.
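The observation above can be checked against the log with a small sketch. It parses a few of the loss lines listed earlier and finds the checkpoint with the lowest training loss; the helper name `parse_losses` is hypothetical. The point it illustrates is that the lowest training loss (model_400) is not the checkpoint that scored best (model_100), so checkpoints are better chosen by a validation metric than by training loss.

```python
import re

# A subset of the training log lines from this issue.
LOG = """\
model_0, loss: 57.772758
model_100, loss: 29.936274
model_200, loss: 28.414360
model_300, loss: 23.089293
model_400, loss: 21.342590
"""


def parse_losses(text):
    """Return {epoch: loss} parsed from 'model_N, loss: X' lines."""
    pattern = re.compile(r"model_(\d+), loss: ([0-9.]+)")
    return {int(m.group(1)): float(m.group(2)) for m in pattern.finditer(text)}


losses = parse_losses(LOG)
lowest_loss_epoch = min(losses, key=losses.get)
print("lowest training loss at epoch", lowest_loss_epoch)
print("loss at epoch 100:", losses[100])
```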
I am new to this and don't know what is wrong.
I would appreciate your help.
This is a normal result
Really? But I have found papers that compare their results to s2vt, and the s2vt scores they report are lower than the ones obtained with your code. The scores shown in such a paper may be
"CIDEr": 0.351,
"Bleu_4": 0.326,
"ROUGE_L": 0.561,
"METEOR": 0.255
for example.
CIDEr and Bleu_4 in particular are clearly lower in those papers. 0.0
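To make the gap concrete, here is a small sketch that subtracts the paper scores quoted above from the scores obtained with model_100. The numbers are the ones posted in this issue; the function name `score_gaps` is hypothetical.

```python
# Scores obtained in this issue with model_100.
ours = {
    "CIDEr": 0.381709195850067,
    "Bleu_4": 0.35092030557193526,
    "ROUGE_L": 0.5712265574740849,
    "METEOR": 0.25508078041867904,
}

# s2vt scores as quoted from a paper in this issue.
paper = {
    "CIDEr": 0.351,
    "Bleu_4": 0.326,
    "ROUGE_L": 0.561,
    "METEOR": 0.255,
}


def score_gaps(ours, reference):
    """Return ours - reference for every metric in the reference."""
    return {name: ours[name] - reference[name] for name in reference}


gaps = score_gaps(ours, paper)
# Print the metrics from largest to smallest absolute gap.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    relative = 100 * gap / paper[name]
    print(f"{name:8s} {gap:+.4f} ({relative:+.1f}% relative)")
```

The gap is largest for CIDEr and Bleu_4, while METEOR is almost identical, which matches the remark above.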