MILVLG/mcan-vqa

Improved results on val set but decreased results in online evaluation

trahman8 opened this issue · 1 comments

Thanks for providing the code for this interesting project. I followed your approach and trained the network with the default hyper-parameter settings (python3 run.py --RUN='train'). During validation the model performs well; for that I used

python3 run.py --RUN='val' --CKPT_PATH=str

But when I ran the online evaluation, the performance did not improve. I used

python3 run.py --RUN='test' --CKPT_PATH=str

to generate the JSON for online evaluation. Am I missing something? Am I using the correct split? Or, for online evaluation (i.e. on test-dev and test-std), do we need to train the network on a different split?

The test-dev results in our paper use train+val+vg as the training set, while the val results use only the train split. I guess this could be the problem.
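
To make the split mismatch concrete, here is a minimal sketch of a sanity check one could run before comparing scores. It is not code from this repository: the function names `parse_split` and `check_eval_split` are hypothetical, and the '+'-joined split string simply mirrors the train+val+vg notation used above.

```python
def parse_split(split):
    """Break a '+'-joined split string such as 'train+val+vg' into its parts."""
    return {part.strip() for part in split.split("+") if part.strip()}


def check_eval_split(train_split, eval_split):
    """Report whether eval_split was held out from the training data.

    Hypothetical helper: returns 'leaky' when the evaluation split is part of
    the training split (scores on it would be inflated), else 'held-out'.
    """
    if eval_split in parse_split(train_split):
        return "leaky"
    return "held-out"


if __name__ == "__main__":
    # Compare the two training setups mentioned in the reply above.
    for train_split in ("train", "train+val+vg"):
        for eval_split in ("val", "test-dev"):
            print(train_split, eval_split, check_eval_split(train_split, eval_split))
```

Under these assumptions, a model trained on train can be scored fairly on val, while a model trained on train+val+vg should only be judged on test-dev or test-std, since val is already part of its training data.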