mlvlab/Flipped-VQA

Cannot reproduce the result

Closed this issue · 2 comments

When I tried to reproduce the experimental results on the STAR dataset on my machine (a single RTX 4090 GPU) using the checkpoint provided on Google Drive, the accuracy was only around 25%. And when I tried to train the model with the following parameters, the loss quickly went to NaN. I'm not sure what the problem is.
```shell
python3 train.py --qav --vaq --max_seq_len 128 --batch_size 1 \
  --epochs 10 --bias 3 --tau 100. --max_feats 10 --warmup_epochs 2 \
  --dataset star --blr 9e-2 --weight_decay 0.16 --output_dir ./checkpoint/star \
  --accum_iter 8 --model llama-2-7b --llama_model_path ./pretrained/ \
  --num_workers 8
```
The batch_size is set to 1 due to VRAM constraints.
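For what it's worth, `--accum_iter 8` should partially compensate for the small per-step batch, since gradients are accumulated over 8 forward passes before each optimizer update. A minimal sketch of the arithmetic (the function name is illustrative, not part of the repo):

```python
def effective_batch_size(batch_size: int, accum_iter: int, num_gpus: int = 1) -> int:
    """Effective batch size seen by each optimizer step when using
    gradient accumulation: per-GPU batch * accumulation steps * GPUs."""
    return batch_size * accum_iter * num_gpus

# With the flags above (--batch_size 1 --accum_iter 8) on a single GPU:
print(effective_batch_size(1, 8, 1))  # -> 8
```

So the run above trains with an effective batch size of 8, which is still much smaller than multi-GPU setups and may interact with the base learning rate scaling (`--blr`).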

Hello,
I am facing the same issue.
Could you share how you solved the problem?

LLaMA-2 does not work with this framework. Just download the LLaMA-1 model from this issue (the Hugging Face link), and the model will run properly.