help: what's the meaning of "r_loss, f_loss, pseudo_labels, wra_loss" in "run_retrieval.py"?
Ammexm opened this issue · 1 comment
Ammexm commented
Junction4Nako commented
args.use_phrase is an experimental argument I used to test whether WPG can improve fine-tuning for image-text retrieval. It gave a negative result, so I suggest not using it.
r_loss is the VSC loss, similar to the contrastive losses in ALBEF and CLIP; it is applied to the uni-modal global embeddings.
f_loss is the ITM loss, which is applied to the multi-modal outputs.
pseudo_labels are the labels of the sampled positive and negative pairs used in ITM; the hard-negative pairs are sampled from the similarity distribution computed in VSC.
wra_loss is the WPG loss, the same as in pre-training.
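To make the relation between r_loss and pseudo_labels more concrete, here is a minimal pure-Python sketch. It is not the code from run_retrieval.py: the function names (softmax, contrastive_loss, sample_itm_pairs) are hypothetical, and it assumes an ALBEF-style setup in which a square image-text similarity matrix holds the matched pairs on the diagonal and one hard-negative text is drawn per image. The real script may differ in details such as temperature scaling or the number of negatives.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_loss(sim):
    """Symmetric contrastive loss over a square similarity matrix.

    sim[i][j] is the similarity of image i and text j; the matched
    pair is assumed to sit on the diagonal (an assumption of this sketch).
    """
    n = len(sim)
    i2t = [softmax(row) for row in sim]  # image -> text
    t2i = [softmax([sim[i][j] for i in range(n)]) for j in range(n)]  # text -> image
    loss_i2t = -sum(math.log(i2t[i][i]) for i in range(n)) / n
    loss_t2i = -sum(math.log(t2i[j][j]) for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2


def sample_itm_pairs(sim, rng):
    """Build (image, text) pairs and their labels for a matching loss.

    For each image: the matched text gets label 1, and one hard-negative
    text, sampled in proportion to its softmax similarity with the
    matched text masked out, gets label 0.
    """
    n = len(sim)
    pairs, pseudo_labels = [], []
    for i in range(n):
        probs = softmax(sim[i])
        probs[i] = 0.0  # never sample the true match as a negative
        neg = rng.choices(range(n), weights=probs, k=1)[0]
        pairs += [(i, i), (i, neg)]
        pseudo_labels += [1, 0]
    return pairs, pseudo_labels


# Toy 3x3 similarity matrix with strong diagonal (matched) scores.
sim = [[5.0, 1.0, 0.0],
       [1.0, 5.0, 0.0],
       [0.0, 1.0, 5.0]]
r_loss = contrastive_loss(sim)
pairs, pseudo_labels = sample_itm_pairs(sim, random.Random(0))
```

In this sketch, the pairs and pseudo_labels would then be fed to a binary matching classifier on the multi-modal outputs, whose cross-entropy plays the role of f_loss.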