mertyg/vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" (Oral @ ICLR 2023).
Python · MIT License
Issues
Evaluation bug when using GELU vs QuickGELU -- changes the results for some benchmarks
#35 opened by bryant1410 · 4 comments
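The GELU-vs-QuickGELU issue above comes down to a small numeric difference between two activations: OpenAI's original CLIP uses QuickGELU, `x * sigmoid(1.702 * x)`, while many re-implementations default to the exact GELU, `x * Phi(x)`. A minimal pure-Python sketch of the gap (illustrative only, not the repo's code):

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def quick_gelu(x):
    """QuickGELU: sigmoid-based approximation used in the original CLIP."""
    return x * (1.0 / (1.0 + math.exp(-1.702 * x)))

# The two agree at 0 but diverge slightly elsewhere, e.g. near x = 1;
# accumulated over many transformer layers, this can shift benchmark scores.
diff = abs(gelu(1.0) - quick_gelu(1.0))
```

Loading a QuickGELU-trained checkpoint into a model built with exact GELU (or vice versa) is therefore a silent evaluation mismatch.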
About the performance of the original CLIP
#32 opened by hiker-lw · 1 comment
Question regarding numbers in Figure 1
#36 opened by YunYunY · 1 comment
Questions on evaluation results
#33 opened by ytaek-oh · 1 comment
Exact hyperparameters for NegCLIP training, and question about ImageNet accuracy reported in the paper
#4 opened by HarmanDotpy · 1 comment
I cannot run on RTX 3060 with batch-size=256!
#30 opened by shuguang99 · 10 comments
NegCLIP training result problem
#27 opened by haoshuai714 · 2 comments
I can't reproduce Table 6
#29 opened by shuguang99 · 2 comments
Parameter file problem
#28 opened by haoshuai714 · 0 comments
Model weights of regular COCO finetuning
#25 opened by wildphoton · 8 comments
FLAVA image preprocessing
#24 opened by DianeBouchacourt · 1 comment
Slow evaluation for XVLM
#23 opened by lezhang7 · 7 comments
Projections W_i and W_t
#22 opened by DianeBouchacourt · 2 comments
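The W_i / W_t issue above refers to CLIP-style learned projections that map image and text encoder outputs into a shared embedding space before scoring. A toy sketch of that scoring path (all matrices, dimensions, and values below are illustrative, not the repo's actual weights):

```python
import math

def matvec(W, v):
    """Project vector v with matrix W (rows = output dimensions)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

# Toy 2x3 projections W_i (image) and W_t (text), and 3-dim encoder features.
W_i = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_t = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
img_feat, txt_feat = [1.0, 2.0, 3.0], [2.0, 1.0, 3.0]

# Project both modalities into the shared space, then compare.
score = cosine(matvec(W_i, img_feat), matvec(W_t, txt_feat))
```

In real CLIP the projected embeddings are unit-normalized and scaled by a learned temperature before the contrastive loss; the sketch keeps only the projection-then-similarity structure.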
Where to find the training data of NegCLIP?
#21 opened by Wyattwwwww · 1 comment
Call model.eval() when computing scores, otherwise results are non-deterministic (torch.no_grad() is not enough)
#17 opened by DianeBouchacourt · 8 comments
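The issue above reflects a general PyTorch pitfall: `torch.no_grad()` only disables autograd bookkeeping, while stochastic layers such as Dropout keep sampling random masks until `model.eval()` switches them to inference behavior. A minimal sketch (assuming PyTorch; the model here is a toy stand-in, not one from this repo):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(1, 4)

model.train()
with torch.no_grad():
    # Dropout is still active: two forward passes generally differ,
    # even though no gradients are being tracked.
    a, b = model(x), model(x)

model.eval()
with torch.no_grad():
    # Dropout (and BatchNorm, if present) now use inference behavior:
    # repeated forward passes give identical scores.
    c, d = model(x), model(x)
```

For evaluation code, the safe pattern is to call both: `model.eval()` for deterministic layer behavior and `torch.no_grad()` to skip gradient overhead.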
Requirements (e.g. torch versions)
#16 opened by DianeBouchacourt · 1 comment
Models are not in eval() mode
#7 opened by linzhiqiu · 3 comments
Questions on BLIP score computation
#15 opened by DianeBouchacourt · 25 comments
Mismatching results on compositional task
#9 opened by lezhang7 · 2 comments
Eval COCO order and Flickr order
#11 opened by lezhang7 · 4 comments
Question about VG-Relation categories
#8 opened by hiker-lw · 6 comments
Why concat df to all_df?
#5 opened by lezhang7 · 5 comments
When can you provide code and dataset?
#2 opened by BigHyf