Issues
How to infer captions on my own images
#31 opened by victorup - 1
Where can I find annotations for SNLI-VE?
#32 opened by 1219521375 - 0
Checkpoint for GQA model
#12 opened by aurooj - 0
Pretrained weights for VQA
#34 opened by guanhdrmq - 2
Checkpoint for SNLI-VE
#26 opened by sramshetty - 1
Pythia Feature Extraction
#28 opened by shamanthak-hegde - 1
Data dir for mcan_clip_grid_feature.py
#24 opened by Fly2flies - 0
About precompute
#25 opened by StylesZhang - 3
Captioning model training script fails
#2 opened by j-min - 4
The clip_feature
#19 opened by Timon0327 - 1
CLIP-VIT-B-Transformer captioning results
#20 opened by YuanEZhou - 2
About the training time of Pythia
#15 opened by tingxueronghua - 1
Train with a single GPU
#16 opened by ruinianxu - 2
Missing link
#11 opened by jdiazram - 1
configuration file for CLIP-Res50x4
#10 opened by itsyoavshalev - 1
evaluating vqa using pythia
#9 opened by itsyoavshalev - 1
Grad-CAM visualization code
#7 opened by yangbang18 - 0
Grad-CAM visualization code
#8 opened by yangbang18 - 1
Pretrained weights for image captioning
#6 opened by zhuang93 - 1
How to combine CLIP with Oscar (or VinVL)?
#4 opened by 594422814 - 1
About clip feature extraction
#5 opened by LittleDonkey1203 - 0
MS COCO Caption scores with MLE objective
#1 opened by j-min