*Figure: the overall framework of the Context-Aware Graph (CAG).*

This is a PyTorch implementation of Iterative Context-Aware Graph Inference for Visual Dialog, CVPR 2020.
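The paper's core idea is iterative inference over a context-aware graph whose nodes carry visual and dialog-history information. As rough orientation only, below is a heavily simplified sketch of generic iterative message passing over a soft, fully connected graph. It is *not* the paper's exact formulation: every name in it (`GraphInferenceSketch`, `num_rounds`, `edge_score`) is illustrative, and it targets a recent PyTorch release rather than the v0.3.1 this repo uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphInferenceSketch(nn.Module):
    """Illustrative sketch only -- NOT the paper's exact model.
    Runs T rounds of message passing over a soft, fully connected
    graph of node features (e.g. visual objects + dialog history)."""

    def __init__(self, node_dim, num_rounds=3):
        super().__init__()
        self.num_rounds = num_rounds                   # T inference iterations
        self.edge_score = nn.Linear(2 * node_dim, 1)   # pairwise edge logits
        self.update = nn.GRUCell(node_dim, node_dim)   # node-state update

    def forward(self, nodes):
        # nodes: (batch, num_nodes, node_dim)
        B, N, D = nodes.size()
        for _ in range(self.num_rounds):
            # Score every directed node pair from concatenated features.
            src = nodes.unsqueeze(2).expand(B, N, N, D)
            dst = nodes.unsqueeze(1).expand(B, N, N, D)
            logits = self.edge_score(torch.cat([src, dst], dim=-1)).squeeze(-1)
            adj = F.softmax(logits, dim=-1)            # soft adjacency (B, N, N)
            messages = torch.bmm(adj, nodes)           # aggregate neighbor features
            # GRU-style update of each node with its incoming message.
            nodes = self.update(
                messages.reshape(B * N, D), nodes.reshape(B * N, D)
            ).reshape(B, N, D)
        return nodes
```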
If you use this code in your research, please consider citing:
```
@InProceedings{Guo_2020_CVPR,
  author    = {Guo, Dan and Wang, Hui and Zhang, Hanwang and Zha, Zheng-Jun and Wang, Meng},
  title     = {Iterative Context-Aware Graph Inference for Visual Dialog},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2020}
}
```
This code is implemented in PyTorch v0.3.1 and provides out-of-the-box support for CUDA 9 and cuDNN 7.
- Download the VisDial v1.0 dialog json files and images from here.
- Download the word counts file for VisDial v1.0 train split from here.
- Extract image features using Faster R-CNN from here.
- Download pre-trained GloVe word vectors from here; a sketch of loading them is given after this list.
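The word-counts file defines the dataset vocabulary, and each GloVe file is plain text with one token per line followed by its vector. Below is a minimal sketch of turning the two into an embedding matrix; `glove_path` and `word2ind` are placeholders for your local path and word-to-index mapping, not names from this repo.

```python
import numpy as np
import torch

def load_glove_embeddings(glove_path, word2ind, dim=300):
    """Build a (vocab_size, dim) embedding matrix from a GloVe .txt file.
    Tokens absent from GloVe keep a small uniform random initialization."""
    matrix = np.random.uniform(-0.1, 0.1, (len(word2ind), dim)).astype(np.float32)
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if parts[0] in word2ind:
                matrix[word2ind[parts[0]]] = np.asarray(parts[1:], dtype=np.float32)
    return torch.from_numpy(matrix)
```

The resulting tensor can then be copied into an `nn.Embedding` layer's weight before training.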
Train the CAG model as:
```sh
python train/train_D_1.0.py --CUDA
```
Evaluation of a trained model checkpoint can be done as follows:
```sh
python eval/evaluate.py --model_path [path_to_root]/save/XXXXX.pth --cuda
```
This will generate an EvalAI submission file; submit the JSON file to the online evaluation server to get results on v1.0 test-std:
Model | NDCG | MRR | R@1 | R@5 | R@10 | Mean Rank |
---|---|---|---|---|---|---|
CAG | 56.64 | 63.49 | 49.85 | 80.63 | 90.15 | 4.11 |
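For reference, the submission file written by `eval/evaluate.py` follows (to the best of our knowledge; check the EvalAI challenge page for the authoritative schema) the standard VisDial format: one record per image and dialog round, carrying the predicted rank of each of the 100 candidate answers. A hedged sketch of serializing such predictions, where `predictions` is an assumed in-memory structure:

```python
import json

def write_evalai_submission(predictions, out_path="visdial_submission.json"):
    """Serialize ranked answer options into the assumed VisDial EvalAI
    schema: a JSON list of {image_id, round_id, ranks} records."""
    entries = [
        {
            "image_id": p["image_id"],   # COCO image id
            "round_id": p["round_id"],   # dialog round, 1-10
            "ranks": p["ranks"],         # rank (1-100) of each answer option
        }
        for p in predictions
    ]
    with open(out_path, "w") as f:
        json.dump(entries, f)
```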
- This code builds on jiasenlu/visDial.pytorch. We thank the developers for doing most of the heavy lifting.