
COFAR

Official implementation of COFAR: Commonsense and Factual Reasoning in Image Search (AACL-IJCNLP 2022 paper).

paper | project page

Requirements

To set up the environment:

conda env create -n kmmt --file kmmt.yml
conda activate kmmt

Data

Images: link.

Image format: Readme.

Training and testing data can be downloaded from the "Dataset Downloads" section on this page.

Both the oracle and wikified knowledge bases for all categories can also be downloaded from the same link above.

Feature extraction

Image Feature extraction: Script.

Create a train_obj_frcn_features/ folder inside the data/cofar_{category}/ folder for each category, and copy the extracted image features into it.
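
For example, for the brand category (a minimal sketch; the source path and file format of the extracted features depend on the feature extraction script and are assumptions here):

mkdir -p data/cofar_brand/train_obj_frcn_features/
cp /path/to/extracted_features/* data/cofar_brand/train_obj_frcn_features/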

Training

The MS-COCO pretraining checkpoint can be downloaded from here.

Place the downloaded kmmt_pretrain_checkpoint.pt in the working_checkpoints/ folder.
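
For example (assuming the checkpoint was downloaded to the current directory):

mkdir -p working_checkpoints
mv kmmt_pretrain_checkpoint.pt working_checkpoints/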

The respective config files are in the config/ folder and are loaded automatically.

MLM finetuning

python main.py --do_train --mode mlm

ITM finetuning

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node 2 --nnodes 1 --node_rank 0 main.py --do_train --mode itm
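
If only a single GPU is available, the same launcher can presumably be run with one process (an assumption; only the two-GPU command above is given by the authors):

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes 1 --node_rank 0 main.py --do_train --mode itm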

Evaluation

Download our COFAR finetuned checkpoint from here.

Copy the downloaded cofar_itm_final_checkpoint.pt to the working_checkpoints/ folder.

python cofar_eval.py --category brand

Other settings can be changed in config/test_config.yaml.
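
To evaluate several categories in one go, a simple shell loop works (category names after brand are placeholders; substitute the actual names from the dataset download):

for category in brand celebrity landmark; do  # names after "brand" are assumptions
    python cofar_eval.py --category "$category"
done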

License

This code and data are released under the MIT license.

Cite

If you find this data/code/paper useful for your research, please consider citing:

@inproceedings{cofar2022,
  author    = "Gatti, Prajwal and 
              Penamakuri, Abhirama Subramanyam and
              Teotia, Revant and
              Mishra, Anand and
              Sengupta, Shubhashis and
              Ramnani, Roshni",
  title     = "COFAR: Commonsense and Factual Reasoning in Image Search",
  booktitle = "AACL-IJCNLP",
  year      = "2022",
}