DivE

Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight)

Improving Cross-Modal Retrieval with Set of Diverse Embeddings

This repository contains the official source code for our paper:

Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim, Namyup Kim, and Suha Kwak
POSTECH CSE
CVPR (Highlight), Vancouver, 2023.

Acknowledgement

Parts of our code are adapted from the following repositories.

Dataset

data 
├─ coco_download.sh  
├─ coco # can be downloaded with coco_download.sh
│  ├─ images
│  │  └─ ......
│  └─ annotations 
│     └─ ......
├─ coco_butd
│  └─ precomp  
│     ├─ train_ids.txt
│     ├─ train_caps.txt
│     └─ ......   
├─ f30k 
│  ├─ images
│  │  └─ ......
│  ├─ dataset_flickr30k.json
│  └─ ......  
└─ f30k_butd
   └─ precomp  
      ├─ train_ids.txt
      ├─ train_caps.txt
      └─ ......

vocab # included in this repo
├─ coco_butd_vocab.pkl
└─ ......

Note: Downloaded datasets should be placed according to the directory structure presented above.
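
Before training, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the official code: the function name `find_missing` is hypothetical, it assumes `data` and `vocab` sit side by side under the directory you pass in, and it checks only the paths named explicitly in the tree (entries shown as `......` are skipped).

```python
from pathlib import Path

# Paths copied from the directory tree above; "......" entries are not checked.
# Assumption: `data` and `vocab` are both located directly under `root`.
EXPECTED_PATHS = [
    "data/coco/images",
    "data/coco/annotations",
    "data/coco_butd/precomp/train_ids.txt",
    "data/coco_butd/precomp/train_caps.txt",
    "data/f30k/images",
    "data/f30k/dataset_flickr30k.json",
    "data/f30k_butd/precomp/train_ids.txt",
    "data/f30k_butd/precomp/train_caps.txt",
    "vocab/coco_butd_vocab.pkl",
]


def find_missing(root="."):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing paths:")
        for p in missing:
            print("  " + p)
    else:
        print("All listed paths are present.")
```

If you only plan to use one dataset, missing paths for the other dataset can be ignored.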

Requirements

Install the requirements into a new conda environment, replacing <env> with an environment name of your choice.

conda create --name <env> --file requirements.txt

Training on COCO

sh train_eval_coco.sh