IMRAM

Code for our CVPR 2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval".

Requirements and Installation

We recommend the following dependencies.

import nltk
nltk.download("punkt")  # Punkt sentence tokenizer; same as entering "d punkt" in the interactive downloader

Data preparation

Download the dataset files. We use the splits produced by Andrej Karpathy. The raw images can be downloaded from their original sources here, here and here.

The precomputed image features are extracted from the raw images using the bottom-up attention model from here. The image features for the training, validation, and test sets should each be merged, in order, into a single .npy file (one file per split). More details about the image feature extraction can be found in SCAN (https://github.com/kuanghuei/SCAN).
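
As a rough illustration of the merge step, the sketch below concatenates per-chunk feature arrays for one split, in order, and saves them as a single .npy file. The helper name merge_features, the output file name, and the array shapes are assumptions made for this example; they are not taken from this repository, so check the SCAN instructions for the exact layout expected by the training scripts.

```python
import numpy as np


def merge_features(chunks, out_path):
    """Concatenate feature arrays along the image axis and save them to out_path.

    `chunks` is a list of arrays shaped (num_images, num_regions, feature_dim).
    This helper is hypothetical; it is not part of the IMRAM or SCAN code.
    """
    # All chunks must agree on the trailing (num_regions, feature_dim) shape.
    trailing = {chunk.shape[1:] for chunk in chunks}
    if len(trailing) != 1:
        raise ValueError(f"inconsistent feature shapes: {trailing}")
    # The order of `chunks` is preserved, so image i keeps its position.
    merged = np.concatenate(chunks, axis=0)
    np.save(out_path, merged)
    return merged
```

Calling merge_features([part_0, part_1], "train_ims.npy") would write one file for the training split; the same call would be repeated for the validation and test splits. The order of the chunks matters because the captions are matched to images by position.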

The data files are available from SCAN (we use the same dataset splits as they do):

wget https://scanproject.blob.core.windows.net/scan-data/data_no_feature.zip

Place data_no_feature.zip in the data directory.

Training and Evaluation

./script/tune_coco.sh