saem

Learning Fragment Self-Attention Embeddings for Image-Text Matching, in ACM MM 2019

Introduction

This is the source code for "Learning Fragment Self-Attention Embeddings for Image-Text Matching" (ACM MM 2019).

Requirements

  • Python 3.6
  • PyTorch 0.4.1

Download data

We use the precomputed image features provided by SCAN. Please download data.zip from the SCAN project and extract it.
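
Before training, it can help to confirm that the extracted data is where train.py will look for it. The snippet below is a minimal sketch, not part of this repository. It assumes the extracted archive contains one folder per dataset (for example f30k_precomp) holding train/dev/test files named `*_ims.npy` and `*_caps.txt`, as in the usual SCAN precomputed layout. The helper name `missing_precomp_files` is hypothetical. Adjust the file names if your copy of the data differs.

```python
from pathlib import Path

# Assumed file naming, based on the usual SCAN precomputed-feature layout.
SPLITS = ("train", "dev", "test")
SUFFIXES = ("_ims.npy", "_caps.txt")


def missing_precomp_files(data_path, data_name):
    """Return the expected files that are absent from data_path/data_name.

    Hypothetical helper: it only checks that files exist, not their contents.
    """
    root = Path(data_path) / data_name
    expected = [root / (split + suffix) for split in SPLITS for suffix in SUFFIXES]
    return [str(path) for path in expected if not path.is_file()]
```

An empty list from `missing_precomp_files("/path/to/data", "f30k_precomp")` means every expected file was found under that dataset folder.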

Bert model

We use the BERT code from BERT-pytorch. Please follow the instructions to convert the Google BERT model to a PyTorch save file.

Training

python train.py --data_path /path/to/data --data_name f30k_precomp --bert_path /path/to/uncased_L-12_H-768_A-12/
python train.py --data_path /path/to/data --data_name coco_precomp --bert_path /path/to/uncased_L-12_H-768_A-12/
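
To run both datasets back to back, the two commands above can be wrapped in a small driver script. This is a sketch under stated assumptions, not part of the repository. It uses only the three flags shown above, assumes it is launched from the directory containing train.py, and the names `build_train_command` and `train_all` are hypothetical.

```python
import subprocess
import sys

# Dataset names taken from the two training commands above.
DATASETS = ("f30k_precomp", "coco_precomp")


def build_train_command(data_path, data_name, bert_path):
    """Assemble the argument list for one train.py run."""
    return [
        sys.executable, "train.py",
        "--data_path", data_path,
        "--data_name", data_name,
        "--bert_path", bert_path,
    ]


def train_all(data_path, bert_path, runner=subprocess.run):
    """Run training for each dataset in turn, stopping at the first failure."""
    for data_name in DATASETS:
        # check=True makes subprocess.run raise if train.py exits with an error.
        runner(build_train_command(data_path, data_name, bert_path), check=True)
```

Calling `train_all("/path/to/data", "/path/to/uncased_L-12_H-768_A-12/")` would then train on Flickr30K first and MS-COCO second. The `runner` parameter exists only so the command construction can be checked without starting a real training job.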