This repo contains the source code and prepared datasets of the paper.
Follow the steps below to prepare the running environment.
- Clone the repo and enter it.
- Prepare folders for dataset and checkpoints.
$ mkdir dataset
$ mkdir checkpoints
- Create a conda virtual environment and install dependencies
$ conda create --name distill2 python=3.6
$ conda activate distill2
$ pip install -r requirements.txt
- Install pycocotools via https://github.com/cocodataset/cocoapi/tree/master/PythonAPI
- Download tokenizers for nltk
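The setup steps above can be sanity-checked with a short script. This is a minimal sketch, not part of the repo: the helper names (`ensure_dirs`, `missing_modules`, `download_tokenizer`) are hypothetical, and the choice of the `punkt` tokenizer is an assumption, since the steps do not say which NLTK resource the code needs.

```python
import importlib.util
from pathlib import Path

# Folder names taken from the mkdir steps above.
REQUIRED_DIRS = ("dataset", "checkpoints")
# Import names of the two packages the steps mention explicitly.
REQUIRED_MODULES = ("pycocotools", "nltk")


def ensure_dirs(root="."):
    """Create the expected folders under `root` if they are missing."""
    paths = []
    for name in REQUIRED_DIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def download_tokenizer(resource="punkt"):
    """Fetch an NLTK tokenizer model (needs network access).

    Assumption: "punkt" is the resource the code relies on.
    """
    import nltk  # imported lazily so the checks above work without it

    return nltk.download(resource)
```

Calling `ensure_dirs()` from the repo root and then printing `missing_modules()` shows which of the two packages still need to be installed; `download_tokenizer()` is left as a separate call because it requires network access.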
The method is tested on four datasets: Vgnome, BICIDR (this version is deprecated; please go to this repo to find the newest version), ChestXray, and COCO2014. COCO2014 can be organized following the official settings; the other three datasets are prepared by us. For convenience, download them through the given links and save them inside dataset/.
Text embedding using GloVe
We have created embeddings for the datasets (only COCO and Vgnome use them), which can be accessed from coco_glove and vgnome_glove.
- Download these two GloVe embedding pickle files and copy them to dataset/.
- (Optional) The following command creates the initial embedding matrix:
python data/create_initial_embedding.py
The script requires the vocabularies of the datasets (see the source code for more details), which we have provided in data/vocab_corpus. In addition, glove.6B.300d.txt needs to be placed under dataset/.
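For readers unfamiliar with how an initial embedding matrix is typically built from a GloVe text file and a vocabulary, the idea can be sketched as follows. This is an illustrative sketch under stated assumptions, not the logic of data/create_initial_embedding.py: the function names are hypothetical, the matrix is a plain list of rows, and the random initialization range for out-of-vocabulary words is an arbitrary choice.

```python
import random


def load_glove(lines, vocab):
    """Parse GloVe text lines ("word v1 v2 ...") for the words in `vocab`."""
    wanted = set(vocab)
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word = parts[0]
        if word in wanted:
            vectors[word] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, vectors, dim, seed=0):
    """Return one row per vocabulary word, in vocabulary order.

    Words without a pretrained vector get a small random vector
    (assumption: uniform in [-0.1, 0.1]).
    """
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        if word in vectors:
            matrix.append(list(vectors[word]))
        else:
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return matrix


# Usage sketch (paths as described above; dim=300 matches glove.6B.300d.txt):
# with open("dataset/glove.6B.300d.txt", encoding="utf-8") as f:
#     vectors = load_glove(f, vocab)
# matrix = build_embedding_matrix(vocab, vectors, dim=300)
```

Filtering by vocabulary while reading keeps memory use proportional to the vocabulary size rather than to the full GloVe file.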
The following instructions walk through the experiments on COCO. Launch scripts for the other datasets can be found in scripts/.
- Train TandemNet2
device=0 sh scripts/coco/train_tandemnet2.sh
- Train TandemNet
device=0 sh scripts/coco/train_tandemnet.sh
- (Optional) Train ResNet101
device=0 sh scripts/coco/train_cnn.sh
- Test TandemNet2
device=0 sh scripts/coco/test_tandemnet2.sh
- Test TandemNet
device=0 sh scripts/coco/test_tandemnet.sh
- (Optional) Test ResNet101
device=0 sh scripts/coco/test_cnn.sh
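When running several of these commands in sequence, a small Python wrapper can help. The sketch below is hypothetical and not part of the repo: it assumes the `scripts/<dataset>/<stage>_<model>.sh` naming seen in the COCO commands above (whether the other datasets follow the same pattern is an assumption) and passes `device` as an environment variable in the same way.

```python
import os
import subprocess


def script_path(dataset, stage, model):
    """Build a launch script path such as scripts/coco/train_tandemnet2.sh."""
    return "scripts/{}/{}_{}.sh".format(dataset, stage, model)


def run(dataset, stage, model, device=0):
    """Run one launch script with `device` set in its environment.

    Equivalent to: device=<device> sh scripts/<dataset>/<stage>_<model>.sh
    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    env = dict(os.environ, device=str(device))
    subprocess.run(["sh", script_path(dataset, stage, model)], env=env, check=True)


# Usage sketch, run from the repo root: train and then test TandemNet2 on COCO.
# run("coco", "train", "tandemnet2", device=0)
# run("coco", "test", "tandemnet2", device=0)
```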
Please consider citing our papers:
@article{zhang2019text,
title={Text-guided Neural Network Training for Image Recognition in Natural Scenes and Medicine},
author={Zhang, Zizhao and Chen, Pingjun and Shi, Xiaoshuang and Yang, Lin},
journal={IEEE transactions on pattern analysis and machine intelligence},
year={2019},
publisher={IEEE}
}
@inproceedings{Zhang2017TandemNet,
title={TandemNet: Distilling Knowledge from Medical Images Using Diagnostic Reports as Optional Semantic References},
author={Zhang, Zizhao and Chen, Pingjun and Sapkota, Manish and Yang, Lin},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)},
year={2017}
}