
This is the MindSpore implementation of an end-to-end cross-modal retrieval framework, the Multi-Task Consistent Preservation Adversarial Information Aggregation Network (CPAIA).

Primary language: Python

Multi-Task Consistent Preservation Adversarial Information Aggregation Network

Figure: CPAIA model

Introduction

CPAIA is an end-to-end framework with an image sub-network and a text sub-network. It works in three steps. First, feature extraction: image features are extracted with VGG and text features with BOW. Second, representation separation: a representation separation (RS) module splits each feature vector into modality-private and modality-shared components. Third, multi-task adversarial learning: a multi-task adversarial learning (MA) module generates a discriminative common subspace.
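The first two steps on the text side can be illustrated with a small sketch in plain Python. It is not the code in this repository: the toy vocabulary, the helper names (`bow_vector`, `make_projection`, `project`, `separate`), the output dimension of 3, and the use of two fixed random linear projections as a stand-in for the RS module are all assumptions made for illustration only.

```python
import random
from collections import Counter


def bow_vector(text, vocabulary):
    """Bag of words: count how often each vocabulary word occurs in `text`."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def make_projection(in_dim, out_dim, rng):
    """Hypothetical linear layer: an out_dim x in_dim matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def separate(features, shared_proj, private_proj):
    """Map one feature vector to a shared component and a private component."""
    return project(shared_proj, features), project(private_proj, features)


# Step 1 (text side): BOW features over a toy, made-up vocabulary.
vocabulary = ["cat", "dog", "sits", "mat"]
text_features = bow_vector("The cat sits on the mat", vocabulary)
print(text_features)  # [1.0, 0.0, 1.0, 1.0]

# Step 2: stand-in for the RS module, using two untrained projections.
rng = random.Random(0)
shared_proj = make_projection(len(vocabulary), 3, rng)
private_proj = make_projection(len(vocabulary), 3, rng)
shared, private = separate(text_features, shared_proj, private_proj)
print(len(shared), len(private))  # 3 3
```

In the real model the projections would be learned, and the shared components of both modalities would be passed to the MA module. The sketch only shows the shape of the data flow.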

Requirements

Install all required Python dependencies:

pip install -r requirements.txt

Dataset

Wikipedia:

website link

NUS-WIDE:

website link

XMedia:

website

link: Unavailable (the file is only available to staff on application; please contact the corresponding author of this article for assistance if needed)

Training

python main.py --dataset=xmedia --batchSize=64 --epoch=30 --device=GPU
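For readers who want to see how flags in this form are typically consumed, here is a minimal sketch that assumes `main.py` parses them with `argparse`. The flag names come from the command above. The parser itself, the `build_parser` helper, the types, and the defaults are assumptions and may differ from the actual script.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags of the training command."""
    parser = argparse.ArgumentParser(description="Train CPAIA (illustrative sketch)")
    parser.add_argument("--dataset", default="xmedia", help="dataset name")
    parser.add_argument("--batchSize", type=int, default=64, help="mini-batch size")
    parser.add_argument("--epoch", type=int, default=30, help="number of training epochs")
    parser.add_argument("--device", default="GPU", help="target device")
    return parser


# Parse the same arguments as the command line shown above.
args = build_parser().parse_args(
    ["--dataset=xmedia", "--batchSize=64", "--epoch=30", "--device=GPU"]
)
print(args.dataset, args.batchSize, args.epoch, args.device)  # xmedia 64 30 GPU
```

With `argparse`, `--batchSize=64` and `--batchSize 64` are equivalent, so either spelling would work under this assumption.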

Acknowledgement

This code is based on MindSpore.