SEMA: Semantic Distance Adversarial Learning for Text-to-Image Synthesis (TMM' 23)

Official PyTorch implementation of our paper "Semantic Distance Adversarial Learning for Text-to-Image Synthesis".


Requirements

  • python 3.8
  • PyTorch 1.9
  • transformers 4.8.1
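
To confirm that an environment matches the versions listed above, a small check along the following lines can help. This is a minimal sketch and not part of the repository: the helper names are hypothetical, the packages are looked up under their PyPI names `torch` and `transformers`, and the comparison is a crude prefix match on the version string.

```python
import sys
from importlib import metadata

# Versions taken from the Requirements list; looked up by PyPI package name.
EXPECTED = {"torch": "1.9", "transformers": "4.8.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package to (installed_version, matches_expected_prefix)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        # Prefix match only: "1.9" accepts "1.9.0" and local builds like "1.9.0+cu111".
        report[package] = (found, found is not None and found.startswith(wanted))
    return report


if __name__ == "__main__":
    print("python", "%d.%d" % sys.version_info[:2])
    for package, (found, ok) in check_environment().items():
        print(package, found or "not installed", "OK" if ok else "MISMATCH")
```

The check reads package metadata only, so it does not import PyTorch and runs quickly even on machines without a GPU.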

Installation

Clone this repo and set up the environment.

git clone https://github.com/yuanrr/SEMA
cd SEMA

conda create -n SEMA python=3.8
conda activate SEMA
pip install -r requirements.txt

Preparation

Datasets

  1. Download the preprocessed metadata for birds and coco and extract them to data/
  2. Download the birds image data and extract it to data/birds/
  3. Download the coco2014 dataset and extract the images to data/coco/images/
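
After extraction, the directory layout can be sanity-checked before running anything else. The snippet below is a minimal sketch under the assumption that the three paths named in the steps above are relative to the repository root. The helper `missing_dirs` is hypothetical and only tests that each directory exists and is non-empty; it does not validate the files inside.

```python
from pathlib import Path

# Directories named in the preparation steps; the checker itself is hypothetical.
EXPECTED_DIRS = ("data", "data/birds", "data/coco/images")


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that are absent or empty under root."""
    missing = []
    for rel in expected:
        path = Path(root) / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Dataset directories look complete.")
```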

Training

Code for training SEMA will be released soon. Please stay tuned.

Evaluation

Download Pretrained Model

Evaluate SEMA

We synthesize about 30k images from the test descriptions and evaluate the FID between synthesized images and test images of each dataset.

  1. Synthesize images with the given pretrained model:
python sampling.py
  2. Evaluate the FID score:
python test_fid.py
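
For reference, FID is the Fréchet distance between two Gaussians fitted to image features (conventionally Inception-v3 pooling features) of the real and synthesized sets: FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2)). The sketch below computes that quantity with NumPy from precomputed feature arrays. It illustrates the standard definition and is not the contents of test_fid.py; the function names are hypothetical and feature extraction is left out.

```python
import numpy as np


def feature_statistics(features):
    """Mean vector and covariance matrix of an (n_samples, dim) feature array."""
    features = np.asarray(features, dtype=np.float64)
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    mu1 = np.asarray(mu1, dtype=np.float64)
    mu2 = np.asarray(mu2, dtype=np.float64)
    sigma1 = np.atleast_2d(np.asarray(sigma1, dtype=np.float64))
    sigma2 = np.atleast_2d(np.asarray(sigma2, dtype=np.float64))
    diff = mu1 - mu2
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # positive semi-definite inputs. Clip tiny negatives caused by round-off.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 8))            # stand-in for real-image features
    fake = rng.normal(loc=0.1, size=(500, 8))   # stand-in for synthesized-image features
    print(frechet_distance(*feature_statistics(real), *feature_statistics(fake)))
```

Lower values mean the two feature distributions are closer, which is why the performance table marks FID with a down arrow.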

Performance

The released model (without BERT) achieves a lower FID than the corresponding version reported in the SEMA paper.

Model                             COCO-FID↓
SEMA w/o BERT (paper)             17.51
SEMA w/o BERT (released model)    ~16.5
SEMA (paper)                      16.31

The code is released for academic research use only. For questions, please contact Bowen Yuan.

Reference