Using GANs for generating images from text: Stacked GANs
Stacked Generative Adversarial Networks (StackGAN) synthesize photo-realistic images from text descriptions. The approach decomposes the difficult problem of generating high-resolution images into more manageable subproblems and significantly improves on standard GANs.
Read the paper for architecture information: https://arxiv.org/pdf/1612.03242v2.pdf
tl;dr: The problem is split across two GANs: a Stage-I GAN sketches a rough low-resolution image from the text description, and a Stage-II GAN refines it into a high-resolution, photo-realistic image. (Each stage has its own generator and discriminator network.)
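As a rough sketch of the two-stage idea (not the repository's actual model code; the function bodies here are dummy stand-ins, and the 1024-d embedding size is an assumption based on the char-CNN-RNN encoder):

```python
import numpy as np

def stage1_generator(text_embedding, z):
    """Stage-I (hypothetical stand-in): sketch a rough low-resolution
    64x64 image from the text embedding plus noise z. The real model is
    a conv net conditioned on the augmented text embedding."""
    rng = np.random.RandomState(0)
    return rng.randn(64, 64, 3)

def stage2_generator(low_res_image, text_embedding):
    """Stage-II (hypothetical stand-in): refine the 64x64 sketch into a
    256x256 image, re-reading the text to correct defects and add detail.
    A naive 4x nearest-neighbour upsample stands in for the refiner net."""
    return np.repeat(np.repeat(low_res_image, 4, axis=0), 4, axis=1)

text_embedding = np.zeros(1024)  # assumed char-CNN-RNN embedding size
z = np.zeros(100)                # noise vector
low = stage1_generator(text_embedding, z)
high = stage2_generator(low, text_embedding)
print(low.shape, high.shape)     # (64, 64, 3) (256, 256, 3)
```

The key point is the decomposition: Stage-II only has to map 64x64 to 256x256 given the text, which is far easier than generating 256x256 images from scratch.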
Dependencies
Python 2.7
[Optional] Torch is needed if you use the pre-trained char-CNN-RNN text encoder.
[Optional] skip-thought is needed if you use the skip-thought text encoder.
`pip install` the following packages:
prettytensor
progressbar
python-dateutil
easydict
pandas
torchfile
Usage
Data
- Download our preprocessed char-CNN-RNN text embeddings for birds and flowers and save them to `Data/`.
- [Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
- Download the birds and flowers image data. Extract them to `Data/birds/` and `Data/flowers/`, respectively.
- Preprocess images.
  - For birds: `python misc/preprocess_birds.py`
  - For flowers: `python misc/preprocess_flowers.py`
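Conceptually, the preprocessing scripts crop and rescale each image to the resolutions the two stages need. A simplified sketch of that idea (the helper functions below are illustrative, not the repository's actual code, which uses proper image-resizing routines):

```python
import numpy as np

def center_crop(img, size):
    """Crop the central size x size region (assumes img is >= size x size)."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def nearest_resize(img, size):
    """Naive nearest-neighbour resize, standing in for real interpolation."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

img = np.zeros((304, 400, 3))                    # a dummy input photo
crop = center_crop(img, 304)
lr = nearest_resize(crop, 64)    # Stage-I resolution
hr = nearest_resize(crop, 256)   # Stage-II resolution
print(lr.shape, hr.shape)        # (64, 64, 3) (256, 256, 3)
```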
Training
- The steps below train a StackGAN model on the CUB dataset using our preprocessed data for birds.
- Step 1: train Stage-I GAN (e.g., for 600 epochs)
python stageI/run_exp.py --cfg stageI/cfg/birds.yml --gpu 0
- Step 2: train Stage-II GAN (e.g., for another 600 epochs)
python stageII/run_exp.py --cfg stageII/cfg/birds.yml --gpu 1
- Change `birds.yml` to `flowers.yml` to train a StackGAN model on the Oxford-102 dataset using our preprocessed data for flowers. The `*.yml` files are example configuration files for training/testing our models.
- If you want to try your own datasets, here are some good tips about how to train a GAN. We also encourage you to try different hyper-parameters and architectures, especially for more complex datasets.
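For orientation, such a config file is a small YAML of hyper-parameters. A hypothetical excerpt (field names illustrative only; check `stageI/cfg/birds.yml` for the real keys):

```yaml
# illustrative sketch, not the actual shipped config
DATASET_NAME: birds
GPU_ID: 0
TRAIN:
  BATCH_SIZE: 64
  MAX_EPOCH: 600
```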
Pretrained Model
- StackGAN for birds trained from char-CNN-RNN text embeddings. Download and save it to `models/`.
- StackGAN for flowers trained from char-CNN-RNN text embeddings. Download and save it to `models/`.
- StackGAN for birds trained from skip-thought text embeddings. Download and save it to `models/` (we just used the same settings as for char-CNN-RNN; better results can likely be achieved by tuning the hyper-parameters).
Run Demos
- Run `sh demo/flowers_demo.sh` to generate flower samples from sentences. The results will be saved to `Data/flowers/example_captions/`. (You need to download the char-CNN-RNN text encoder for flowers to `models/text_encoder/`. Note: this text encoder is provided by reedscot/icml2016.)
- Run `sh demo/birds_demo.sh` to generate bird samples from sentences. The results will be saved to `Data/birds/example_captions/`. (You need to download the char-CNN-RNN text encoder for birds to `models/text_encoder/`. Note: this text encoder is provided by reedscot/icml2016.)
- Run `python demo/birds_skip_thought_demo.py --cfg demo/cfg/birds-skip-thought-demo.yml --gpu 2` to generate bird samples from sentences. The results will be saved to `Data/birds/example_captions-skip-thought/`. (You need to download the vocabulary for skip-thought vectors to `Data/skipthoughts/`.)
Examples for birds (char-CNN-RNN embeddings), more on youtube:
Examples for flowers (char-CNN-RNN embeddings), more on youtube:
Save your favorite pictures generated by the models: the randomness from the noise z and the conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same description 😃
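The variety comes from the fact that each sample draws a fresh noise vector z alongside the fixed sentence embedding. A minimal sketch of that mechanism (dimensions assumed: 1024-d embedding, 100-d noise; not the repository's actual sampling code):

```python
import numpy as np

rng = np.random.RandomState(42)
text_embedding = rng.rand(1024)  # one fixed sentence embedding (assumed 1024-d)

# Each generated image pairs the SAME caption embedding with a DIFFERENT
# noise vector z, so the same description yields varied poses/viewpoints.
latents = [np.concatenate([text_embedding, rng.randn(100)]) for _ in range(3)]

print(len(latents), latents[0].shape)  # 3 samples, each 1024 + 100 = 1124 dims
```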
Credits
The credits for this code go to hanzhanggit and Siraj Raval.