sText2Image

Sketch-Guided Text-to-Image Generation

Primary language: Python. License: MIT.

Sketch-Guided Text-to-Image Generation (in progress)

Introduction

Our goal is to generate photo-realistic images from given texts and freehand sketches, where the texts provide the content and the sketches control the shapes. Freehand sketches can be highly abstract (examples shown below), and learning representations of sketches is not trivial. Other cross-domain learning approaches, such as pix2pix and CycleGAN, learn a mapping from representations in one domain to those in another. In contrast, we propose to learn a joint representation of text, sketch, and image.

Example freehand sketches (images omitted): face, bird, shoe

* A few freehand sketches were collected from volunteers.

Contributors:

  • Major contributors: Shangzhe Wu (HKUST), Yongyi Lu (HKUST)
  • Supervisors: Yu-wing Tai (Tencent), Chi-Keung Tang (HKUST)
  • Mentor in MLJejuCamp2017: Hyungjoo Cho

MLJejuCamp2017

Part of this project was developed at Machine Learning Camp Jeju 2017. More interesting projects can be found in the camp's project descriptions and on the program's GitHub.

Get Started

Prerequisites

Setup

  • Clone this repo:
git clone https://github.com/elliottwu/sText2Image.git
cd sText2Image
  • Download preprocessed CelebA data (~3GB):
sh ./datasets/download_dataset.sh

Train

sh train.sh
  • To monitor training with TensorBoard, run the following in your terminal and open localhost:8888 in your browser:
tensorboard --logdir=logs_face --port=8888

Test

sh test.sh

Pretrained Model

  • Download pretrained model:
sh download_pretrained_model.sh
  • Test pretrained model on CelebA dataset:
python test.py ./datasets/celeba/test/* --checkpointDir checkpoints_face_pretrained --maskType right --batchSize 64 --lam1 100 --lam2 1 --lam3 0.1 --lr 0.001 --nIter 1000 --outDir results_face_pretrained --text_vector_dim 18 --text_path datasets/celeba/imAttrs.pkl
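The `--maskType right` flag above suggests that the right part of each test input is masked out and then filled in by the generator, in the style of dcgan-completion. The following is only a minimal sketch of such a mask, not code from this repo. It assumes that known pixels are marked 1 and pixels to be generated are marked 0, and that the split falls at the horizontal midpoint; neither is confirmed by this README. `right_mask` and `apply_mask` are hypothetical helper names.

```python
def right_mask(height, width):
    """Build a height x width mask (assumed convention: 1.0 = known, 0.0 = to generate)."""
    half = width // 2  # assumption: the masked region starts at the midpoint
    return [[1.0 if x < half else 0.0 for x in range(width)]
            for _ in range(height)]


def apply_mask(image, mask):
    """Zero out the masked pixels of a single-channel image given as nested lists."""
    return [[pixel * m for pixel, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


if __name__ == "__main__":
    toy_image = [[5, 6, 7, 8],
                 [1, 2, 3, 4]]
    mask = right_mask(2, 4)
    # The left half is kept, the right half is zeroed.
    print(apply_mask(toy_image, mask))
```

Under this assumption, the kept left half would act as the constraint and the zeroed right half is what the model has to produce.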

Experiments

We test our framework on three kinds of data: face (CelebA), bird (CUB), and flower (Oxford-102). So far, we have only experimented with face images, using attribute vectors as the text information. Here are some preliminary results:

1. Face

We used the CelebA dataset, which provides 40 attributes for each image. Like text descriptions, the attributes control specific details of the generated images. We chose 18 attributes for training.
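To illustrate how an attribute list could be turned into the 18-dimensional text vector (compare `--text_vector_dim 18` in the test command), here is a minimal sketch. The attribute names are the ones that appear in the result tables of this README; whether they are exactly the 18 used for training, their order, and the 0/1 encoding are all assumptions. `encode_attributes` is a hypothetical helper, not a function from this repo, and the actual format of `imAttrs.pkl` may differ.

```python
# Assumption: names collected from the result tables in this README;
# the real training attributes and their order may differ.
ATTRIBUTES = [
    "Male", "5_o_Clock_Shadow", "Big_Lips", "Big_Nose", "Chubby",
    "Double_Chin", "Eyeglasses", "Goatee", "Heavy_Makeup", "High_Cheekbones",
    "Mouth_Open", "Mustache", "Narrow_Eyes", "Pale_Skin", "Pointy_Nose",
    "Rosy_Cheeks", "Smiling", "Wearing_Lipstick",
]
INDEX = {name: i for i, name in enumerate(ATTRIBUTES)}

# Assumption: these labels only state that an attribute is absent.
NEGATIVE_LABELS = {"Female", "No_Eyeglasses"}


def encode_attributes(text):
    """Turn 'Male, Big_Nose, Mustache' into a 0/1 list of length 18."""
    vector = [0.0] * len(ATTRIBUTES)
    for token in (part.strip() for part in text.split(",")):
        if not token or token in NEGATIVE_LABELS:
            continue  # absent attributes simply stay at 0
        vector[INDEX[token]] = 1.0  # raises KeyError on unknown names
    return vector


if __name__ == "__main__":
    print(encode_attributes("Male, Big_Nose, Mustache"))
```

A binary encoding is the simplest choice; a signed encoding (for example +1/-1) would be an equally plausible alternative.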

a). Attributes match sketch:

The following images were generated from sketches and their corresponding attributes.

Mustache

| attributes | sketch / generated / gt | attributes | sketch / generated / gt |
|---|---|---|---|
| Male, 5_o_Clock_Shadow, Mouth_Open, Pointy_Nose | (images) | Male, 5_o_Clock_Shadow, Big_Nose, Mustache | (images) |
| Male, Big_Lips, Big_Nose, Chubby, Goatee, High_Cheekbones, Smiling | (images) | Male, Mustache | (images) |
| Male, Goatee, Mouth_Open, Smiling | (images) | Male, Big_Nose, Goatee, Smiling | (images) |
| Male, 5_o_Clock_Shadow, Big_Lips, Big_Nose, Goatee, High_Cheekbones, Mouth_Open, Rosy_Cheeks, Smiling | (images) | Male, 5_o_Clock_Shadow, Big_Nose, Narrow_Eyes | (images) |

Eyeglasses

| attributes | sketch / generated / gt | attributes | sketch / generated / gt |
|---|---|---|---|
| Male, Big_Nose, Eyeglasses, Goatee | (images) | Female, Eyeglasses | (images) |
| Female, Eyeglasses, High_Cheekbones, Mouth_Open, Smiling | (images) | Male, 5_o_Clock_Shadow, Big_Nose, Eyeglasses, Mouth_Open, Smiling | (images) |
| Male, Big_Nose, Double_Chin, Eyeglasses, Mouth_Open, Pointy_Nose, Smiling | (images) | Male, Eyeglasses, High_Cheekbones, Mouth_Open, Smiling | (images) |
| Male, 5_o_Clock_Shadow, Eyeglasses, Mouth_Open, Smiling | (images) | Male, Big_Lips, Big_Nose, Eyeglasses, Goatee, Mouth_Open | (images) |

Lipstick

| attributes | sketch / generated / gt | attributes | sketch / generated / gt |
|---|---|---|---|
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Pointy_Nose, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, Mouth_Open, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Pointy_Nose, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, Pointy_Nose, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Pointy_Nose, Smiling, Wearing_Lipstick | (images) | Female, Big_Lips, Big_Nose, Heavy_Makeup, High_Cheekbones, Mouth_Open, Rosy_Cheeks, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, Pointy_Nose, Wearing_Lipstick | (images) | Female, Heavy_Makeup, Mouth_Open, Smiling, Wearing_Lipstick | (images) |

b). Attributes mismatch sketch:

The following images were generated from sketches and random attributes. The controlling effect of the attributes is still being improved.

| attributes | sketch / generated | attributes | sketch / generated | attributes | sketch / generated |
|---|---|---|---|---|---|
| Female, Big_Lips, Heavy_Makeup, Wearing_Lipstick | (images) | Female, Big_Lips, Heavy_Makeup, Wearing_Lipstick | (images) | Male, Big_Nose, No_Eyeglasses | (images) |
| Male, Big_Nose, Chubby, Double_Chin, High_Cheekbones, Smiling | (images) | Male, Big_Nose, Chubby, Double_Chin, High_Cheekbones, Mouth_Open, Smiling | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick, No_Eyeglasses | (images) | Male | (images) | Female, Heavy_Makeup, Pale_Skin, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Pointy_Nose, Smiling, Wearing_Lipstick, No_Eyeglasses | (images) | Female, Heavy_Makeup, High_Cheekbones, Pointy_Nose, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Pointy_Nose, Rosy_Cheeks, Smiling, Wearing_Lipstick | (images) |

c). Freehand sketch:

The following images were generated from freehand sketches and random attributes. The controlling effect of the attributes is still being improved.

| attributes | sketch / generated | attributes | sketch / generated | attributes | sketch / generated |
|---|---|---|---|---|---|
| Female, Big_Lips, Heavy_Makeup, Wearing_Lipstick | (images) | Male, Big_Nose | (images) | Male, Big_Nose, Chubby, Double_Chin, High_Cheekbones, Mouth_Open, Smiling | (images) |
| Male, Big_Nose, Chubby, Double_Chin, High_Cheekbones, Mouth_Open, Smiling | (images) | Male, Big_Nose, Chubby, Double_Chin, High_Cheekbones, Mouth_Open, Smiling | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) | Female, Big_Lips, Heavy_Makeup, High_Cheekbones, Mouth_Open, Narrow_Eyes, Smiling, Wearing_Lipstick | (images) | Female, Big_Lips, Heavy_Makeup, High_Cheekbones, Mouth_Open, Narrow_Eyes, Smiling, Wearing_Lipstick | (images) |
| Female, Heavy_Makeup, High_Cheekbones, Pointy_Nose, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) | Female, Heavy_Makeup, High_Cheekbones, Mouth_Open, Smiling, Wearing_Lipstick | (images) |

Acknowledgement

The code is based on DCGAN and dcgan-completion.