[ECCV'22] Language-Driven Artistic Style Transfer

A PyTorch implementation of LDAST

Overview

LDAST is an implementation of
"Language-Driven Artistic Style Transfer"
Tsu-Jui Fu, Xin Eric Wang, and William Yang Wang
in European Conference on Computer Vision (ECCV) 2022

The language visual artist (LVA) extracts content structures from the content image C and visual patterns from the style instruction X to perform LDAST. LVA adopts a patch-wise style discriminator D to connect the extracted visual semantics to patches of the paired style image (P_S). Contrastive reasoning (CR) compares the contrastive pairs C₁-X₁, C₂-X₁, and C₂-X₂ of content image and style instruction.
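
To make the pairing used by contrastive reasoning concrete, the sketch below builds the three pairs from two (content image, style instruction) samples and reports what each neighbouring pair has in common. It only illustrates how the pairs relate to each other; it does not implement the model or its losses, and the names `Pair`, `contrastive_pairs`, and `shared_factor` are hypothetical and not taken from this repository.

```python
from typing import List, NamedTuple


class Pair(NamedTuple):
    content: str      # identifier of a content image C
    instruction: str  # style instruction X


def contrastive_pairs(c1: str, x1: str, c2: str, x2: str) -> List[Pair]:
    """Build the three pairs C1-X1, C2-X1, C2-X2 from two samples."""
    return [Pair(c1, x1), Pair(c2, x1), Pair(c2, x2)]


def shared_factor(a: Pair, b: Pair) -> str:
    """Name what two pairs have in common: content, instruction, both, or none."""
    same_content = a.content == b.content
    same_instruction = a.instruction == b.instruction
    if same_content and same_instruction:
        return "both"
    if same_content:
        return "content"
    if same_instruction:
        return "instruction"
    return "none"


pairs = contrastive_pairs("C1", "X1", "C2", "X2")
# C1-X1 and C2-X1 share the instruction; C2-X1 and C2-X2 share the content image.
relations = [shared_factor(pairs[0], pairs[1]), shared_factor(pairs[1], pairs[2])]
```

Read this way, C₂-X₁ is the bridge: it differs from C₁-X₁ only in content and from C₂-X₂ only in instruction.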

Requirements

This code is implemented with Python 3.8, PyTorch 1.7, and Torchvision 0.8. It additionally requires:

tqdm
CLIP
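
A quick way to see which of these packages are present, and at which versions, is to query the installed distributions. The following is a minimal sketch using only the standard library; the distribution names in `REQUIRED` are assumptions (in particular, `clip` is the name we expect when CLIP is installed from its repository), so adjust them to match your environment.

```python
from importlib import metadata
from typing import Dict, List, Optional

# Assumed distribution names; "clip" in particular may differ on your system.
REQUIRED: List[str] = ["torch", "torchvision", "tqdm", "clip"]


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(dists: List[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (None if missing)."""
    return {dist: installed_version(dist) for dist in dists}


if __name__ == "__main__":
    for name, version in check_requirements().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

The script only reports what it finds; it does not compare against the versions listed above, since nearby versions may also work.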

Usage

Dataset

The dataset includes content images and visual attribute instructions (DTD).
For emotional effect instructions, please visit WikiArt and ArtEmis.

Train

Put sanet.pt in ./_ckpt and dtd.pkl in ./_data.

python train_lva.py
python train_ctr.py

Inference & GUI

Put clva_dtd.pt in ./_ckpt.

python inference.py
python gui.py
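
Since both training and inference expect specific files in ./_ckpt and ./_data, a small pre-flight check can catch a misplaced file before a script starts. This is a sketch under the assumption that it is run from the repository root; the helper `missing_files` and the `REQUIRED_FILES` table are hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import Dict, List

# Files named in the Train and Inference steps, relative to the repository root.
REQUIRED_FILES: Dict[str, List[str]] = {
    "train": ["_ckpt/sanet.pt", "_data/dtd.pkl"],
    "inference": ["_ckpt/clva_dtd.pt"],
}


def missing_files(stage: str, root: str = ".") -> List[str]:
    """Return the required files for a stage that are not present under root."""
    base = Path(root)
    return [rel for rel in REQUIRED_FILES[stage] if not (base / rel).is_file()]


if __name__ == "__main__":
    for stage in REQUIRED_FILES:
        missing = missing_files(stage)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{stage}: {status}")
```

An empty list from `missing_files("train")` means both sanet.pt and dtd.pkl were found; the check says nothing about whether the files themselves are valid.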

Citation

@inproceedings{fu2022ldast, 
  author = {Tsu-Jui Fu and Xin Eric Wang and William Yang Wang}, 
  title = {{Language-Driven Artistic Style Transfer}}, 
  booktitle = {European Conference on Computer Vision (ECCV)}, 
  year = {2022} 
}

Acknowledgement

This code is based on SANet.

tsujuifu/pytorch_ldast