Awesome Language-Guided Image Editing

A list of papers and other resources on language-guided image editing.

Datasets
Single-Turn Editing
Multi-Turn Editing

Datasets

Unpaired images

Oxford-102: a flower dataset with 102 categories, chosen to be flowers commonly occurring in the United Kingdom. Each class consists of between 40 and 258 images.

Paired images

CUB Bird: an image dataset with 6,033 photos of 200 bird species (mostly North American).
COCO: natural scenes.
Multi-Modal-CelebA-HQ: a large-scale face image dataset with 30,000 high-resolution face images selected from the CelebA dataset by following CelebA-HQ. Each image has a high-quality segmentation mask, a sketch, descriptive text, and a version with a transparent background.
GIER
CoDraw
i-CLEVR

Single-Turn Editing

Semantic Image Synthesis via Adversarial Learning. ICCV2017. paper code
Hao Dong, Simiao Yu, Chao Wu, and Yike Guo. (Imperial College London)
Language-Based Image Editing with Recurrent Attentive Models. CVPR2018. paper code
Jianbo Chen, Yelong Shen, Jianfeng Gao, Jingjing Liu, and Xiaodong Liu. (UCB, MSR)
Learning to Globally Edit Images with Textual Description. ArXiv2018. paper code
Hai Wang, Jason D. Williams, and Sing Bing Kang. (TTIC, Apple, MSR)
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language. NeurIPS2018. paper code
Seonghyeon Nam, Yunji Kim, and Seon Joo Kim. (Yonsei University)
Bilinear Representation for Language-Based Image Editing Using Conditional Generative Adversarial Networks. ICASSP2019. paper code
Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Tao Xiong, Yuan He, and Hui Xue. (Alibaba)
ManiGAN: Text-Guided Image Manipulation. CVPR2020. paper code
Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, and Philip H. S. Torr. (Oxford, UHK)
A Benchmark and Baseline for Language-Driven Image Editing. ACCV2020. paper code
Jing Shi, Ning Xu, Trung Bui, Franck Dernoncourt, Zheng Wen, and Chenliang Xu. (University of Rochester, Adobe)
Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation. NeurIPS2020. paper code
Bowen Li, Xiaojuan Qi, Philip H. S. Torr, and Thomas Lukasiewicz. (Oxford, UHK)
TediGAN: Text-Guided Diverse Image Generation and Manipulation. ArXiv2020. paper code Multi-Modal-CelebA-HQ
Weihao Xia, Yujiu Yang, Jing-Hao Xue, and Baoyuan Wu. (Tsinghua, UCL, CUHKSZ)

Multi-Turn Editing

A Multimodal Dialogue System for Conversational Image Editing. NeurIPSW2018. paper
Tzu-Hsiang Lin, Trung Bui, Doo Soon Kim, and Jean Oh. (CMU, Adobe)
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction. ICCV2019. paper code
Alaaeldin El-Nouby, Shikhar Sharma, Hannes Schulz, R Devon Hjelm, Layla El Asri, Samira Ebrahimi Kahou, Yoshua Bengio, and Graham Taylor. (University of Guelph, MSR, Vector Institute, University of Montreal)
Sequential Attention GAN for Interactive Image Editing. MM2020. paper
Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, and Jianfeng Gao. (MSR, Duke University)
SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning. EMNLP2020. paper code
Tsu-Jui Fu, Xin Eric Wang, Scott Grafton, Miguel Eckstein, and William Yang Wang. (UCSB, UCSC)
Text as Neural Operator: Image Manipulation by Text Instruction. CVPRW2020. paper
Tianhao Zhang, Hung-Yu Tseng, Lu Jiang, Honglak Lee, Irfan Essa, and Weilong Yang. (Google)

Contributor

Jing Shi
Any recommendations to add to the list are welcome :)
Feel free to make pull requests!

jshi31/awesome-language-guided-image-editing
