Awesome Language-Guided Image Editing

A list of papers and other resources on language-guided image editing.

Contents

Datasets

Unpaired images

  • Oxford-102: a flower dataset with 102 categories, chosen from flowers commonly occurring in the United Kingdom. Each class contains between 40 and 258 images.

Paired images

  • CUB Bird: an image dataset with 6,033 photos of 200 bird species (mostly North American).
  • COCO: natural scenes.
  • Multi-Modal-CelebA-HQ: a large-scale face image dataset of 30,000 high-resolution face images selected from the CelebA dataset by following CelebA-HQ. Each image comes with a high-quality segmentation mask, a sketch, descriptive text, and a version with a transparent background.
  • GIER
  • CoDraw
  • i-CLEVR

Single-Turn Editing

  • Semantic Image Synthesis via Adversarial Learning. ICCV2017. paper code
    Hao Dong, Simiao Yu, Chao Wu, and Yike Guo. (Imperial College London)
  • Language-Based Image Editing with Recurrent Attentive Models. CVPR2018. paper code
    Jianbo Chen, Yelong Shen, Jianfeng Gao, Jingjing Liu, and Xiaodong Liu. (UCB, MSR)
  • Learning to Globally Edit Images with Textual Description. ArXiv2018. paper code
    Hai Wang, Jason D. Williams, and Sing Bing Kang. (TTIC, Apple, MSR)
  • Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language. NeurIPS2018. paper code
    Seonghyeon Nam, Yunji Kim, and Seon Joo Kim. (Yonsei University)
  • Bilinear Representation for Language-Based Image Editing Using Conditional Generative Adversarial Networks. ICASSP2019. paper code
    Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Tao Xiong, Yuan He, and Hui Xue. (Alibaba)
  • ManiGAN: Text-Guided Image Manipulation. CVPR2020. paper code
    Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, and Philip H. S. Torr. (Oxford, UHK)
  • A Benchmark and Baseline for Language-Driven Image Editing. ACCV2020. paper code
    Jing Shi, Ning Xu, Trung Bui, Franck Dernoncourt, Zheng Wen, and Chenliang Xu. (University of Rochester, Adobe)
  • Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation. NeurIPS2020. paper code
    Bowen Li, Xiaojuan Qi, Philip H. S. Torr, and Thomas Lukasiewicz. (Oxford, UHK)
  • TediGAN: Text-Guided Diverse Image Generation and Manipulation. ArXiv2020. paper code Multi-Modal-CelebA-HQ
    Weihao Xia, Yujiu Yang, Jing-Hao Xue, and Baoyuan Wu. (Tsinghua, UCL, CUHKSZ)

Multi-Turn Editing

  • A Multimodal Dialogue System for Conversational Image Editing. NeurIPSW2018. paper
    Tzu-Hsiang Lin, Trung Bui, Doo Soon Kim, and Jean Oh. (CMU, Adobe)

  • Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction. ICCV2019. paper code
    Alaaeldin El-Nouby, Shikhar Sharma, Hannes Schulz, R Devon Hjelm, Layla El Asri, Samira Ebrahimi Kahou, Yoshua Bengio, and Graham Taylor. (University of Guelph, MSR, Vector Institute, University of Montreal)

  • Sequential Attention GAN for Interactive Image Editing. MM2020. paper
    Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, and Jianfeng Gao. (MSR, Duke University)

  • SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning. EMNLP2020. paper code
    Tsu-Jui Fu, Xin Eric Wang, Scott Grafton, Miguel Eckstein, and William Yang Wang. (UCSB, UCSC)

  • Text as Neural Operator: Image Manipulation by Text Instruction. CVPRW2020. paper
    Tianhao Zhang, Hung-Yu Tseng, Lu Jiang, Honglak Lee, Irfan Essa, and Weilong Yang. (Google)

Contributor

Jing Shi
Any recommendations to add to the list are welcome :)
Feel free to make pull requests!