InstDiffEdit

This repository contains the implementation of the AAAI 2024 paper:

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks [Paper] [AAAI]
Siyu Zou¹, Jiji Tang², Yiyi Zhou¹, Jing He¹, Chaoyi Zhao², Rongsheng Zhang², Zhipeng Hu², Xiaoshuai Sun¹
¹Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
²Fuxi AI Lab, NetEase Inc., Hangzhou

Model Architecture

Code Path

Code Structures

The code is organized into four parts.

model: Contains the implementation files for InstDiffEdit, DiffEdit and SDEdit.
dataset_txt: Contains the data splits for the Imagen, ImageNet and Editing-Mask datasets.
dataset: Contains the images and masks of the Editing-Mask dataset.
.sh: The inference scripts for InstDiffEdit.

Dependencies

Python 3.8
PyTorch == 1.13.1
Transformers == 4.25.1
diffusers == 0.8.0
NumPy
All experiments are performed with one A30 GPU.
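
The pinned versions above can be checked against the active environment before running the scripts. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, and it assumes the packages are installed under their usual PyPI distribution names (`torch`, `transformers`, `diffusers`).

```python
from importlib import metadata

# Versions pinned in the dependency list above.
# Keys are assumed PyPI distribution names (PyTorch is published as "torch").
PINNED = {
    "torch": "1.13.1",
    "transformers": "4.25.1",
    "diffusers": "0.8.0",
}


def check_pins(pinned=PINNED, get_version=metadata.version):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        # A local build tag such as "1.13.1+cu117" still counts as a match.
        matches = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, matches)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

Running the file directly prints one line per package. A mismatch does not necessarily mean the code will fail, only that the environment differs from the one listed here.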

Datasets

We use the following datasets.

ImageNet: We follow the evaluation protocol of FlexIT (https://github.com/facebookresearch/semanticimagetranslation). We obtained 1092 test images; the edits change the image category.
Imagen: We use 360 images with structured text prompts generated by Imagen (https://imagen.research.google/).
Editing-Mask: 200 images, provided in the dataset directory.

Eval & Sample

Run a quick sample:

bash sample_begin.sh

Run on Imagen, ImageNet, or Editing-Mask:

bash run.sh

Note:

DiffEdit and SDEdit can also be run through the .sh files by changing some parameters.
Open the relevant .sh file to modify the parameters.
