An unofficial implementation of DiffEdit, based on 🤗 Hugging Face, this repo, and PyTorch. The method leverages the diffusion process to automatically extract a mask from an image, given a pair of prompts; the mask is then used to inpaint the image with the new content. To get a clearer overview of the process, take a look at the DiffEdit.ipynb notebook.
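At a high level, the mask is obtained by noising the image and comparing the noise estimates the denoiser produces under the "remove" prompt versus the "add" prompt: the regions where the two estimates disagree most are the ones to edit. Below is a minimal sketch of that idea; `predict_noise` is a hypothetical stand-in for the prompt-conditioned UNet call, and the noising step is simplified (the actual pipeline goes through the scheduler):

```python
import torch

def extract_mask(latents, remove_emb, add_emb, predict_noise,
                 num_samples=10, t=500, threshold=0.5):
    """Sketch of DiffEdit-style mask extraction.

    `predict_noise(noisy_latents, t, text_emb)` is a hypothetical stand-in
    for the prompt-conditioned UNet; see the repo for the real pipeline.
    """
    diffs = []
    for _ in range(num_samples):
        noise = torch.randn_like(latents)
        noisy = latents + noise  # simplified; the scheduler handles this properly
        eps_remove = predict_noise(noisy, t, remove_emb)
        eps_add = predict_noise(noisy, t, add_emb)
        # Disagreement between the two noise estimates, averaged over channels
        diffs.append((eps_remove - eps_add).abs().mean(dim=1))
    diff = torch.stack(diffs).mean(dim=0)  # average over the noise samples
    diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)  # normalize to [0, 1]
    return (diff > threshold).float()  # binary mask in latent space
```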
The wheel for this repo is available here:
| Prompt (remove ⟶ add) | Original image | Mask | Edited |
|---|---|---|---|
| "lion" ⟶ "dog" | | | |
| "house" ⟶ "3-floor hotel" | | | |
| "an F1 race" ⟶ "a motogp race" | | | |
All of the masks above were generated with `--num-samples 10`.
You can install the package in different ways, depending on your needs.
Optional step (recommended)
Create a virtual environment to avoid conflicts with other packages. Here are some alternatives:
- with `venv`:
  ```bash
  python -m venv venv
  source venv/bin/activate
  ```
- with `poetry`:
  ```bash
  poetry shell
  ```
- with `conda`:
  ```bash
  conda create -n diff-edit python=3.10
  conda activate diff-edit
  ```
Install the package from PyPI:
```bash
pip install diff-edit
```
Alternative ways
Install the package from source:
```bash
poetry install
```
Install the package in editable mode, recommended if you plan to develop it further:
```bash
pip install -e .
```
For a quick evaluation, use the image_edit.py script:
```bash
python image_edit.py --remove-prompt <remove_prompt> --add-prompt <add_prompt> --image <path_to_image> --save-path <path_to_output_image>
```
An example of usage is the following (resulting in this image):
```bash
python image_edit.py --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10
```
Alternatively, you can use the Command Line Interface (CLI) entry point:
```bash
diff-edit --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10
```
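If you prefer to drive the edit from Python (e.g. for batch processing), a minimal sketch using only the standard library is shown below; it assumes the package is installed in the active environment so that `diff-edit` is on the PATH, and it reuses the exact flags from the command above:

```python
import subprocess

# Run the same "lion" -> "dog" edit as the CLI example above.
subprocess.run(
    [
        "diff-edit",
        "--remove-prompt", "lion",
        "--add-prompt", "dog",
        "--image-link", "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg",
        "--num-samples", "10",
    ],
    check=True,  # raise CalledProcessError if the edit fails
)
```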
The script has the following options:
```text
python image_edit.py --help
usage: image_edit.py [-h] [--remove-prompt REMOVE_PROMPT] [--add-prompt ADD_PROMPT] [--image IMAGE] [--image-link IMAGE_LINK] [--device {cpu,cuda,mps}]
                     [--vae-model VAE_MODEL] [--tokenizer TOKENIZER] [--text-encoder TEXT_ENCODER] [--unet UNET] [--scheduler SCHEDULER]
                     [--scheduler-start SCHEDULER_START] [--scheduler-end SCHEDULER_END] [--num-train-timesteps NUM_TRAIN_TIMESTEPS] [--beta-schedule BETA_SCHEDULE]
                     [--inpainting INPAINTING] [--seed SEED] [--num-samples N] [--save-path SAVE_PATH]

Diffusion Image Editing arguments

options:
  -h, --help            show this help message and exit
  --remove-prompt REMOVE_PROMPT
                        What you want to remove from the image
  --add-prompt ADD_PROMPT
                        What you want to add to the image
  --image IMAGE         Path to the image to edit
  --image-link IMAGE_LINK
                        Link to the image to edit
  --device {cpu,cuda,mps}
  --vae-model VAE_MODEL
                        Model name. E.g. stabilityai/sd-vae-ft-ema
  --tokenizer TOKENIZER
                        Tokenizer to tokenize the text. E.g. openai/clip-vit-large-patch14
  --text-encoder TEXT_ENCODER
                        Text encoder to encode the text. E.g. openai/clip-vit-large-patch14
  --unet UNET           UNet model for generating the latents. E.g. CompVis/stable-diffusion-v1-4
  --scheduler SCHEDULER
                        Noise scheduler. E.g. LMSDiscreteScheduler
  --scheduler-start SCHEDULER_START
                        Scheduler start value
  --scheduler-end SCHEDULER_END
                        Scheduler end value
  --num-train-timesteps NUM_TRAIN_TIMESTEPS
                        Number of training timesteps
  --beta-schedule BETA_SCHEDULE
                        Beta schedule
  --inpainting INPAINTING
                        Inpainting model. E.g. runwayml/stable-diffusion-inpainting
  --seed SEED           Random seed
  --num-samples N       Number of diffusion steps to generate the mask
  --save-path SAVE_PATH
                        Path to save the result. Default is <script_folder>/result.png
```
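As a fuller example, a run that overrides the default models could look like the sketch below; the model identifiers are simply the examples listed in the help text, and the local image path assumes you have cloned the repo:

```bash
python image_edit.py \
    --remove-prompt "lion" \
    --add-prompt "dog" \
    --image ./static/ai_gen_lion.jpeg \
    --vae-model stabilityai/sd-vae-ft-ema \
    --tokenizer openai/clip-vit-large-patch14 \
    --text-encoder openai/clip-vit-large-patch14 \
    --unet CompVis/stable-diffusion-v1-4 \
    --inpainting runwayml/stable-diffusion-inpainting \
    --num-samples 10 \
    --seed 42 \
    --save-path ./result.png
```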