An unofficial implementation of DiffEdit, based on 🤗 Hugging Face, this repo, and PyTorch. The method leverages the diffusion process to automatically extract a mask from an image, given a pair of prompts; the mask is then used to inpaint the image with the new content. To get a clearer overview of the process, take a look at the DiffEdit.ipynb notebook.
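At a high level, the mask is obtained by noising the image and comparing the noise estimates the denoiser produces under the "remove" prompt versus the "add" prompt: the regions where the two estimates disagree most are the ones to edit. Below is a minimal sketch of that idea; `predict_noise` is a hypothetical stand-in for the prompt-conditioned UNet call, and the noising step is simplified (the actual pipeline goes through the scheduler):

```python
import torch

def extract_mask(latents, remove_emb, add_emb, predict_noise,
                 num_samples=10, t=500, threshold=0.5):
    """Sketch of DiffEdit-style mask extraction.

    `predict_noise(noisy_latents, t, text_emb)` is a hypothetical stand-in
    for the prompt-conditioned UNet; see the repo for the real pipeline.
    """
    diffs = []
    for _ in range(num_samples):
        noise = torch.randn_like(latents)
        noisy = latents + noise  # simplified; the scheduler handles this properly
        eps_remove = predict_noise(noisy, t, remove_emb)
        eps_add = predict_noise(noisy, t, add_emb)
        # Disagreement between the two noise estimates, averaged over channels
        diffs.append((eps_remove - eps_add).abs().mean(dim=1))
    diff = torch.stack(diffs).mean(dim=0)  # average over the noise samples
    diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)  # normalize to [0, 1]
    return (diff > threshold).float()  # binary mask in latent space
```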
The wheel for this repo is available here:
| Prompt (remove ⟶ add) | Original image | Mask | Edited |
|---|---|---|---|
| "lion" ⟶ "dog" | | | |
| "house" ⟶ "3-floor hotel" | | | |
| "an F1 race" ⟶ "a motogp race" | | | |
All of the masks above were generated with `--num-samples 10`.
You can install the package in different ways, depending on your needs.
Optional step (recommended)
Create a virtual environment to avoid conflicts with other packages. Here are some alternatives:
- with `venv`:
  ```bash
  python -m venv venv
  source venv/bin/activate
  ```
- with `poetry`:
  ```bash
  poetry shell
  ```
- with `conda`:
  ```bash
  conda create -n diff-edit python=3.10
  conda activate diff-edit
  ```
Install the package from PyPI:
```bash
pip install diff-edit
```
Alternative ways
Install the package from source:
```bash
poetry install
```
Install the package in editable mode, recommended if you plan to develop it further:
```bash
pip install -e .
```
For a quick evaluation, use the image_edit.py script:
```bash
python image_edit.py --remove-prompt <remove_prompt> --add-prompt <add_prompt> --image <path_to_image> --save-path <path_to_output_image>
```
An example of usage is the following (resulting in this image):
```bash
python image_edit.py --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10
```
Alternatively, you can use the Command Line Interface (CLI) entry point:
```bash
diff-edit --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10
```
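If you prefer to drive the edit from Python (e.g. for batch processing), a minimal sketch using only the standard library is shown below; it assumes the package is installed in the active environment so that `diff-edit` is on the PATH, and it reuses the exact flags from the command above:

```python
import subprocess

# Run the same "lion" -> "dog" edit as the CLI example above.
subprocess.run(
    [
        "diff-edit",
        "--remove-prompt", "lion",
        "--add-prompt", "dog",
        "--image-link", "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg",
        "--num-samples", "10",
    ],
    check=True,  # raise CalledProcessError if the edit fails
)
```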
The script has the following options:
```text
python image_edit.py --help
usage: image_edit.py [-h] [--remove-prompt REMOVE_PROMPT] [--add-prompt ADD_PROMPT] [--image IMAGE] [--image-link IMAGE_LINK] [--device {cpu,cuda,mps}]
                     [--vae-model VAE_MODEL] [--tokenizer TOKENIZER] [--text-encoder TEXT_ENCODER] [--unet UNET] [--scheduler SCHEDULER]
                     [--scheduler-start SCHEDULER_START] [--scheduler-end SCHEDULER_END] [--num-train-timesteps NUM_TRAIN_TIMESTEPS] [--beta-schedule BETA_SCHEDULE]
                     [--inpainting INPAINTING] [--seed SEED] [--num-samples N] [--save-path SAVE_PATH]

Diffusion Image Editing arguments

options:
  -h, --help            show this help message and exit
  --remove-prompt REMOVE_PROMPT
                        What you want to remove from the image
  --add-prompt ADD_PROMPT
                        What you want to add to the image
  --image IMAGE         Path to the image to edit
  --image-link IMAGE_LINK
                        Link to the image to edit
  --device {cpu,cuda,mps}
  --vae-model VAE_MODEL
                        Model name. E.g. stabilityai/sd-vae-ft-ema
  --tokenizer TOKENIZER
                        Tokenizer to tokenize the text. E.g. openai/clip-vit-large-patch14
  --text-encoder TEXT_ENCODER
                        Text encoder to encode the text. E.g. openai/clip-vit-large-patch14
  --unet UNET           UNet model for generating the latents. E.g. CompVis/stable-diffusion-v1-4
  --scheduler SCHEDULER
                        Noise scheduler. E.g. LMSDiscreteScheduler
  --scheduler-start SCHEDULER_START
                        Scheduler start value
  --scheduler-end SCHEDULER_END
                        Scheduler end value
  --num-train-timesteps NUM_TRAIN_TIMESTEPS
                        Number of training timesteps
  --beta-schedule BETA_SCHEDULE
                        Beta schedule
  --inpainting INPAINTING
                        Inpainting model. E.g. runwayml/stable-diffusion-inpainting
  --seed SEED           Random seed
  --num-samples N       Number of diffusion steps to generate the mask
  --save-path SAVE_PATH
                        Path to save the result. Default is <script_folder>/result.png
```
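As a fuller example, a run that overrides the default models could look like the sketch below; the model identifiers are simply the examples listed in the help text, and the local image path assumes you have cloned the repo:

```bash
python image_edit.py \
    --remove-prompt "lion" \
    --add-prompt "dog" \
    --image ./static/ai_gen_lion.jpeg \
    --vae-model stabilityai/sd-vae-ft-ema \
    --tokenizer openai/clip-vit-large-patch14 \
    --text-encoder openai/clip-vit-large-patch14 \
    --unet CompVis/stable-diffusion-v1-4 \
    --inpainting runwayml/stable-diffusion-inpainting \
    --num-samples 10 \
    --seed 42 \
    --save-path ./result.png
```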