TL;DR: 8-step inversion followed by 8-step editing works effectively with the FLUX-dev model, giving a 3x speedup with results comparable or even superior to baseline methods.
- We are working with logtd to include this algorithm in ComfyUI-Fluxtapoz soon. Please refer to the discussion for more details.
- Online demo is available at HuggingFace Space.
- Local demo can be found in [4.3.1] Using GUI.
Inspired by recent ReFlow based editing approaches, we propose a novel numerical solver for ReFlow models, achieving second-order precision at the computational cost of a first-order method, providing a scalable and efficient solution for tasks such as image reconstruction and semantic editing.
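To make the "second-order precision at first-order cost" idea concrete, here is a minimal toy sketch in Python. It is an illustration under stated assumptions, not the FireFlow implementation: it assumes a midpoint-style update in which the velocity used to reach the midpoint is reused from the previous step's midpoint evaluation, so each step costs one new velocity evaluation. The function names and the test ODE dx/dt = -x are hypothetical; the real solver operates on FLUX latents with the model as the velocity field.

```python
import math


def euler(velocity, x0, t0, t1, num_steps):
    """First-order baseline: one velocity evaluation per step."""
    h = (t1 - t0) / num_steps
    x, t = x0, t0
    for _ in range(num_steps):
        x = x + h * velocity(x, t)
        t += h
    return x


def midpoint_with_reuse(velocity, x0, t0, t1, num_steps):
    """Midpoint-style solver that reuses the previous midpoint velocity.

    Assumption for illustration: the velocity needed to reach the midpoint
    is approximated by the velocity evaluated at the previous midpoint.
    Returns the final state and the number of velocity evaluations.
    """
    h = (t1 - t0) / num_steps
    x, t = x0, t0
    v_hat = velocity(x, t)  # the only extra evaluation, done once at the start
    nfe = 1
    for _ in range(num_steps):
        x_mid = x + 0.5 * h * v_hat           # cheap half step with the cached velocity
        v_mid = velocity(x_mid, t + 0.5 * h)  # the single new evaluation of this step
        nfe += 1
        x = x + h * v_mid                     # full step with the midpoint velocity
        t += h
        v_hat = v_mid                         # cache it for the next step
    return x, nfe


def decay(x, t):
    """Toy velocity field for dx/dt = -x (exact solution: exp(-t))."""
    return -x


exact = math.exp(-1.0)
err_euler = abs(euler(decay, 1.0, 0.0, 1.0, 8) - exact)
x_reuse, nfe = midpoint_with_reuse(decay, 1.0, 0.0, 1.0, 8)
err_reuse = abs(x_reuse - exact)
print(f"Euler error: {err_euler:.4f} (8 evaluations)")
print(f"Reuse error: {err_reuse:.4f} ({nfe} evaluations)")
```

With 8 steps, the reuse variant needs 9 velocity evaluations in total (one to initialise the cache plus one per step), compared with 16 for a standard midpoint method.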
The code environment is consistent with FLUX and the pioneering RF-Solver-Edit. You can refer to the official FLUX repository for details or use the following command to set up the environment.
conda create --name Fireflow python=3.10
conda activate Fireflow
pip install -e ".[all]"
We have provided a script to reproduce the results presented in the paper. Additional comparison results can be found in inversion_reconstruction.sh.
python edit.py --source_prompt "A young boy is playing with a toy airplane on the grassy front lawn of a suburban house, with a blue sky and fluffy clouds above." \
--target_prompt "A young boy is playing with a toy airplane on the grassy front lawn of a suburban house, with a blue sky and fluffy clouds above." \
--guidance 1 \
--source_img_dir 'examples/source/boy.jpg' \
--num_steps 10 \
--offload \
--inject 0 \
--sampling_strategy 'fireflow' \
--output_prefix 'fireflow' \
--output_dir 'examples/edit-result/dog'
We have also provided several scripts to reproduce the results presented in the paper, focusing on three main types of editing: stylization, addition, and replacement.
Ref Style | ||||
Editing Scripts | Lincoln | Ironman | Ironman (More Expressive) | Sir Charles Chaplin |
Edited image | ||||
Editing Scripts | Obama | Ultraman | Ultraman (More Expressive) | Draco Malfoy |
Edited image | ||||
Editing Scripts | Trump | Marilyn Monroe | Marilyn Monroe (More Expressive) | Einstein |
Edited image | ||||
Editing Scripts | Biden | Batman | Batman (More Expressive) | Harry Potter |
Edited image |
Source image | |||
Editing Scripts | + hiking stick | horse -> camel | + dog |
Edited image | |||
Editing Scripts | woman -> man | + boy | - plane |
Edited image |
Use the following script to launch Gradio for fast editing:
pip install gradio
CUDA_VISIBLE_DEVICES=0 python gradio_demo.py # If OOM, add '--offload'
Use the following script to perform fast editing:
cd src
python edit.py --source_prompt [describe the content of your image or leave it as null] \
--target_prompt [describe your editing requirements] \
--guidance 2 \
--source_img_dir [the path of your source image] \
--num_steps 8 \
--inject 1 \
--start_layer_index 0 \
--end_layer_index 37 \
--name 'flux-dev' \
--sampling_strategy 'fireflow' \
--output_prefix 'fireflow' \
--offload \
--output_dir [output path]
Tip
If the above command fails to follow the editing instruction (for example, it cannot change a color), the following settings may help.
- Add more steps / increase guidance (--guidance 2 -> 3, --num_steps 8 -> 15)
python edit.py --source_prompt [describe the content of your image or leave it as null] \
--target_prompt [describe your editing requirements] \
--guidance 3 \
--source_img_dir [the path of your source image] \
--num_steps 15 \
--inject 1 \
--start_layer_index 0 \
--end_layer_index 37 \
--name 'flux-dev' \
--sampling_strategy 'fireflow' \
--output_prefix 'fireflow' \
--offload \
--output_dir [output path]
- Use other editing strategies, at the cost of losing the original structure (--reuse_v 1 -> 0 to disable the default editing strategy; --editing_strategy 'replace_v' -> 'add_q' / 'add_k' / 'add_v')
python edit.py --source_prompt [describe the content of your image or leave it as null] \
--target_prompt [describe your editing requirements] \
--guidance 2 \
--source_img_dir [the path of your source image] \
--num_steps 8 \
--inject 1 \
--start_layer_index 0 \
--end_layer_index 37 \
--name 'flux-dev' \
--sampling_strategy 'fireflow' \
--output_prefix 'fireflow' \
--reuse_v 0 \
--editing_strategy 'add_q' \
--offload \
--output_dir [output path]
Tip
If the above command fails to preserve the original image, the following settings may help.
- Add more steps / more injected steps (--num_steps 8 -> 15, --inject 1 -> 2)
python edit.py --source_prompt [describe the content of your image or leave it as null] \
--target_prompt [describe your editing requirements] \
--guidance 2 \
--source_img_dir [the path of your source image] \
--num_steps 15 \
--inject 2 \
--start_layer_index 0 \
--end_layer_index 37 \
--name 'flux-dev' \
--sampling_strategy 'fireflow' \
--output_prefix 'fireflow' \
--offload \
--output_dir [output path]
- Use other editing strategies, at the cost of losing some control (--reuse_v 1 -> 0 to disable the default editing strategy; --editing_strategy 'replace_v' -> 'add_q replace_v' / 'add_k replace_v' / 'add_q add_k replace_v')
python edit.py --source_prompt [describe the content of your image or leave it as null] \
--target_prompt [describe your editing requirements] \
--guidance 2 \
--source_img_dir [the path of your source image] \
--num_steps 8 \
--inject 1 \
--start_layer_index 0 \
--end_layer_index 37 \
--name 'flux-dev' \
--sampling_strategy 'fireflow' \
--output_prefix 'fireflow' \
--reuse_v 0 \
--editing_strategy 'add_q replace_v' \
--offload \
--output_dir [output path]
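When trying several of these settings in a row, it can be convenient to assemble the command from a script rather than editing it by hand. The sketch below is one possible way to do that and is not part of the repository: the helper `build_edit_command`, the `PRESETS` table, and the preset names are hypothetical, and the target prompt is a placeholder. Only the flags and values are taken from the commands above.

```python
import shlex

# Baseline flags copied from the fast-editing command.
BASE_ARGS = {
    "guidance": 2,
    "num_steps": 8,
    "inject": 1,
    "start_layer_index": 0,
    "end_layer_index": 37,
    "name": "flux-dev",
    "sampling_strategy": "fireflow",
    "output_prefix": "fireflow",
}

# Overrides taken from the two tips; the preset names are made up here.
PRESETS = {
    "stronger_edit": {"guidance": 3, "num_steps": 15},
    "preserve_source": {"num_steps": 15, "inject": 2},
}


def build_edit_command(source_prompt, target_prompt, source_img, output_dir,
                       preset=None, offload=True):
    """Return the edit.py invocation as an argument list (hypothetical helper)."""
    args = dict(BASE_ARGS)
    if preset is not None:
        args.update(PRESETS[preset])  # apply the tip's overrides on top of the baseline
    cmd = [
        "python", "edit.py",
        "--source_prompt", source_prompt,
        "--target_prompt", target_prompt,
        "--source_img_dir", source_img,
        "--output_dir", output_dir,
    ]
    for flag, value in args.items():
        cmd += [f"--{flag}", str(value)]
    if offload:
        cmd.append("--offload")
    return cmd


cmd = build_edit_command(
    source_prompt="A young boy is playing with a toy airplane",
    target_prompt="PLACEHOLDER: describe your editing requirements",
    source_img="examples/source/boy.jpg",
    output_dir="examples/edit-result/demo",  # placeholder output path
    preset="stronger_edit",
)
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

The returned list can also be passed to `subprocess.run(cmd, cwd="src", check=True)`, assuming `edit.py` lives in `src` as the `cd src` step above suggests.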
We sincerely thank RF-Solver and FLUX for their well-structured codebases. The support and contributions of the open-source community have been invaluable, and without their efforts, completing our work so efficiently would not have been possible.
Furthermore, I would like to extend my sincere thanks to the owner of RF-Solver's Repo for their prompt and helpful responses to all my questions regarding the code and the ideas presented in their paper. Their support has been invaluable and has greatly assisted me in my work.
@misc{deng2024fireflowfastinversionrectified,
title={FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing},
author={Yingying Deng and Xiangyu He and Changwang Mei and Peisong Wang and Fan Tang},
year={2024},
eprint={2412.07517},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.07517},
}