🔥FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing

TL;DR: An 8-step inversion and 8-step editing process works effectively with the FLUX-dev model. (3x speedup with results that are comparable or even superior to baseline methods)

[0] ✉️ News

[1] 👀 Preview

Inspired by recent ReFlow-based editing approaches, we propose a novel numerical solver for ReFlow models that achieves second-order precision at the computational cost of a first-order method. It provides a scalable and efficient solution for tasks such as image reconstruction and semantic editing.
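
To make the "second-order precision at first-order cost" claim concrete, the sketch below compares three fixed-step solvers on a toy scalar ODE. It assumes the saving comes from reusing the velocity evaluated at the previous step's midpoint in place of a fresh evaluation at the start of the current step. That assumption, the toy velocity field, and all function names are illustrative and are not taken from this repository's code.

```python
import math


def euler(f, x, t0, t1, n):
    """First-order Euler: one velocity evaluation per step."""
    h = (t1 - t0) / n
    for i in range(n):
        x = x + h * f(x, t0 + i * h)
    return x


def midpoint(f, x, t0, t1, n):
    """Standard midpoint rule: second order, two evaluations per step."""
    h = (t1 - t0) / n
    for i in range(n):
        t = t0 + i * h
        x_mid = x + 0.5 * h * f(x, t)
        x = x + h * f(x_mid, t + 0.5 * h)
    return x


def midpoint_reuse(f, x, t0, t1, n):
    """Midpoint rule that reuses the previous midpoint velocity (assumed scheme).

    Only the first step evaluates f at the step start, so the total cost
    is n + 1 evaluations instead of 2 * n.
    """
    h = (t1 - t0) / n
    v_prev = f(x, t0)  # one extra evaluation to bootstrap the first step
    for i in range(n):
        t = t0 + i * h
        x_mid = x + 0.5 * h * v_prev    # cached velocity stands in for f(x, t)
        v_prev = f(x_mid, t + 0.5 * h)  # the only new evaluation in this step
        x = x + h * v_prev
    return x


class CountingField:
    """Toy velocity field dx/dt = x that counts how often it is evaluated."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t):
        self.calls += 1
        return x


def compare(n=8):
    """Return {solver name: (absolute error at t=1, evaluations used)}."""
    exact = math.e  # exact solution of dx/dt = x, x(0) = 1, at t = 1
    results = {}
    for solver in (euler, midpoint, midpoint_reuse):
        field = CountingField()
        approx = solver(field, 1.0, 0.0, 1.0, n)
        results[solver.__name__] = (abs(approx - exact), field.calls)
    return results


if __name__ == "__main__":
    for name, (err, calls) in compare().items():
        print(f"{name:15s} error={err:.5f} evaluations={calls}")
```

On this toy problem the reuse variant lands much closer to the midpoint rule's error than to Euler's, while needing only one more evaluation than Euler in total.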

[2] How to install?

The code environment is consistent with FLUX and the pioneering RF-Solver-Edit. You can refer to the official FLUX repository for details or use the following commands to set up the environment.

conda create --name Fireflow python=3.10
conda activate Fireflow
pip install -e ".[all]"

[3] Demo Scripts: Inversion and Reconstruction

We have provided a script to reproduce the results presented in the paper. Additional comparison results can be found in inversion_reconstruction.sh.

python edit.py  --source_prompt "A young boy is playing with a toy airplane on the grassy front lawn of a suburban house, with a blue sky and fluffy clouds above." \
                --target_prompt "A young boy is playing with a toy airplane on the grassy front lawn of a suburban house, with a blue sky and fluffy clouds above." \
                --guidance 1 \
                --source_img_dir 'examples/source/boy.jpg' \
                --num_steps 10 \
                --offload \
                --inject 0 \
                --sampling_strategy 'fireflow' \
                --output_prefix 'fireflow' \
                --output_dir 'examples/edit-result/dog' 
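
Since the source and target prompts are identical and `--inject` is 0, this command appears to be a pure inversion-and-reconstruction run, so its output can be compared directly against the source image. The sketch below computes PSNR, one common fidelity measure, over flat sequences of 8-bit pixel values. Loading the two images into such sequences (for example with Pillow) is left out, and the function name `psnr` is hypothetical rather than part of this repository.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstruction):
        raise ValueError("images must have the same number of pixel values")
    if len(reference) == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the reconstruction is closer to the source; identical inputs give infinity.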

[4] Demo Scripts: Semantic Image Editing

We have also provided several scripts to reproduce the results presented in the paper, focusing on three main types of editing: stylization, addition, and replacement.

[4.1] Stylization

Ref Style
Editing Scripts: Lincoln | Ironman | Ironman (More Expressive) | Sir Charles Chaplin
Edited image
Editing Scripts: Obama | Ultraman | Ultraman (More Expressive) | Draco Malfoy
Edited image
Editing Scripts: Trump | Marilyn Monroe | Marilyn Monroe (More Expressive) | Einstein
Edited image
Editing Scripts: Biden | Batman | Batman (More Expressive) | Harry Potter
Edited image

[4.2] Adding & Replacing & Removing

Source image
Editing Scripts: + hiking stick | horse -> camel | + dog
Edited image
Editing Scripts: woman -> man | + boy | - plane
Edited image

[4.3] 🖌️ Edit your own image

[4.3.1] 🎨 Using GUI

Use the following script to launch Gradio for fast editing:

pip install gradio
CUDA_VISIBLE_DEVICES=0 python gradio_demo.py # If OOM, add '--offload'

[4.3.2] 🖥️ Using CLI

Use the following script to perform fast editing:

cd src
python edit.py  --source_prompt [describe the content of your image or leave it as null] \
                --target_prompt [describe your editing requirements] \
                --guidance 2 \
                --source_img_dir [the path of your source image] \
                --num_steps 8  \
                --inject 1 \
                --start_layer_index 0 \
                --end_layer_index 37 \
                --name 'flux-dev' \
                --sampling_strategy 'fireflow' \
                --output_prefix 'fireflow' \
                --offload \
                --output_dir [output path] 
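
When scripting many edits, it can be easier to assemble the same command as an argument list in Python, which sidesteps shell quoting problems in long prompts. The sketch below uses only the flags shown in the command above. The helper name `build_edit_command`, its parameter names, and the choice of an empty string for a "null" source prompt are assumptions made for illustration.

```python
import shlex


def build_edit_command(source_img, target_prompt, output_dir,
                       source_prompt="", guidance=2, num_steps=8,
                       inject=1, offload=True):
    """Build the argument list for edit.py from the flags shown above."""
    cmd = [
        "python", "edit.py",
        "--source_prompt", source_prompt,  # empty string stands in for "null" (assumption)
        "--target_prompt", target_prompt,
        "--guidance", str(guidance),
        "--source_img_dir", source_img,
        "--num_steps", str(num_steps),
        "--inject", str(inject),
        "--start_layer_index", "0",
        "--end_layer_index", "37",
        "--name", "flux-dev",
        "--sampling_strategy", "fireflow",
        "--output_prefix", "fireflow",
        "--output_dir", output_dir,
    ]
    if offload:
        cmd.append("--offload")
    return cmd


if __name__ == "__main__":
    cmd = build_edit_command(
        "examples/source/boy.jpg",
        "A young boy is playing with a kite.",  # illustrative prompt
        "examples/edit-result/boy",             # illustrative output path
    )
    # Print a copy-pasteable shell command; to execute it instead, pass the
    # list to subprocess.run(cmd, check=True, cwd="src").
    print(shlex.join(cmd))
```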

Tip

If the command above fails to follow the editing instruction (for example, it cannot change a color), the following settings may help.

  • Add More Steps / Increase Guidance
# Changes from the command above: --guidance 2 -> 3, --num_steps 8 -> 15
python edit.py  --source_prompt [describe the content of your image or leave it as null] \
                --target_prompt [describe your editing requirements] \
                --guidance 3 \
                --source_img_dir [the path of your source image] \
                --num_steps 15 \
                --inject 1 \
                --start_layer_index 0 \
                --end_layer_index 37 \
                --name 'flux-dev' \
                --sampling_strategy 'fireflow' \
                --output_prefix 'fireflow' \
                --offload \
                --output_dir [output path]
  • Using Other Editing Strategies (at the cost of losing the original structure)
# --reuse_v 1 -> 0 disables the default editing strategy.
# --editing_strategy 'replace_v' -> 'add_q' (or 'add_k' / 'add_v').
python edit.py  --source_prompt [describe the content of your image or leave it as null] \
                --target_prompt [describe your editing requirements] \
                --guidance 2 \
                --source_img_dir [the path of your source image] \
                --num_steps 8 \
                --inject 1 \
                --start_layer_index 0 \
                --end_layer_index 37 \
                --name 'flux-dev' \
                --sampling_strategy 'fireflow' \
                --output_prefix 'fireflow' \
                --reuse_v 0 \
                --editing_strategy 'add_q' \
                --offload \
                --output_dir [output path]

Tip

If the command above fails to preserve the original image, the following settings may help.

  • Add More Steps / More Injected Steps
# Changes from the command above: --num_steps 8 -> 15, --inject 1 -> 2
python edit.py  --source_prompt [describe the content of your image or leave it as null] \
                --target_prompt [describe your editing requirements] \
                --guidance 2 \
                --source_img_dir [the path of your source image] \
                --num_steps 15 \
                --inject 2 \
                --start_layer_index 0 \
                --end_layer_index 37 \
                --name 'flux-dev' \
                --sampling_strategy 'fireflow' \
                --output_prefix 'fireflow' \
                --offload \
                --output_dir [output path]
  • Using Other Editing Strategies (at the cost of losing some control over the edit)
# --reuse_v 1 -> 0 disables the default editing strategy.
# --editing_strategy 'replace_v' -> 'add_q replace_v' (or 'add_k replace_v' / 'add_q add_k replace_v').
python edit.py  --source_prompt [describe the content of your image or leave it as null] \
                --target_prompt [describe your editing requirements] \
                --guidance 2 \
                --source_img_dir [the path of your source image] \
                --num_steps 8 \
                --inject 1 \
                --start_layer_index 0 \
                --end_layer_index 37 \
                --name 'flux-dev' \
                --sampling_strategy 'fireflow' \
                --output_prefix 'fireflow' \
                --reuse_v 0 \
                --editing_strategy 'add_q replace_v' \
                --offload \
                --output_dir [output path]
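
The four adjustments from the two tips can be summarized as overrides on the default CLI settings. The sketch below collects them in one table and turns a chosen preset into command-line flags. The preset names and helper functions are hypothetical; the flag names and values are the ones shown in the commands above, with `--reuse_v 1` and `--editing_strategy 'replace_v'` taken as the defaults.

```python
# Defaults as used in the basic CLI command.
DEFAULTS = {
    "guidance": 2,
    "num_steps": 8,
    "inject": 1,
    "reuse_v": 1,
    "editing_strategy": "replace_v",
}

# Overrides per symptom; the preset names are illustrative only.
PRESETS = {
    # The edit does not follow the target prompt.
    "stronger_edit_more_steps": {"guidance": 3, "num_steps": 15},
    "stronger_edit_other_strategy": {"reuse_v": 0, "editing_strategy": "add_q"},
    # The edit does not preserve the original image.
    "better_preservation_more_steps": {"num_steps": 15, "inject": 2},
    "better_preservation_other_strategy": {
        "reuse_v": 0,
        "editing_strategy": "add_q replace_v",
    },
}


def tuned_settings(preset):
    """Return the default settings with one preset's overrides applied."""
    return {**DEFAULTS, **PRESETS[preset]}


def to_flags(settings):
    """Flatten a settings dict into ['--name', 'value', ...] for edit.py."""
    flags = []
    for name, value in settings.items():
        flags += [f"--{name}", str(value)]
    return flags


if __name__ == "__main__":
    print(to_flags(tuned_settings("better_preservation_more_steps")))
```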

[5] 💖 Acknowledgements

We sincerely thank RF-Solver and FLUX for their well-structured codebases. The support and contributions of the open-source community have been invaluable, and without their efforts, completing our work so efficiently would not have been possible.

Furthermore, I would like to extend my sincere thanks to the owner of RF-Solver's Repo for their prompt and helpful responses to all my questions regarding the code and the ideas presented in their paper. Their support has been invaluable and has greatly assisted me in my work.

[6] 🤝🏼 Cite Us

@misc{deng2024fireflowfastinversionrectified,
      title={FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing}, 
      author={Yingying Deng and Xiangyu He and Changwang Mei and Peisong Wang and Fan Tang},
      year={2024},
      eprint={2412.07517},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.07517}, 
}