/AnyDoor

Implementações oficiais para papel: Anydoor: personalização de imagem em nível de objeto com disparo zero

Primary LanguagePythonMIT LicenseMIT

AnyDoor: Zero-shot Object-level Image Customization

Xi Chen · Lianghua Huang · Yu Liu · Yujun Shen · Deli Zhao · Hengshuang Zhao

Paper PDF Project Page
The University of Hong Kong   |   Alibaba Group |   Ant Group

News

  • [2023.12.17] Release train & inference & demo code, and pretrained checkpoint.
  • [Soon] Release the new version paper.
  • [Soon] Support online demo.
  • [On-going] Scale-up the training data and release stronger models as the foundaition model for downstream region-to-region generation tasks.
  • [On-going] Release specific-designed models for downstream tasks like virtual tryon, face swap, text and logo transfer, etc.

Installation

Install with conda:

conda env create -f environment.yaml
conda activate anydoor

or pip:

pip install -r requirements.txt

Additionally, for training, you need to install panopticapi, pycocotools, and lvis-api.

pip install git+https://github.com/cocodataset/panopticapi.git

pip install pycocotools -i https://pypi.douban.com/simple

pip install lvis

Download Checkpoints

Download AnyDoor checkpoint:

Download DINOv2 checkpoint and revise /configs/anydoor.yaml for the path (line 83)

Download Stable Diffusion V2.1 if you want to train from scratch.

Inference

We provide inference code in run_inference.py (from Line 222 - ) for both inference single image and inference a dataset (VITON-HD Test). You should modify the data path and run the following code. The generated results are provided in examples/TestDreamBooth/GEN for single image, and VITONGEN for VITON-HD Test.

python run_inference.py

The inferenced results on VITON-Test would be like [garment, ground truth, generation].

Noticing that AnyDoor does not contain any specific design/tuning for tryon, we think it would be helpful to add skeleton infos or warped garment, and tune on tryon data to make it better :)

Our evaluation data for DreamBooth an COCOEE coud be downloaded at Google Drive:

  • URL: [to be released]

Gradio demo

Currently, we suport local gradio demo. To launch it, you should firstly modify /configs/demo.yaml for the path to the pretrained model, and /configs/anydoor.yaml for the path to DINOv2(line 83).

Afterwards, run the script:

python run_gradio_demo.py

The gradio demo would look like the UI shown below:

Train

Prepare datasets

  • Download the datasets that present in /configs/datasets.yaml and modify the corresponding paths.
  • You could prepare you own datasets according to the formates of files in ./datasets.
  • If you use UVO dataset, you need to process the json following ./datasets/Preprocess/uvo_process.py
  • You could refer to run_dataset_debug.py to verify you data is correct.

Prepare initial weight

  • If your would like to train from scratch, convert the downloaded SD weights to control copy by running:
sh ./scripts/convert_weight.sh  

Start training

  • Modify the training hyper-parameters in run_train_anydoor.py Line 26-34 according to your training resources. We verify that using 2-A100 GPUs with batch accumulation=1 could get satisfactory results after 300,000 iterations.

  • Start training by executing:

sh ./scripts/train.sh  

Acknowledgements

This project is developped on the codebase of ControlNet. We appreciate this great work!

Citation

If you find this codebase useful for your research, please use the following entry.

@article{chen2023anydoor,
  title={Anydoor: Zero-shot object-level image customization},
  author={Chen, Xi and Huang, Lianghua and Liu, Yu and Shen, Yujun and Zhao, Deli and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2307.09481},
  year={2023}
}