We use python=3.8, pytorch=2.1.0 and pytorch-lightning=1.9.0 in this project.
Please install those packages first, then install the remaining packages listed in environment.yaml.
You can also refer to the installation guide in the official ControlNet repo to set up the environment.
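As a quick sanity check after installation, the pinned versions can be compared against what is actually installed. The following is a minimal sketch, not part of this repo: the helper names (version_mismatches, python_ok) are hypothetical, and it assumes the packages are installed under their usual PyPI names torch and pytorch-lightning.

```python
import sys
from importlib import metadata

# Versions taken from the setup notes above; the PyPI names are an assumption.
EXPECTED = {"torch": "2.1.0", "pytorch-lightning": "1.9.0"}


def version_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that does not match."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Drop any local build tag, so "2.1.0+cu118" still matches "2.1.0".
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


def python_ok(version_info=sys.version_info):
    """True when the interpreter is Python 3.8.x."""
    return tuple(version_info[:2]) == (3, 8)
```

Calling version_mismatches(EXPECTED) returns an empty dict when both packages match the pinned versions; any entry in the result names a package to reinstall.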
To clone the GLIDE repo, run
git submodule init && git submodule update
After this step, you will find the glide-text2im/ directory under the base directory.
You also need to install its requirements. To do so, run
cd glide-text2im && pip install -e .
- Download val2017.zip, annotations_trainval2017.zip, and panoptic_val2017.zip, and unzip them under ./dataset.
- Run cd ./dataset, then run
  python read_caption_data.py
  python read_mask_data.py
  python merge_data.py
  The final results will be saved in ./dataset/data.pkl.
- You can run the following command to visualize the preprocessed data:
  python read_pkl.py --idx IDX
  IDX should be a positive integer less than 5000. The results will be saved in ./dataset/tmp/.
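Before visualizing individual samples, it can help to confirm that the merged file loads at all. The sketch below is an illustration only: summarize_pickle is a hypothetical helper, the internal layout of data.pkl is not documented here, and the code assumes the file unpickles into ordinary Python objects (if it stores custom classes, run it from the directory where those classes are importable).

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return (type name, length or None)."""
    with Path(path).open("rb") as f:
        # Only unpickle files you created yourself or otherwise trust.
        obj = pickle.load(f)
    # Report a length only for objects that define one (list, dict, ...).
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size
```

For example, print(summarize_pickle("./dataset/data.pkl")) shows the top-level container type and how many entries it holds.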
To run inference with PLP+ControlNet, run
python control_infer.py --model_path path/to/your/ckpt --output_path path/to/output/folder
To run inference with PLP+GLIDE, run
python inpaint.py --model_path path/to/your/ckpt --output_path path/to/output/folder
path/to/your/ckpt is the path to your trained ControlNet weights. We provide our checkpoints in the ./ckpts folder.
Note that when running inference with GLIDE, the checkpoint is used only for mask prediction.
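If you want to run both inference variants from one script, the two commands can be assembled programmatically. This is a minimal sketch under stated assumptions: build_infer_cmd and run_inference are hypothetical helpers, and only the two flags shown in the commands above (--model_path, --output_path) are used.

```python
import subprocess
import sys


def build_infer_cmd(script, model_path, output_path):
    """Assemble the argument list for one of the inference scripts."""
    return [
        sys.executable,  # use the same interpreter as the current environment
        script,
        "--model_path", str(model_path),
        "--output_path", str(output_path),
    ]


def run_inference(script, model_path, output_path):
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_infer_cmd(script, model_path, output_path), check=True)
```

For example, run_inference("control_infer.py", "path/to/your/ckpt", "path/to/output/folder") followed by the same call with "inpaint.py" runs both pipelines against the same checkpoint.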
- Download the SDv1.5 weights "v1-5-pruned.ckpt" and place them under ./models.
- If you have trouble accessing the Hugging Face Hub, download the openai/CLIP model weights manually and place them under ./openai/clip-vit-large-patch14. The download link is here.
- Create the dataset following our tutorial. Then run
  python pre_processing/dataset_process.py
  to get the preprocessed dataset in ./dataset/raw_data5k/.
- Run
  python seq_add_control.py ./models/v1-5-pruned.ckpt ./models/plp_ini.ckpt
  to get the initialized PLP model from the SD weights.
- Run
  python train_plp.py
  to start training. The training logs, including inference images and loss curves, should be in ./image_log and ./lightning_logs. You may refer to the Official Finetuning Guidance.
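The training steps above each produce a file or folder that the next step depends on, so a small pre-flight check can catch a skipped step before train_plp.py is launched. The sketch below is an illustration only: missing_paths is a hypothetical helper, and the paths are the ones named in the steps above. The manually downloaded CLIP folder is left out of the default list because it is only needed when the Hugging Face Hub is unreachable.

```python
from pathlib import Path

# Outputs of the preparation steps, relative to the repo root.
REQUIRED_BEFORE_TRAINING = [
    "models/v1-5-pruned.ckpt",  # downloaded SDv1.5 weights
    "dataset/raw_data5k",       # output of pre_processing/dataset_process.py
    "models/plp_ini.ckpt",      # output of seq_add_control.py
]


def missing_paths(base=".", required=REQUIRED_BEFORE_TRAINING):
    """Return the required paths that do not exist under base."""
    base = Path(base)
    return [p for p in required if not (base / p).exists()]
```

An empty list from missing_paths() means all three prerequisites are in place. To also check the optional CLIP weights, pass a list that includes "openai/clip-vit-large-patch14".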
See quantitative_evaluation_metric/README.md for more details.