SeeSR: A Python repository from cswry

SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution (CVPR2024)

¹The Hong Kong Polytechnic University, ²OPPO Research Institute, ³ByteDance Inc.

⭐ If SeeSR is helpful to your images or projects, please help star this repo. Thanks! 🤗

🚩Accepted by CVPR2024

📢 News

2024.06 Our one-step Real-ISR work, OSEDiff, achieves SeeSR-level quality while running over 10 times faster.
2024.03.10 Support sd-turbo: SeeSR can produce a decent image in only 2 steps ⚡️. Please refer to the Step-sd-turbo section.
2024.01.12 🔥🔥🔥 Integrated into Replicate. Try out the online demo ❤️ Thanks to lucataco for the implementation.
2024.01.09 🚀 Add Gradio demo, including turbo mode.
2023.12.25 🎅🎄🎅🎄 Merry Christmas!!!
- 🍺 Release SeeSR-SD2-Base, including the code and pretrained models.
- 📏 We also release RealLR200. It includes 200 real-world low-resolution images.
2023.11.28 Create this repo.

📌 TODO

SeeSR-SDXL
SeeSR-SD2-Base-face,text
~~SeeSR Acceleration~~

🔎 Overview framework

📷 Real-World Results

⚙️ Dependencies and Installation

# git clone this repository
git clone https://github.com/cswry/SeeSR.git
cd SeeSR

# create an environment with python >= 3.8
conda create -n seesr python=3.8
conda activate seesr
pip install -r requirements.txt

🚀 Quick Inference

Step 1: Download the pretrained models

Download the pretrained SD-2-base models from HuggingFace.
Download the SeeSR and DAPE models from GoogleDrive or OneDrive.

You can put the models into preset/models.

Step 2: Prepare testing data

Put the testing images in preset/datasets/test_datasets.
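
Before running inference, it can help to confirm that the files from Steps 1 and 2 are where the testing command expects them. The following is a minimal sketch, not part of the repository: the helper name missing_paths is hypothetical, and the paths are simply the ones that appear in the testing command of Step 3.

```python
from pathlib import Path

# Paths copied from the testing command in Step 3; adjust them if your layout differs.
EXPECTED = [
    "preset/models/stable-diffusion-2-base",
    "preset/models/seesr",
    "preset/models/DAPE.pth",
    "preset/datasets/test_datasets",
]


def missing_paths(root=".", expected=EXPECTED):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected paths are present.")
```

Run it from the repository root; an empty result means the layout matches the command-line arguments used in this README.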

Step 3: Run the testing command

python test_seesr.py \
--pretrained_model_path preset/models/stable-diffusion-2-base \
--prompt '' \
--seesr_model_path preset/models/seesr \
--ram_ft_path preset/models/DAPE.pth \
--image_path preset/datasets/test_datasets \
--output_dir preset/datasets/output \
--start_point lr \
--num_inference_steps 50 \
--guidance_scale 5.5 \
--process_size 512

More details are here

Step-sd-turbo

Download the weights from sd-turbo and put them into preset/models. Then run the command below. More comparisons can be found here. Note that guidance_scale is fixed to 1.0 in turbo mode.

python test_seesr_turbo.py \
--pretrained_model_path preset/models/sd-turbo \
--prompt '' \
--seesr_model_path preset/models/seesr \
--ram_ft_path preset/models/DAPE.pth \
--image_path preset/datasets/test_datasets \
--output_dir preset/datasets/output \
--start_point lr \
--num_inference_steps 2 \
--guidance_scale 1.0 \
--process_size 512
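
For context on the fixed guidance_scale, the sketch below shows the usual classifier-free guidance formula. It assumes SeeSR follows the common convention of combining an unconditional and a conditional noise prediction; the function apply_guidance and the plain-list inputs are illustrative and are not taken from the repository.

```python
def apply_guidance(noise_uncond, noise_cond, guidance_scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    Illustrative only; plain Python lists stand in for the model's tensors.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]


uncond = [0.0, 1.0]
cond = [2.0, 3.0]

# With a scale of 1.0 the result is exactly the conditional prediction,
# so guidance adds no extra push toward the condition.
print(apply_guidance(uncond, cond, 1.0))

# With the 5.5 used in the non-turbo command, the prediction is extrapolated
# away from the unconditional one.
print(apply_guidance(uncond, cond, 5.5))
```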

Note

Please read the arguments in test_seesr.py carefully. To save GPU memory, we adopt the tiled VAE method proposed by multidiffusion-upscaler-for-automatic1111.
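
To illustrate the general idea behind tiling, the sketch below splits an image into overlapping tiles so that each tile can be processed separately with bounded memory. This is not the tiled VAE implementation from multidiffusion-upscaler-for-automatic1111, which is more involved; the function tile_boxes and its tile and overlap defaults are assumptions made for illustration.

```python
def tile_boxes(height, width, tile=512, overlap=64):
    """Return (top, left, bottom, right) boxes that together cover the image.

    Neighbouring tiles share `overlap` pixels; the last tile along each axis
    is shifted so that it ends flush with the image border.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


if __name__ == "__main__":
    for box in tile_boxes(1024, 1024):
        print(box)
```

Peak memory then depends on the tile size rather than on the full image size, at the cost of processing the overlapping regions more than once.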

Gradio Demo

Please put all the pretrained models in preset/models, and then run the following command to launch the Gradio web interface.

python gradio_seesr.py

We also provide a Gradio demo with sd-turbo. Have fun. 🤗

python gradio_seesr_turbo.py

Test Benchmark

We release our RealLR200 at GoogleDrive and OneDrive. You can download RealSR and DRealSR from StableSR. We also provide copies of them at GoogleDrive and OneDrive. As for the synthetic test set, you can obtain it through the synthetic methods described below.

🌈 Train

Step 1: Download the pretrained models

Download the pretrained SD-2-base models and RAM. You can put them into preset/models.

Step 2: Prepare training data

We prepare the training data pairs in advance, which takes up some storage space but saves training time. We train DAPE on COCO and SeeSR on LSDIR+FFHQ10k.

For making paired data when training DAPE, you can run:

python utils_data/make_paired_data_DAPE.py \
--gt_path PATH_1 PATH_2 ... \
--save_dir preset/datasets/train_datasets/training_for_dape \
--epoch 1

For making paired data when training SeeSR, you can run:

python utils_data/make_paired_data.py \
--gt_path PATH_1 PATH_2 ... \
--save_dir preset/datasets/train_datasets/training_for_dape \
--epoch 1

--gt_path: the path to the GT images. If you have multiple GT directories, you can set it as PATH1 PATH2 PATH3 ...
--save_dir: the path where the paired images are saved
--epoch: the number of epochs of paired data to generate
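
As a rough illustration of how a script can accept several GT directories after a single flag, the sketch below uses argparse with nargs="+". The flag names come from the commands above, but the parser itself is hypothetical: the types, the required settings, and the default of 1 for --epoch are assumptions and may differ from what the repository's scripts actually do.

```python
import argparse


def build_parser():
    """Build an illustrative parser that mirrors the three flags described above."""
    parser = argparse.ArgumentParser(description="Illustrative parser, not the repository's own")
    # nargs="+" collects one or more space-separated values into a list.
    parser.add_argument("--gt_path", nargs="+", required=True, help="one or more GT image directories")
    parser.add_argument("--save_dir", required=True, help="where the paired images are written")
    # The default value is an assumption made for this sketch.
    parser.add_argument("--epoch", type=int, default=1, help="number of passes over the GT images")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--gt_path", "PATH_1", "PATH_2", "--save_dir", "out"])
    print(args.gt_path, args.save_dir, args.epoch)
```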

The difference between make_paired_data_DAPE.py and make_paired_data.py lies in that make_paired_data_DAPE.py resizes the entire image to a resolution of 512, while make_paired_data.py randomly crops a sub-image with a resolution of 512.
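
The geometric difference between the two scripts can be sketched as follows. This is only an illustration of which image region each strategy uses, not the repository's code: the function names are hypothetical, the square 512 x 512 output is assumed from the data folder layout listed in this README, and the handling of images smaller than 512 is a guess.

```python
import random


def resize_whole_image(width, height, size=512):
    """DAPE-style path: keep the full frame and resize it to size x size.

    Returns the source box (left, top, right, bottom) and the output shape.
    """
    source_box = (0, 0, width, height)
    return source_box, (size, size)


def random_crop_box(width, height, size=512, rng=random):
    """SeeSR-style path: pick a random size x size sub-image at native resolution."""
    if width < size or height < size:
        # Hypothetical handling; the real script may treat small images differently.
        raise ValueError("image is smaller than the crop size")
    left = rng.randint(0, width - size)
    top = rng.randint(0, height - size)
    return (left, top, left + size, top + size)


if __name__ == "__main__":
    print(resize_whole_image(2040, 1356))
    print(random_crop_box(2040, 1356))
```

Resizing keeps the global content of the image, while cropping keeps local detail at the original scale.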

Once the degraded data pairs are created, you can generate tag data from them by running utils_data/make_tags.py.

The data folder should be like this:

your_training_datasets/
    └── gt
        └── 0000001.png # GT images, (512, 512, 3)
        └── ...
    └── lr
        └── 0000001.png # LR images, (512, 512, 3)
        └── ...
    └── tag
        └── 0000001.txt # tag prompts
        └── ...
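
A quick consistency check on this layout might look like the sketch below. It assumes that matching samples share the same file stem across gt, lr and tag, as the example names above suggest; the helper check_triplets is hypothetical and not part of the repository.

```python
from pathlib import Path


def check_triplets(root):
    """Return (complete, incomplete) file stems across gt/, lr/ and tag/ under `root`."""
    root = Path(root)
    stems = {}
    for name in ("gt", "lr", "tag"):
        folder = root / name
        if folder.is_dir():
            stems[name] = {p.stem for p in folder.iterdir() if p.is_file()}
        else:
            stems[name] = set()
    # A sample is complete only if it appears in all three folders.
    complete = stems["gt"] & stems["lr"] & stems["tag"]
    incomplete = (stems["gt"] | stems["lr"] | stems["tag"]) - complete
    return sorted(complete), sorted(incomplete)


if __name__ == "__main__":
    complete, incomplete = check_triplets("your_training_datasets")
    print(len(complete), "complete samples;", len(incomplete), "incomplete samples")
```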

Step 3: Training for DAPE

Please specify the DAPE training data path at line 13 of basicsr/options/dape.yaml, then run the training command:

python basicsr/train.py -opt basicsr/options/dape.yaml

You can modify the parameters in dape.yaml to suit your setup, such as the number of GPUs, batch size, optimizer selection, etc. For more details, please refer to the settings in BasicSR.

Step 4: Training for SeeSR

CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" accelerate launch train_seesr.py \
--pretrained_model_name_or_path="preset/models/stable-diffusion-2-base" \
--output_dir="./experience/seesr" \
--root_folders 'preset/datasets/training_datasets' \
--ram_ft_path 'preset/models/DAPE.pth' \
--enable_xformers_memory_efficient_attention \
--mixed_precision="fp16" \
--resolution=512 \
--learning_rate=5e-5 \
--train_batch_size=2 \
--gradient_accumulation_steps=2 \
--null_text_ratio=0.5 \
--dataloader_num_workers=0 \
--checkpointing_steps=10000

--pretrained_model_name_or_path: the path of the pretrained SD model from Step 1
--root_folders: the path of your training datasets from Step 2
--ram_ft_path: the path of your DAPE model from Step 3

The overall batch size is determined jointly by the number of GPUs listed in CUDA_VISIBLE_DEVICES, --train_batch_size, and --gradient_accumulation_steps. If your GPU memory is limited, consider reducing --train_batch_size while increasing --gradient_accumulation_steps.
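
The arithmetic can be sketched as below, under the usual assumption that the three factors multiply. The helper effective_batch_size is illustrative and not part of the repository; the numbers come from the training command above.

```python
def effective_batch_size(num_gpus, train_batch_size, gradient_accumulation_steps):
    """Number of samples contributing to each optimizer step across all GPUs."""
    return num_gpus * train_batch_size * gradient_accumulation_steps


# Values from the training command above: 8 GPUs, batch size 2, 2 accumulation steps.
print(effective_batch_size(8, 2, 2))  # 32

# Halving the per-GPU batch and doubling accumulation keeps the total unchanged
# while lowering the memory needed per step.
print(effective_batch_size(8, 1, 4))  # 32
```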

❤️ Acknowledgments

This project is based on diffusers and BasicSR. Some code is borrowed from PASD and RAM. Thanks for their awesome work. We also pay tribute to the pioneering work of StableSR.

📧 Contact

If you have any questions, please feel free to contact: rong-yuan.wu@connect.polyu.hk

🎓Citations

If our code helps your research or work, please consider citing our paper. The following are BibTeX references:

@inproceedings{wu2024seesr,
  title={{SeeSR}: Towards Semantics-Aware Real-World Image Super-Resolution},
  author={Wu, Rongyuan and Yang, Tao and Sun, Lingchen and Zhang, Zhengqiang and Li, Shuai and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={25456--25467},
  year={2024}
}

🎫 License

This project and related weights are released under the Apache 2.0 license.
