DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Yuang Ai¹,² Xiaoqiang Zhou¹,⁴ Huaibo Huang¹,² Xiaotian Han³ Zhengyu Chen³ Quanzeng You³ Hongxia Yang³

¹MAIS & NLPR, Institute of Automation, Chinese Academy of Sciences
²School of Artificial Intelligence, University of Chinese Academy of Sciences
³ByteDance, Inc ⁴University of Science and Technology of China

NeurIPS 2024

⭐ If DreamClear is helpful to your projects, please help star this repo. Thanks! 🤗

🔥 News

More convenient inference code and a demo will be released in the coming days. Please stay tuned for updates, thanks!
2024.10.25: Released segmentation and detection code and pre-trained models.
2024.10.25: Released the RealLQ250 benchmark, which contains 250 real-world LQ images.
2024.10.25: Released training and inference (256->1024) code and pre-trained models of DreamClear.
2024.10.24: This repo was created.

📸 Real-World IR Results

🔧 Dependencies and Installation

Clone this repo and navigate to the DreamClear folder

git clone https://github.com/shallowdream204/DreamClear.git
cd DreamClear

Create Conda Environment and Install Package

conda create -n dreamclear python=3.9 -y
conda activate dreamclear
pip3 install -r requirements.txt

Download Pre-trained Models (for convenience, all models can be downloaded from Hugging Face.)

Base Model:
- PixArt-α-1024: PixArt-XL-2-1024-MS.pth
- VAE: sd-vae-ft-ema
- T5 Text Encoder: t5-v1_1-xxl
- SwinIR: general_swinir_v1.ckpt
Models provided by us:
- DreamClear: DreamClear-1024.pth
- RMT for Segmentation: rmt_uper_s_2x.pth
- RMT for Detection: rmt_maskrcnn_s_1x.pth

🎰 Train

I - Prepare training data

Similar to SeeSR, we prepare HQ-LQ image pairs in advance for training the IR model. Run the following command to generate paired training data:

python3 tools/make_paired_data.py \
--gt_path gt_path1 gt_path2 ... \
--save_dir /path/to/save/folder/ \
--epoch 1 # number of epochs to generate paired data

After generating the paired data, you can use an MLLM (e.g., LLaVA) to generate detailed text prompts for the HQ images. Then use T5 to extract the text features ahead of time, which saves training time. Run:

python3 tools/extract_t5_features.py \
--t5_ckpt /path/to/t5-v1_1-xxl \
--caption_folder /path/to/caption/folder \
--save_npz_folder /path/to/save/npz/folder
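
To sanity-check the extracted features, it can help to list what each .npz file contains. The sketch below relies only on the fact that an .npz file is a ZIP archive of .npy arrays. The function name `list_npz_arrays` is illustrative, and the array names that tools/extract_t5_features.py actually writes are not stated in this README, so check them against the script.

```python
import zipfile
from pathlib import PurePosixPath


def list_npz_arrays(npz_path):
    """Return the names of the arrays stored in an .npz file, without loading them."""
    with zipfile.ZipFile(npz_path) as archive:
        # Each array is stored as an archive member called "<array_name>.npy".
        return sorted(
            PurePosixPath(member).stem
            for member in archive.namelist()
            if member.endswith(".npy")
        )
```

For example, `list_npz_arrays("/path/to/save/npz/folder/0000001.npz")` returns the stored array names. With NumPy installed, `numpy.load(path).files` reports the same names.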

Finally, the directory structure for training datasets should look like

training_datasets_folder/
    └── gt
        └── 0000001.png # GT , (1024, 1024, 3)
        └── ...
    └── sr_bicubic
        └── 0000001.png # LQ + bicubic upsample, (1024, 1024, 3)
        └── ...
    └── caption
        └── 0000001.txt # Caption files (not used in training)
        └── ...
    └── npz
        └── 0000001.npz # T5 features
        └── ...
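
Before launching training, it may be worth confirming that every sample appears in each folder. The following is a minimal sketch, not part of the repo: it assumes samples are matched by file stem (e.g. 0000001) across gt, sr_bicubic and npz, and it skips caption because the tree above marks it as unused in training. The names `REQUIRED_FOLDERS` and `find_incomplete_samples` are hypothetical.

```python
from pathlib import Path

# Folders assumed to be needed at training time, with their file suffixes.
# "caption" is left out because the tree above marks it as unused in training.
REQUIRED_FOLDERS = {"gt": ".png", "sr_bicubic": ".png", "npz": ".npz"}


def find_incomplete_samples(dataset_root):
    """Map each sample ID to the required folders it is missing from."""
    root = Path(dataset_root)
    stems = {}
    for folder, suffix in REQUIRED_FOLDERS.items():
        folder_path = root / folder
        if folder_path.is_dir():
            stems[folder] = {p.stem for p in folder_path.iterdir() if p.suffix == suffix}
        else:
            # A missing folder counts as containing no samples.
            stems[folder] = set()

    all_ids = set().union(*stems.values())
    incomplete = {}
    for sample_id in sorted(all_ids):
        missing = [f for f in REQUIRED_FOLDERS if sample_id not in stems[f]]
        if missing:
            incomplete[sample_id] = missing
    return incomplete
```

`find_incomplete_samples("training_datasets_folder")` returns an empty dict when every sample has a GT image, an upsampled LQ image and a T5 feature file.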

II - Training for DreamClear

Run the following command to train DreamClear with default settings:

python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=... --node_rank=... --master_addr=... --master_port=... \
    train_dreamclear.py configs/DreamClear/DreamClear_Train.py \
    --load_from /path/to/PixArt-XL-2-1024-MS.pth \
    --vae_pretrained /path/to/sd-vae-ft-ema \
    --swinir_pretrained /path/to/general_swinir_v1.ckpt \
    --val_image /path/to/RealLQ250/lq/val_image.png \
    --val_npz /path/to/RealLQ250/npz/val_image.npz \
    --work_dir experiments/train_dreamclear

Please set the path to your training datasets in configs/DreamClear/DreamClear_Train.py. You can also adjust the training hyper-parameters (e.g., lr, train_batch_size, gradient_accumulation_steps) in this file to suit your own GPU setup.
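
When moving to a machine with fewer GPUs, one common approach is to keep the number of samples per optimizer step roughly constant by raising gradient_accumulation_steps. The sketch below shows the arithmetic. It assumes train_batch_size is a per-GPU value, which should be verified against the training script, and the numbers in the usage note are illustrative rather than the repo defaults. Both function names are hypothetical.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps,
                         gpus_per_node, num_nodes=1):
    """Samples contributing to one optimizer step, assuming a per-GPU batch size."""
    values = (train_batch_size, gradient_accumulation_steps, gpus_per_node, num_nodes)
    if any(v < 1 for v in values):
        raise ValueError("all arguments must be positive integers")
    return train_batch_size * gradient_accumulation_steps * gpus_per_node * num_nodes


def accumulation_steps_for(target_batch_size, train_batch_size,
                           gpus_per_node, num_nodes=1):
    """Smallest accumulation step count that reaches the target effective batch size."""
    per_forward_pass = train_batch_size * gpus_per_node * num_nodes
    # Ceiling division without importing math.
    return -(-target_batch_size // per_forward_pass)
```

For instance, 8 GPUs with a per-GPU batch of 4 and no accumulation give `effective_batch_size(4, 1, 8) == 32`; matching that on 2 GPUs needs `accumulation_steps_for(32, 4, 2) == 4`.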

⚡ Inference

We provide the RealLQ250 benchmark, which can be downloaded from Google Drive.

Testing DreamClear for Image Restoration

Run the following command to restore LQ images from 256 to 1024:

python3 -m torch.distributed.launch --nproc_per_node 1 --master_port 1234 \
    test_1024.py configs/DreamClear/DreamClear_Test.py \
    --dreamclear_ckpt /path/to/DreamClear-1024.pth \
    --swinir_ckpt /path/to/general_swinir_v1.ckpt \
    --vae_ckpt /path/to/sd-vae-ft-ema \
    --lre --cfg_scale 4.5 --color_align wavelet \
    --image_path /path/to/RealLQ250/lq \
    --npz_path /path/to/RealLQ250/npz \
    --save_dir validation
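
If you want to drive inference from a Python script (for example, to sweep cfg_scale), the command above can be assembled as an argument list. This is only a sketch: it uses the flags shown above and nothing else, the helper name `build_test_command` is hypothetical, and the default values simply mirror the example command.

```python
def build_test_command(dreamclear_ckpt, swinir_ckpt, vae_ckpt, image_path, npz_path,
                       save_dir="validation", cfg_scale=4.5, color_align="wavelet",
                       use_lre=True, master_port=1234):
    """Build the single-GPU test_1024.py command as a list suitable for subprocess.run."""
    cmd = [
        "python3", "-m", "torch.distributed.launch",
        "--nproc_per_node", "1", "--master_port", str(master_port),
        "test_1024.py", "configs/DreamClear/DreamClear_Test.py",
        "--dreamclear_ckpt", str(dreamclear_ckpt),
        "--swinir_ckpt", str(swinir_ckpt),
        "--vae_ckpt", str(vae_ckpt),
    ]
    if use_lre:
        # --lre is a bare switch in the command above, so it takes no value.
        cmd.append("--lre")
    cmd += [
        "--cfg_scale", str(cfg_scale),
        "--color_align", color_align,
        "--image_path", str(image_path),
        "--npz_path", str(npz_path),
        "--save_dir", str(save_dir),
    ]
    return cmd
```

Run it from the repo root with `subprocess.run(cmd, check=True)`, or print it with `shlex.join(cmd)` to inspect the exact command first.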

Evaluation on high-level benchmarks

Testing instructions for segmentation and detection can be found in their respective folders.

🪪 License

The provided code and pre-trained weights are licensed under the Apache 2.0 license.

🤗 Acknowledgement

This code is based on PixArt-α, BasicSR and RMT. Some code is borrowed from SeeSR, StableSR, DiffBIR and LLaVA. We thank the authors for their awesome work.

📧 Contact

If you have any questions, please feel free to reach out to me at shallowdream555@gmail.com.

📖 Citation

If you find our work useful for your research, please consider citing our paper:

@article{ai2024dreamclear,
      title={DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation},
      author={Ai, Yuang and Zhou, Xiaoqiang and Huang, Huaibo and Han, Xiaotian and Chen, Zhengyu and You, Quanzeng and Yang, Hongxia},
      journal={Advances in Neural Information Processing Systems},
      year={2024}
}
