
[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation



Yuang Ai1,2  Xiaoqiang Zhou1,4  Huaibo Huang1,2  Xiaotian Han3  Zhengyu Chen3  Quanzeng You3  Hongxia Yang3
1MAIS & NLPR, Institute of Automation, Chinese Academy of Sciences 
2School of Artificial Intelligence, University of Chinese Academy of Sciences 
3ByteDance, Inc 4University of Science and Technology of China 
NeurIPS 2024

⭐ If DreamClear is helpful to your projects, please help star this repo. Thanks! 🤗

🔥 News

  • 2024.11.30: Released more convenient inference code for your own images.
  • 2024.10.25: Released segmentation and detection code and pre-trained models.
  • 2024.10.25: Released the RealLQ250 benchmark, which contains 250 real-world LQ images.
  • 2024.10.25: Released training and inference code and pre-trained models of DreamClear.
  • 2024.10.24: Created this repo.

📸 Real-World IR Results

🔧 Dependencies and Installation

  1. Clone this repo and navigate to the DreamClear folder

    git clone https://github.com/shallowdream204/DreamClear.git
    cd DreamClear
  2. Create Conda Environment and Install Package

    conda create -n dreamclear python=3.9 -y
    conda activate dreamclear
    pip3 install -r requirements.txt
  3. Download Pre-trained Models (for convenience, all models except LLaVA can be downloaded from Hugging Face).

    Base Model:

    Our Provided Models:

🎰 Train

I - Prepare training data

Similar to SeeSR, we prepare HQ-LQ image pairs in advance for training the IR model. Run the following command to generate the paired training data:

python3 tools/make_paired_data.py \
--gt_path gt_path1 gt_path2 ... \
--save_dir /path/to/save/folder/ \
--epoch 1 # number of epochs to generate paired data
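When the GT images are spread over several folders, it can be convenient to assemble this command programmatically. The following is a minimal sketch and not part of the repository: `build_pairing_command` is a hypothetical helper, the example paths are placeholders, and only the flags shown in the command above are used.

```python
import shlex
from typing import List, Sequence


def build_pairing_command(gt_paths: Sequence[str], save_dir: str, epoch: int = 1) -> List[str]:
    """Assemble the make_paired_data.py call as an argument list.

    Hypothetical helper: it only uses the flags shown in the README command
    (--gt_path, --save_dir, --epoch) and does not inspect the script itself.
    """
    if not gt_paths:
        raise ValueError("at least one GT folder is required")
    if epoch < 1:
        raise ValueError("epoch must be a positive integer")
    return [
        "python3", "tools/make_paired_data.py",
        "--gt_path", *gt_paths,      # several folders follow a single flag
        "--save_dir", save_dir,
        "--epoch", str(epoch),
    ]


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    cmd = build_pairing_command(["/data/hq_a", "/data/hq_b"], "/data/paired", epoch=2)
    print(shlex.join(cmd))  # inspect the command before running it
    # To execute from the repo root:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing an argument list to `subprocess.run` avoids shell quoting problems such as stray whitespace after a line-continuation backslash.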

After generating the paired data, you can use an MLLM (e.g., LLaVA) to generate detailed text prompts for the HQ images. To save training time, the T5 text features are then extracted in advance. Run:

python3 tools/extract_t5_features.py \
--t5_ckpt /path/to/t5-v1_1-xxl \
--caption_folder /path/to/caption/folder \
--save_npz_folder /path/to/save/npz/folder

Finally, the directory structure of the training dataset should look like this:

training_datasets_folder/
    └── gt
        └── 0000001.png # GT, (1024, 1024, 3)
        └── ...
    └── sr_bicubic
        └── 0000001.png # LQ + bicubic upsample, (1024, 1024, 3)
        └── ...
    └── caption
        └── 0000001.txt # Caption files (not used in training)
        └── ...
    └── npz
        └── 0000001.npz # T5 features
        └── ...
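Before launching a long training run, it may be worth checking that every sample has a file in each sub-folder. The sketch below is an assumption-laden convenience script, not part of the repository: `find_incomplete_samples` is a hypothetical name, and the folder names and extensions are taken from the layout above.

```python
from pathlib import Path
from typing import Dict, List

# Sub-folder -> file extension, copied from the directory layout above.
EXPECTED = {"gt": ".png", "sr_bicubic": ".png", "caption": ".txt", "npz": ".npz"}


def find_incomplete_samples(root: str) -> Dict[str, List[str]]:
    """Map each sample stem to the sub-folders in which its file is missing."""
    root_path = Path(root)
    stems = set()
    # Collect every sample stem that appears in at least one sub-folder.
    for folder, ext in EXPECTED.items():
        folder_path = root_path / folder
        if folder_path.is_dir():
            stems.update(p.stem for p in folder_path.glob(f"*{ext}"))
    missing: Dict[str, List[str]] = {}
    for stem in sorted(stems):
        absent = [
            folder
            for folder, ext in EXPECTED.items()
            if not (root_path / folder / f"{stem}{ext}").is_file()
        ]
        if absent:
            missing[stem] = absent
    return missing


if __name__ == "__main__":
    # Placeholder path, for illustration only.
    for stem, folders in find_incomplete_samples("training_datasets_folder").items():
        print(f"{stem}: missing in {', '.join(folders)}")
```

Since the caption files are marked as not used in training, a missing entry under `caption` is likely harmless, whereas a missing `npz`, `gt`, or `sr_bicubic` file probably is not.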

II - Training for DreamClear

Run the following command to train DreamClear with default settings:

python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=... --node_rank=... --master_addr=... --master_port=... \
    train_dreamclear.py configs/DreamClear/DreamClear_Train.py \
    --load_from /path/to/PixArt-XL-2-1024-MS.pth \
    --vae_pretrained /path/to/sd-vae-ft-ema \
    --swinir_pretrained /path/to/general_swinir_v1.ckpt \
    --val_image /path/to/RealLQ250/lq/val_image.png \
    --val_npz /path/to/RealLQ250/npz/val_image.npz \
    --work_dir experiments/train_dreamclear

Please modify the training dataset path in configs/DreamClear/DreamClear_Train.py. You can also adjust the training hyper-parameters (e.g., lr, train_batch_size, gradient_accumulation_steps) in this file to suit your GPU setup.
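When moving to a machine with fewer GPUs, a common approach is to keep the effective batch size constant by raising `gradient_accumulation_steps`. The sketch below shows the arithmetic under one assumption that should be checked against the config: that `train_batch_size` is the per-GPU batch size. The function names are hypothetical, and the numbers in the usage example are illustrative, not the repository defaults.

```python
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int,
                         nproc_per_node: int,
                         nnodes: int = 1) -> int:
    """Samples contributing to one optimizer step across all processes.

    Assumes train_batch_size is per GPU (an assumption, not confirmed
    by the README).
    """
    world_size = nproc_per_node * nnodes
    return train_batch_size * gradient_accumulation_steps * world_size


def accumulation_steps_for(target_batch: int,
                           train_batch_size: int,
                           nproc_per_node: int,
                           nnodes: int = 1) -> int:
    """Accumulation steps needed to reach target_batch on the given hardware."""
    per_step = train_batch_size * nproc_per_node * nnodes
    if target_batch % per_step != 0:
        raise ValueError("target batch is not a multiple of the per-step batch")
    return target_batch // per_step


if __name__ == "__main__":
    # Illustrative numbers only.
    reference = effective_batch_size(4, 2, nproc_per_node=8, nnodes=2)
    print(reference)
    print(accumulation_steps_for(reference, 4, nproc_per_node=8, nnodes=1))
```

If the effective batch size does change, the learning rate usually needs to be revisited as well.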

⚡ Inference

We provide the RealLQ250 benchmark, which can be downloaded from Google Drive.

Testing DreamClear for Image Restoration

Run the following command to restore LQ images (the code defaults to using 2 GPUs for inference):

python3 -m torch.distributed.launch --nproc_per_node 1 --master_port 1234 \
    test.py configs/DreamClear/DreamClear_Test.py \
    --dreamclear_ckpt /path/to/DreamClear-1024.pth \
    --swinir_ckpt /path/to/general_swinir_v1.ckpt \
    --vae_ckpt /path/to/sd-vae-ft-ema \
    --t5_ckpt /path/to/t5-v1_1-xxl \
    --llava_ckpt /path/to/llava-v1.6-vicuna-13b \
    --lre --cfg_scale 4.5 --color_align wavelet \
    --image_path /path/to/input/images \
    --save_dir validation \
    --mixed_precision fp16 \
    --upscale 4
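The command above depends on five checkpoint paths and an input folder, and loading large models only to fail on a mistyped path is costly. A minimal pre-flight check could look like the sketch below. It is not part of the repository: `missing_paths` is a hypothetical helper, and the paths are the placeholders from the command, to be replaced with real locations.

```python
from pathlib import Path
from typing import Dict, List


def missing_paths(paths: Dict[str, str]) -> List[str]:
    """Return the flags whose path does not exist on disk."""
    return [flag for flag, location in paths.items() if not Path(location).exists()]


if __name__ == "__main__":
    # Flags copied from the inference command; the paths are placeholders.
    inputs = {
        "--dreamclear_ckpt": "/path/to/DreamClear-1024.pth",
        "--swinir_ckpt": "/path/to/general_swinir_v1.ckpt",
        "--vae_ckpt": "/path/to/sd-vae-ft-ema",
        "--t5_ckpt": "/path/to/t5-v1_1-xxl",
        "--llava_ckpt": "/path/to/llava-v1.6-vicuna-13b",
        "--image_path": "/path/to/input/images",
    }
    problems = missing_paths(inputs)
    if problems:
        raise SystemExit("Missing paths for: " + ", ".join(problems))
    print("All inputs found.")
```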

Evaluation on high-level benchmarks

Testing instructions for segmentation and detection can be found in their respective folders.

🪪 License

The provided code and pre-trained weights are licensed under the Apache 2.0 license.

🤗 Acknowledgement

This code is based on PixArt-α, BasicSR and RMT. Some code is borrowed from SeeSR, StableSR, DiffBIR and LLaVA. We thank the authors for their awesome work.

📧 Contact

If you have any questions, please feel free to reach out to me at shallowdream555@gmail.com.

📖 Citation

If you find our work useful for your research, please consider citing our paper:

@inproceedings{ai2024dreamclear,
    title={DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation},
    author={Yuang Ai and Xiaoqiang Zhou and Huaibo Huang and Xiaotian Han and Zhengyu Chen and Quanzeng You and Hongxia Yang},
    booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
    year={2024},
    url={https://openreview.net/forum?id=6eoGVqMiIj}
}