Learning A Sparse Transformer Network for Effective Image Deraining (CVPR 2023)

Xiang Chen, Hao Li, Mingqiang Li, and Jinshan Pan

Abstract: Transformer-based methods have achieved significant performance in image deraining as they can model the non-local information which is vital for high-quality image reconstruction. In this paper, we find that most existing Transformers usually use all similarities of the tokens from the query-key pairs for the feature aggregation. However, if the tokens from the query are different from those of the key, the self-attention values estimated from these tokens are also involved in feature aggregation, which accordingly interferes with the clear image restoration. To overcome this problem, we propose an effective DeRaining network, Sparse Transformer (DRSformer) that can adaptively keep the most useful self-attention values for feature aggregation so that the aggregated features better facilitate high-quality image reconstruction. Specifically, we develop a learnable top-k selection operator to adaptively retain the most crucial attention scores from the keys for each query for better feature aggregation. Simultaneously, as the naive feed-forward network in Transformers does not model the multi-scale information that is important for latent clear image restoration, we develop an effective mixed-scale feed-forward network to generate better features for image deraining. To learn an enriched set of hybrid features, which combines local context from CNN operators, we equip our model with a mixture of experts feature compensator to present a cooperation refinement deraining scheme. Extensive experimental results on the commonly used benchmarks demonstrate that the proposed method achieves favorable performance against state-of-the-art approaches. The source code is available at https://github.com/cschenxiang/DRSformer.
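
To make the top-k idea from the abstract concrete, here is a minimal sketch in plain Python. It is not the code in DRSformer_arch.py: the function name topk_sparse_attention, the fixed k, and the token-wise layout are assumptions made for illustration, and the learnable part of the selection is omitted. It only shows how keeping the k largest attention scores per query changes the aggregation.

```python
import math


def topk_sparse_attention(queries, keys, values, k):
    """Attend each query to only its k highest-scoring keys.

    Hypothetical helper for illustration; k is fixed here, not learned.
    queries: list of d-dimensional vectors
    keys, values: lists of equal length; keys are d-dimensional
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    d = len(queries[0])
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        # Scaled dot-product similarity between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in keys]
        # Indices of the k largest scores; all other keys are dropped.
        keep = sorted(range(len(scores)), key=lambda i: scores[i],
                      reverse=True)[:k]
        # Softmax over the kept scores only, which is equivalent to
        # masking the dropped scores with -inf before a full softmax.
        peak = max(scores[i] for i in keep)
        weights = {i: math.exp(scores[i] - peak) for i in keep}
        total = sum(weights.values())
        # Weighted sum of the values that belong to the kept keys.
        outputs.append([
            sum(weights[i] / total * values[i][j] for i in keep)
            for j in range(dim_v)
        ])
    return outputs


queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]

dense = topk_sparse_attention(queries, keys, values, k=len(keys))
sparse = topk_sparse_attention(queries, keys, values, k=1)
print("dense :", dense)
print("sparse:", sparse)
```

With k equal to the number of keys the function reduces to ordinary softmax attention, so the dissimilar keys still contribute to the output. With k=1 only the best-matching key is used. A real implementation would operate on batched PyTorch tensors rather than Python lists.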

Network Architecture

Datasets

Dataset	Rain200L	Rain200H	DID-Data	DDN-Data	SPA-Data
Baidu Cloud	Download (s2yx)	Download (z9br)	Download (5luo)	Download (ldzo)	Download (yjow)

All datasets provided here consist of fully paired images; this applies in particular to SPA-Data.

Training

Please download the corresponding training datasets and put them in the folder Datasets/train. Download the testing datasets and put them in the folder Datasets/test.
Note that we do not use the mixture of experts feature compensator (MEFC) when training on Rain200L and SPA-Data, because their rain streaks are less complex and easier to learn. For these two datasets, please modify the file DRSformer_arch.py accordingly.
Follow the instructions below to begin training our model.

cd DRSformer
bash train.sh

After running the script, you can find the generated experiment logs in the folder experiments.
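
Before launching train.sh, it can help to confirm that the datasets landed in the folders named above. The following is a small optional sketch, not part of the repository: the helper name missing_folders is hypothetical, and it assumes that Datasets/train and Datasets/test are resolved relative to the directory you pass as root. It only checks that the folders exist and are not empty; it does not know the layout inside them.

```python
from pathlib import Path


def missing_folders(root, expected=("Datasets/train", "Datasets/test")):
    """Return the expected sub-folders that are absent or empty under root.

    Hypothetical helper; the folder names come from the instructions above.
    """
    missing = []
    for rel in expected:
        folder = Path(root) / rel
        # A folder counts as missing if it does not exist or has no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Assumption: the script is run from the directory that holds Datasets/.
    for rel in missing_folders("."):
        print(f"missing or empty: {rel}")
```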

Testing

Please download the corresponding testing datasets and put them in the folder test/input. Download the corresponding pre-trained models and put them in the folder pretrained_models.
Note that the models for Rain200L and SPA-Data are trained without MEFC, because their rain streaks are less complex and easier to learn. Before testing on these two datasets, please modify the file DRSformer_arch.py accordingly; see the file DRSformer_arch_200L+SPA.py for reference.
Follow the instructions below to begin testing our model.

python test.py --task Deraining --input_dir './test/input/' --result_dir './test/output/'

After running the script, you can find the output visual results in the folder test/output/Deraining.

Pre-trained Models

Dataset	Rain200L	Rain200H	DID-Data	DDN-Data	SPA-Data
Baidu Cloud	Download (kzj5)	Download (j10m)	Download (nact)	Download (hj6r)	Download (vfvt)
Google Drive	Download	Download	Download	Download	Download

Performance Evaluation

See the folder "evaluations".

For the Rain200L/H and SPA-Data datasets: PSNR and SSIM results are computed using this Matlab Code.
For the DID-Data and DDN-Data datasets: PSNR and SSIM results are computed using this Matlab Code.

Please note that Table 1 above is our final camera-ready version. There is a slight gap between the final version and the arXiv version, caused by differences in testing devices and environments. We recommend that you download the visual deraining results and retest the quantitative results on your own device and environment.
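
For a quick sanity check while retesting, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). Below is a minimal sketch in plain Python. It is not a port of the Matlab scripts in the evaluations folder: the function name psnr, the single-channel nested-list input, and the 8-bit peak value of 255 are assumptions, so its numbers may not match the reported results. Use the Matlab code for figures that are comparable to the table.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two single-channel images given as lists of rows.

    Hypothetical helper; assumes 8-bit pixel values unless max_value is set.
    """
    ref_flat = [p for row in reference for p in row]
    out_flat = [p for row in restored for p in row]
    if len(ref_flat) != len(out_flat):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, out_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 example: every pixel is off by one grey level, so MSE = 1.
ground_truth = [[100, 150], [200, 250]]
derained = [[101, 149], [199, 251]]
print(f"PSNR: {psnr(ground_truth, derained):.2f} dB")
```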

Visual Deraining Results

Dataset	Rain200L	Rain200H	DID-Data	DDN-Data	SPA-Data
DualGCN	DWL (v8qy)	DWL (jnc9)	DWL (3gdx)	DWL (1mdx)	DWL (lkeb)
SPDNet	DWL (y39h)	DWL (mry2)	DWL (klci)	DWL (19bm)	DWL (dd98)
Uformer	-	-	DWL (4uur)	DWL (39bj)	-
Restormer	DWL (6a2z)	DWL (9m1r)	DWL (1hql)	DWL (crj4)	DWL (b40z)
IDT	DWL (v4yd)	DWL (77i4)	DWL (8uxx)	DWL (0ey6)	DWL (b862)
Ours	DWL (hyuv)	DWL (px2j)	DWL (t879)	DWL (9vtz)	DWL (bl4n)

For DualGCN, SPDNet, Restormer and IDT, we retrain the models with the authors' code if no pretrained models are provided; otherwise we evaluate them with the authors' released code. For Uformer, we refer to some reported results in IDT. Note that since the PSNR/SSIM code used to test DID-Data and DDN-Data in their paper differs from ours, we retrain Uformer on DID-Data and DDN-Data. For other previous methods, we refer to the results reported here, which use the same PSNR/SSIM code.

Citation

If you are interested in this work, please consider citing:

@InProceedings{Chen_2023_CVPR,
    author={Chen, Xiang and Li, Hao and Li, Mingqiang and Pan, Jinshan}, 
    title={Learning a Sparse Transformer Network for Effective Image Deraining},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month={June},
    year={2023},
    pages={5896-5905}
}

Acknowledgment

This code is based on Restormer. We thank the authors for their awesome work.

Contact

Should you have any questions or suggestions, please contact chenxiang@njust.edu.cn.