DocShadow-SD7K

[ICCV 2023] A large-scale, high-resolution dataset that covers all the important data features of document shadows and contains a large number of document shadow images.

Primary language: Python. License: MIT.

📋 High-Resolution Document Shadow Removal

High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net.

Zinuo Li 👨‍💻‍ , Xuhang Chen 👨‍💻‍ , Chi-Man Pun 📮 and Xiaodong Cun 📮 ( 👨‍💻‍ Equal contributions, 📮 Corresponding )

University of Macau

In International Conference on Computer Vision 2023 (ICCV 2023)

🎉 Important news

01/05/2024: We compressed SD7K as much as possible without changing image quality. The resolution stays the same (2K), but the size is now only 9% of the original, which makes the dataset more convenient to download and experiment with. "Dataset" refers to the compressed version (9G), and "Original Dataset" refers to the original-size version (100G).

🔮 Dataset

If you are using HPC, we highly recommend downloading SD7K via OpenXLab. To download the Kligler and Jung datasets used for comparison, please refer to Kligler and Jung.

SD7K is currently a large-scale, high-resolution dataset that covers all the important data features of document shadows and contains a large number of document shadow images.

We used over 30 types of occluders and more than 350 documents to build the dataset. The occluders have both regular and irregular shapes, which provides adequate coverage of various situations. For more information, please refer to the demo and the paper.

⚙️ Usage

Installation

git clone https://github.com/CXH-Research/DocShadow-SD7K.git
cd DocShadow-SD7K
pip install -r requirements.txt

Training

Download the dataset first, then specify TRAIN_DIR, VAL_DIR and SAVE_DIR in the TRAINING section of config.yml.
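Before launching a long run, it can help to confirm that the three paths are filled in. The following is a minimal sketch, not part of this repository: `check_section_dirs` is a hypothetical helper, and it assumes that config.yml parses into a dictionary whose top-level keys are section names such as TRAINING, each holding TRAIN_DIR, VAL_DIR and SAVE_DIR as strings. It also assumes that SAVE_DIR may not exist yet, so only the two input directories are checked on disk.

```python
import os
from typing import Any, List, Mapping

# Keys named in the README; the nesting under a section name is an assumption.
REQUIRED_KEYS = ("TRAIN_DIR", "VAL_DIR", "SAVE_DIR")


def check_section_dirs(cfg: Mapping[str, Any], section: str = "TRAINING") -> List[str]:
    """Return a list of human-readable problems found in one config section.

    Hypothetical helper: `cfg` is the already-parsed content of config.yml.
    """
    block = cfg.get(section)
    if not isinstance(block, Mapping):
        return [f"section {section!r} is missing or is not a mapping"]

    problems = []
    for key in REQUIRED_KEYS:
        value = block.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{section}.{key} is not set")
        elif key != "SAVE_DIR" and not os.path.isdir(value):
            # Input directories must already exist on disk.
            # SAVE_DIR is assumed to be an output path that can be created later.
            problems.append(f"{section}.{key} does not exist: {value}")
    return problems
```

To use it, parse config.yml with a YAML loader (for example PyYAML's `yaml.safe_load`) and pass the resulting dictionary to `check_section_dirs`; an empty list means no problems were found. Passing `section="TESTING"` applies the same checks to the inference settings.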

For single-GPU training:

python train.py

For multi-GPU training:

accelerate config
accelerate launch train.py

If you have difficulty using accelerate, please refer to Accelerate.

Inference

Please download our pre-trained models here and specify TRAIN_DIR, VAL_DIR and SAVE_DIR in the TESTING section of config.yml, then execute:

python infer.py

Compared to the original version in our paper, this implementation uses GDFN instead and achieves better performance, with an average improvement of 1-2 points in each metric.

For the results of all baselines and our results on SD7K, please refer here.

💗 Acknowledgements

We would like to thank DocShadow-ONNX-TensorRT for their implementation of our work. If you are looking for an easier implementation, please refer to it. We also appreciate the great open-source datasets; please refer to Kligler and Jung to download them.

🛎 Citation

If you find our work helpful for your research, please cite:

@InProceedings{Li_2023_ICCV,
    author    = {Li, Zinuo and Chen, Xuhang and Pun, Chi-Man and Cun, Xiaodong},
    title     = {High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {12449-12458}
}