Illumination-Adaptive-Transformer

🌕 [BMVC 2022] You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction. SOTA for low-light enhancement at about 0.004 seconds per image; try it for pre-processing.


You Only Need 90K Parameters to Adapt Light: a Light Weight Transformer for Image Enhancement and Exposure Correction. (BMVC 2022) (paper) (Zhihu explanation, in Chinese)

2023.12.15: Our new work Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption has been accepted by AAAI 2024; please see (here) if you are interested in NeRF under low light~

2023.9.26: Uploaded the new detection benchmark on the EXDark dataset; see the detection page.

2023.5.11: Thanks to this issue for the correction: the FLOPs of IAT on a (256 x 256) image are 1.44 GFLOPs, and on a (400 x 600) image they are 5.28 GFLOPs. Please also note that the 0.004s per-image inference speed was measured over the whole LOL-V2 test set (100 images in total). If you test only a single image, inference will be slower because of GPU start-up overhead (the first few images take longer to evaluate).

2023.3.2: Updated img_demo.py; you can use it directly for image enhancement and exposure correction.

2022.10.16: Added demos for low-light enhancement and exposure correction on Hugging Face Spaces.

2022.10.11: Uploaded the low-light semantic segmentation code. See segmentation.

2022.10.1: Paper accepted by BMVC 2022!

2022.8.10: Uploaded the LOL-V1 dataset training code.

2022.8.3: Uploaded the new arXiv version. The updated results on the LOL-V1 dataset (485 training images, 15 testing images) are 23.38 PSNR and 0.809 SSIM, and the results on the LOL-V2-real dataset (689 training images, 100 testing images) are 23.50 PSNR and 0.824 SSIM. For details, see this issue.

2022.7.11: Uploaded the low-light object detection code. See detection.
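
The timing note in the 2023.5.11 entry (per-image speed averaged over many images, with the first images running slower) can be illustrated with a small helper. This is only a sketch of the general measurement idea, not the script used for the reported numbers: `mean_latency`, `fake_model`, and the `warmup` parameter are hypothetical names, and the workload is a stand-in for a real model call.

```python
import time


def mean_latency(run_once, inputs, warmup=5):
    """Average seconds per call to run_once, ignoring the first `warmup` calls."""
    inputs = list(inputs)
    if len(inputs) <= warmup:
        raise ValueError("need more inputs than warm-up iterations")
    # Warm-up: these calls pay one-off start-up costs and are not timed.
    for item in inputs[:warmup]:
        run_once(item)
    # On a GPU, synchronise the device before reading the clock
    # (e.g. torch.cuda.synchronize()), since kernels run asynchronously.
    start = time.perf_counter()
    for item in inputs[warmup:]:
        run_once(item)
    elapsed = time.perf_counter() - start
    return elapsed / (len(inputs) - warmup)


# Hypothetical stand-in workload; replace with a real model call.
def fake_model(n):
    return sum(i * i for i in range(n))


latency = mean_latency(fake_model, [10_000] * 20, warmup=5)
print(f"{latency:.6f} s per call")
```

Timing a single call without the warm-up phase would fold the start-up cost into the result, which is why a one-image test looks slower than the averaged figure.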


Reference:

Detection and segmentation use mmdetection and mmsegmentation, and some of the code is borrowed from Zero-DCE and UniFormer. Many thanks to both!

If this code or paper helps you, please cite it as follows, thx~

@inproceedings{Cui_2022_BMVC,
author    = {Ziteng Cui and Kunchang Li and Lin Gu and Shenghan Su and Peng Gao and ZhengKai Jiang and Yu Qiao and Tatsuya Harada},
title     = {You Only Need 90K Parameters to Adapt Light: a Light Weight Transformer for Image Enhancement and Exposure Correction},
booktitle = {33rd British Machine Vision Conference 2022, {BMVC} 2022, London, UK, November 21-24, 2022},
publisher = {{BMVA} Press},
year      = {2022},
url       = {https://bmvc2022.mpi-inf.mpg.de/0238.pdf}
}

Abstract

Challenging illumination conditions (low-light, under-exposure and over-exposure) in the real world not only cast an unpleasant visual appearance but also taint the computer vision tasks. After camera captures the raw-RGB data, it renders standard sRGB images with image signal processor (ISP). By decomposing ISP pipeline into local and global image components, we propose a lightweight fast Illumination Adaptive Transformer (IAT) to restore the normal lit sRGB image from either low-light or under/over-exposure conditions. Specifically, IAT uses attention queries to represent and adjust the ISP-related parameters such as colour correction, gamma correction. With only ~90k parameters and ~0.004s processing speed, our IAT consistently achieves superior performance over SOTA on the current benchmark low-light enhancement and exposure correction datasets. Competitive experimental performance also demonstrates that our IAT significantly enhances object detection and semantic segmentation tasks under various light conditions.

For Vision Tasks under various lighting conditions, towards both Human Vision 😄 and Machine Vision 📷

5 Tasks Under Various Lighting Conditions: 1. Low-light Enhancement (LOL, MIT5K) // 2. Exposure Correction // 3. Low-Light Object Detection // 4. Low-Light Semantic Segmentation // 5. Various-Light Object Detection

Figure 1: IAT (Illumination Adaptive Transformer) for vision tasks under multiple lighting conditions, and comparison results on the LOL-V1 dataset.


Model Structure:

Figure 2: Model Structure of Illumination Adaptive Transformer.

Our IAT model consists of two individual branches. The local branch performs pixel-wise adjustment and outputs two feature maps, one for addition and one for multiplication. The global branch performs global adjustment and outputs the color matrix and gamma value. Inspired by DETR, the global branch updates the color matrix and gamma value through dynamic query learning. BTW, the whole model has just over 90k parameters, and inference takes only 0.004s per image on the LOL dataset (single Nvidia 3090 GPU).
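
To make the roles of the two branches concrete, here is a minimal NumPy sketch of how their outputs could be combined on one image. It is an illustration under assumptions rather than the repository's code: the function name `iat_style_adjust`, the order of operations (multiply, add, color matrix, clip, gamma), and the toy values standing in for network predictions are all assumed for this example.

```python
import numpy as np


def iat_style_adjust(image, mul_map, add_map, color_matrix, gamma, eps=1e-8):
    """Combine local and global adjustments on one RGB image.

    image, mul_map, add_map: float arrays of shape (H, W, 3), values in [0, 1].
    color_matrix: (3, 3) array; gamma: positive scalar.
    """
    # Local branch: pixel-wise multiply and add.
    local = image * mul_map + add_map
    # Global branch, step 1: mix the RGB channels of each pixel with the 3x3 matrix.
    mixed = local @ color_matrix.T
    # Clip so the power below never sees a non-positive value.
    mixed = np.clip(mixed, eps, 1.0)
    # Global branch, step 2: gamma correction.
    return mixed ** gamma


# Toy inputs standing in for network predictions (assumed values).
rng = np.random.default_rng(0)
low = rng.uniform(0.0, 0.2, size=(4, 6, 3))   # dark image
mul = np.full_like(low, 3.0)                   # multiply map
add = np.full_like(low, 0.05)                  # add map
ccm = np.eye(3)                                # identity color matrix
out = iat_style_adjust(low, mul, add, ccm, gamma=0.8)
print(out.shape, float(low.mean()), float(out.mean()))
```

In the real model the two maps, the color matrix, and gamma are predicted per image by the network; here they are fixed by hand only to show the arithmetic.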


Usage:

Environment (install PyTorch 1.7.1 or later, following pytorch):

$ conda create -n IAT python==3.7.0
$ conda activate IAT
$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install timm matplotlib IQA_pytorch tqdm
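
After installation, a quick check such as the following can confirm that the packages are visible to Python. This is a small convenience sketch, not part of the repository: the helper `missing_packages` is hypothetical, and the import names are assumed to match the package names in the commands above.

```python
import importlib.util

# Import names assumed to match the packages installed above.
REQUIRED = ["torch", "torchvision", "timm", "matplotlib", "IQA_pytorch", "tqdm"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("missing:", ", ".join(missing) if missing else "none")
```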

For low-level vision (low-light enhancement, exposure correction):

cd IAT_enhance

For high-level vision (low-light detection, low-light semantic segmentation):

cd IAT_high

Demo:

Figure 3: IAT in low-light enhancement (LOL dataset, MIT-5K dataset).

Figure 4: IAT in exposure correction (Exposure dataset).

Figure 5: IAT in low-light detection (EXDark dataset). The background image is the one generated by IAT during joint training.


Related

We also have another work on low-light object detection, ICCV 2021: Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection (code) (paper). Please read it if you are interested!