Fidelity Estimation Improves Noisy-Image Classification with Pretrained Networks

IEEE Signal Processing Letters, 2021.

[Paper] - [Supplementary]

Abstract: Image classification has significantly improved using deep learning. This is mainly due to convolutional neural networks (CNNs) that are capable of learning rich feature extractors from large datasets. However, most deep learning classification methods are trained on clean images and are not robust when handling noisy ones, even if a restoration preprocessing step is applied. While novel methods address this problem, they rely on modified feature extractors and thus necessitate retraining. We instead propose a method that can be applied on a pretrained classifier. Our method exploits a fidelity map estimate that is fused into the internal representations of the feature extractor, thereby guiding the attention of the network and making it more robust to noisy data. We improve the noisy-image classification (NIC) results by significantly large margins, especially at high noise levels, and come close to the fully retrained approaches. Furthermore, as proof of concept, we show that when using our oracle fidelity map we even outperform the fully retrained methods, whether trained on noisy or restored images.

Degradation Model
Requirements
Model Training and Testing
Baseline Methods and Ablation Study
Results
Citation

Degradation Model

To explore the effects of degradation types and levels on classification networks, we implement five degradation models: additive white Gaussian noise (AWGN), salt-and-pepper noise, Gaussian blur, motion blur, and rectangle crop. Instructions for these degradation models are given in this notebook.
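As a rough illustration of the first two degradation types, the sketch below applies AWGN and salt-and-pepper noise to an image stored as a float array in [0, 1]. The function names, the clipping to [0, 1], and the reading of the level as the noise standard deviation (AWGN) or the fraction of corrupted pixels (salt and pepper) are assumptions made for this example; the notebook remains the reference for the actual implementation.

```python
import numpy as np


def add_awgn(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    # Assumption: pixel values live in [0, 1], so clip after adding noise.
    return np.clip(noisy, 0.0, 1.0)


def add_salt_and_pepper(image, amount, rng):
    """Set a fraction `amount` of pixel locations to 0 (pepper) or 1 (salt)."""
    noisy = image.copy()
    # One random draw per pixel location, shared across channels.
    draw = rng.random(image.shape[:2])
    noisy[draw < amount / 2] = 0.0                        # pepper
    noisy[(draw >= amount / 2) & (draw < amount)] = 1.0   # salt
    return noisy


rng = np.random.default_rng(0)
clean = np.full((32, 32, 3), 0.5)   # hypothetical mid-gray H x W x C image
awgn = add_awgn(clean, sigma=0.1, rng=rng)
snp = add_salt_and_pepper(clean, amount=0.2, rng=rng)
```

Passing the random generator explicitly keeps the degradation reproducible, which matters when the same noisy test set is reused across methods.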

Requirements

Python 3.7, PyTorch 1.7.1;
Other common packages listed in requirements.txt or environment.yml.

Model Training and Testing

Training procedure

For the DnCNN denoiser, the parameter initialization follows He et al. We change the $\ell_2$ loss function of the original paper to $\ell_1$, as it achieves better convergence. To train the classification networks, we fine-tune models pretrained on the ImageNet dataset. The fully connected layers are modified to fit the number of classes of each dataset (e.g., 257 for Caltech-256). We adopt the same initialization as He et al., i.e., the Xavier algorithm, and the biases are initialized to 0. We use the Nesterov accelerated gradient (NAG) descent optimizer with an initial learning rate of 0.001 and 120 training epochs. We also introduce a batch-step linear learning rate warmup for the first 5 epochs, followed by a cosine learning rate decay, and apply label smoothing with $\varepsilon=0.1$. We select the model with the highest accuracy on the validation set.
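The learning rate schedule and label smoothing described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, the warmup is assumed to start at `base_lr / warmup_steps`, the cosine decay is assumed to end at 0, and the smoothing uses the common `(1 - eps) * one_hot + eps / K` form, none of which is confirmed by the text.

```python
import math


def learning_rate(step, steps_per_epoch, base_lr=0.001,
                  warmup_epochs=5, total_epochs=120):
    """Learning rate at a given batch step (0-indexed)."""
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    if step < warmup_steps:
        # Linear warmup, updated every batch rather than every epoch.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr towards 0 over the remaining steps (assumed).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def smooth_labels(target, num_classes, eps=0.1):
    """Smoothed target distribution: (1 - eps) * one_hot + eps / K (assumed form)."""
    off = eps / num_classes
    return [1.0 - eps + off if k == target else off for k in range(num_classes)]


# Example: 100 batches per epoch (hypothetical) and the 257 Caltech-256 classes.
lr_first = learning_rate(0, steps_per_epoch=100)
lr_peak = learning_rate(499, steps_per_epoch=100)
soft_target = smooth_labels(target=3, num_classes=257)
```

Computing the rate per batch step, rather than per epoch, is what makes the warmup a "batch-step" warmup: the rate rises smoothly within each of the first 5 epochs.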

Train Pretrained Sub-models for the Proposed Models

The implementation of the classification networks is taken from torchvision, and the restoration networks are based on DnCNN and MemNet.

To obtain pretrained classification networks:
python train.py --task classification --classification resnet50 --dataset caltech256 --num_class 257
- The --classification argument takes value in 'resnet50', 'resnet18', 'alexnet', 'googlenet', 'vgg';
- The --dataset and --num_class arguments take the values 'caltech256' and 257, or 'caltech101' and 101, respectively.
To obtain pretrained restoration networks:
python train.py --task=restoration --degradation=awgn --restoration=dncnn --level 0 0.5 --batch_size 256
- The --restoration argument takes value in 'dncnn', 'memnet'.
To obtain retrained classification networks on degraded images:
python train.py --task classification --classification resnet50 --degradation awgn --level 0 0.1 0.2 0.3 0.4 0.5
To obtain retrained classification networks on restored images:
python train.py --task classification --classification resnet50 --degradation awgn --level 0 0.1 0.2 0.3 0.4 0.5 --restoration dncnn
To obtain our pretrained fidelity map estimator:
python train.py --task fidelity --degradation awgn --restoration dncnn --level 0 0.5 --fidelity_input degraded --fidelity_output l1 --batch_size 256 --num_epochs 60
- The --fidelity_input argument takes value in 'degraded', 'restored';
- The --fidelity_output argument takes value in 'l1', 'l2', 'cos'.
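One plausible reading of the --fidelity_output options, offered purely as an assumption, is a per-pixel comparison between the estimator's input (the degraded or restored image) and the clean image: absolute difference for 'l1', squared difference for 'l2', and cosine similarity along the channel axis for 'cos'. The function name and the H x W x C array layout below are hypothetical; the paper defines the actual targets.

```python
import numpy as np


def oracle_fidelity_map(image, clean, kind="l1", eps=1e-8):
    """Per-pixel comparison of `image` against `clean` (both H x W x C)."""
    if kind == "l1":
        return np.abs(image - clean)                  # H x W x C
    if kind == "l2":
        return (image - clean) ** 2                   # H x W x C
    if kind == "cos":
        # Cosine similarity between the channel vectors at each pixel.
        dot = np.sum(image * clean, axis=-1)
        norms = np.linalg.norm(image, axis=-1) * np.linalg.norm(clean, axis=-1)
        return dot / (norms + eps)                    # H x W
    raise ValueError(f"unknown fidelity output: {kind!r}")


clean = np.full((4, 4, 3), 0.5)     # hypothetical clean image
degraded = clean + 0.1              # hypothetical uniformly shifted input
l1_map = oracle_fidelity_map(degraded, clean, "l1")
l2_map = oracle_fidelity_map(degraded, clean, "l2")
cos_map = oracle_fidelity_map(degraded, clean, "cos")
```

Under this reading, such an oracle map needs the clean image and is therefore only available as a training target; at test time the estimator has to predict it from the degraded or restored input alone.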

Proposed Model

To train the proposed model:
python train.py --task model --mode oracle --classification resnet50 --degradation awgn --restoration dncnn --level 0 0.1 0.2 0.3 0.4 0.5 --fidelity_input degraded --fidelity_output l1 --num_epochs 60 --dataset caltech256 --num_class 257
- The --mode argument takes value in 'endtoend-pretrain', 'pretrain', 'oracle'
To test the proposed model:
python test.py --task model --mode oracle --classification resnet50 --degradation awgn --level 0.1 --restoration dncnn --fidelity_input degraded --fidelity_output l1 --is_ensemble True
- The --is_ensemble argument takes value in 'True', 'False'
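Since the test command takes a single degradation level, evaluating several levels means repeating it. The helper below, a hypothetical convenience that is not part of the repository, builds the test command shown above for each level and prints it; the list of levels is an assumption taken from the training command.

```python
LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5]


def build_test_command(level, mode="oracle", is_ensemble=True):
    """Return the test.py invocation for one degradation level as an argv list."""
    return [
        "python", "test.py",
        "--task", "model",
        "--mode", mode,
        "--classification", "resnet50",
        "--degradation", "awgn",
        "--level", str(level),
        "--restoration", "dncnn",
        "--fidelity_input", "degraded",
        "--fidelity_output", "l1",
        "--is_ensemble", str(is_ensemble),
    ]


for level in LEVELS:
    # Print instead of executing; pass the list to subprocess.run(...) to run it.
    print(" ".join(build_test_command(level)))
```

Keeping the command as a list of arguments avoids shell quoting issues if it is later handed to `subprocess.run`.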

Baseline Methods and Ablation Study

We provide two baseline methods for a comprehensive analysis. To train and test the baseline methods:
- WaveCNet
  - train: python train.py --task wavecnet --classification resnet50
  - test: python test.py --task wavecnet --classification resnet50 --degradation awgn --level 0.1
- DeepCorrect
  - train: python train.py --task deepcorrect --classification resnet50 --degradation awgn --level 0 0.1 0.2 0.3 0.4 0.5 --num_epochs 60
  - test: python test.py --task deepcorrect --classification resnet50 --degradation awgn --level 0.1
We also provide some in-depth analysis and ablation study models:
- To try different fidelity map inputs and outputs, you can use the --fidelity_input and --fidelity_output arguments;
- To try different downsampling methods, you can use the --downsample argument which takes value in 'bicubic', 'bilinear', 'nearest';
- For the ablation study, you can use the --ablation argument which takes value in 'spatialmultiplication', 'residualmechanism', 'spatialaddition', 'channelmultiplication', 'channelconcatenation';
- Note: For more details on the ablation study models, please refer to our paper.
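To make the option names above more concrete, the sketch below shows one way a fidelity map could be fused into a feature map by spatial multiplication with a residual connection: the map is downsampled to the feature resolution (nearest-neighbor here) and used to rescale every feature channel. This is an illustrative assumption, not the repository's implementation; the function names, the single-channel map, and the `features + features * map` form are hypothetical.

```python
import numpy as np


def downsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbor resize of an H x W map to out_h x out_w."""
    h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[np.ix_(rows, cols)]


def fuse_spatial_multiplication(features, fidelity, residual=True):
    """Rescale C x H x W features by a single-channel spatial fidelity map."""
    _, h, w = features.shape
    weights = downsample_nearest(fidelity, h, w)[None, :, :]   # 1 x H x W
    fused = features * weights          # broadcast over the channel axis
    # Assumed residual form: keep the original features and add the rescaled ones.
    return features + fused if residual else fused


features = np.ones((8, 4, 4))           # hypothetical C x H x W feature map
fidelity = np.full((16, 16), 0.5)       # hypothetical image-resolution map
fused = fuse_spatial_multiplication(features, fidelity)
```

With the residual term, a map of zeros leaves the pretrained features unchanged, which is one reason such a connection is a natural choice when the classifier itself is not retrained.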

Results

Aside from the results in our main paper and supplementary material, we also illustrate the performance of the proposed method on other classification (e.g. AlexNet in the figure below on the left) and restoration networks (e.g. MemNet in the figure below on the right). The performance of the proposed method on other networks parallels that on ResNet-50 and DnCNN in our paper. This shows that the proposed method is model-agnostic and can be used on other networks.

The above figure on the left: Classification results with the AlexNet classification network and the DnCNN restoration network, on the Caltech-256 dataset, for various setups. The solid lines indicate testing directly on noisy images. The dashed lines indicate testing with the DnCNN restoration preprocessing step.

The above figure on the right: Classification results with the ResNet-50 classification network and the MemNet restoration network, on the Caltech-256 dataset, for various setups. The solid lines indicate testing directly on noisy images. The dashed lines indicate testing with the MemNet restoration preprocessing step.

Extended Experimental Results (CUB-200-2011)

The CUB-200-2011 dataset is an image dataset of 200 bird species. There are 5994 training images and 5794 test images. We randomly chose 20 percent of the training set for validation. The results are given in the table below.

Uniform degradation (sigma)
Methods	Experimental results	0.1	0.2	0.3	0.4	0.5
Pretrained	Test on noisy	34.89	08.11	02.02	00.89	00.70
Pretrained	Test on restored	56.77	42.37	30.97	23.15	16.91
Retrain on noisy	Test on noisy	59.91	55.86	51.41	46.94	42.09
Retrain on noisy	Test on restored	58.37	52.56	44.97	37.76	31.33
Retrain on restored	Test on noisy	52.53	24.46	07.18	01.86	00.85
Retrain on restored	Test on restored	63.34	59.51	54.76	49.83	44.63
FG-NIC (Pretrained)	Single	63.75	56.98	48.87	40.55	32.82
FG-NIC (Pretrained)	Ensemble	64.95	57.37	48.74	40.33	32.38
FG-NIC (Oracle)	Single	65.10	60.21	55.26	50.77	46.10
FG-NIC (Oracle)	Ensemble	65.75	60.95	55.75	51.15	46.32
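The margins implied by the table can be computed directly. The short script below copies four rows from the table and reports the gain of FG-NIC (Pretrained, Single) over the pretrained classifier tested on restored images, and of FG-NIC (Oracle, Single) over the classifier retrained and tested on restored images. It assumes the entries are classification accuracies in percent; the variable names are chosen for this example only.

```python
SIGMAS = [0.1, 0.2, 0.3, 0.4, 0.5]

# Rows copied from the table above.
pretrained_on_restored = [56.77, 42.37, 30.97, 23.15, 16.91]
fgnic_pretrained_single = [63.75, 56.98, 48.87, 40.55, 32.82]
retrain_restored_on_restored = [63.34, 59.51, 54.76, 49.83, 44.63]
fgnic_oracle_single = [65.10, 60.21, 55.26, 50.77, 46.10]


def margins(ours, baseline):
    """Element-wise difference, rounded to two decimals like the table."""
    return [round(a - b, 2) for a, b in zip(ours, baseline)]


gain_over_pretrained = margins(fgnic_pretrained_single, pretrained_on_restored)
gain_over_retrained = margins(fgnic_oracle_single, retrain_restored_on_restored)

for sigma, g1, g2 in zip(SIGMAS, gain_over_pretrained, gain_over_retrained):
    print(f"sigma={sigma}: {g1:+.2f} vs pretrained, {g2:+.2f} vs retrained")
```

Both differences are positive at every level in this table, and the gain over the pretrained baseline is smallest at sigma = 0.1.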

Citation

@article{lin2021fidelity,
    title={Fidelity Estimation Improves Noisy-Image Classification with Pretrained Networks}, 
    author={Xiaoyu Lin and Deblina Bhattacharjee and Majed El Helou and Sabine Süsstrunk},
    journal={IEEE Signal Processing Letters},
    year={2021},
    publisher={IEEE}
}

pixel-lt/FG-NIC