CIFAR-10 image recognition using the ResNet-18 architecture

Let's look into some more advanced concepts.

Learning Time

Grad-CAM: Gradient-weighted Class Activation Mapping

Convolutional Neural Network (CNN)-based models can be made more transparent by visualizing the regions of the input that are "important" for their predictions, i.e. by producing visual explanations. Gradient-weighted Class Activation Mapping (Grad-CAM) uses the class-specific gradient information flowing into the final convolutional layer of a CNN to produce a coarse localization map of the important regions in the image.

image

Grad-CAM uses the gradients of any target concept (say the logit for 'dog', or even a caption) flowing into the final convolutional layer to produce a coarse localization map that highlights the regions of the image most important for predicting that concept. Concretely, we take the final convolutional feature map and weight every channel by the gradient of the class score with respect to that channel, averaged over the spatial locations. This combines how strongly the input image activates each channel with how important that channel is for the class. Grad-CAM requires no re-training and no change to the existing architecture.
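This procedure is easy to sketch in PyTorch. The snippet below is a minimal illustration (not the code used in this repository): it hooks the last convolutional block of the network (layer4 in a ResNet-18), global-average-pools the gradients of the chosen class score to get one weight per channel, and applies a ReLU to the weighted sum of the feature maps before upsampling it to the input size.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, image, target_class, target_layer):
    """Minimal Grad-CAM sketch: `image` is a normalized (1, 3, H, W) tensor,
    `target_layer` the last conv block (e.g. model.layer4 for ResNet-18)."""
    activations, gradients = [], []

    # Capture the forward feature maps and the gradients flowing back into them.
    fwd = target_layer.register_forward_hook(
        lambda module, inp, out: activations.append(out))
    bwd = target_layer.register_full_backward_hook(
        lambda module, grad_in, grad_out: gradients.append(grad_out[0]))

    model.eval()
    logits = model(image)
    model.zero_grad()
    logits[0, target_class].backward()        # backprop only the chosen class score
    fwd.remove()
    bwd.remove()

    fmap = activations[0]                     # (1, C, h, w) feature maps
    grad = gradients[0]                       # (1, C, h, w) gradients of the class score
    weights = grad.mean(dim=(2, 3), keepdim=True)              # per-channel importance
    cam = F.relu((weights * fmap).sum(dim=1, keepdim=True))    # weighted combination
    cam = F.interpolate(cam, size=image.shape[2:],
                        mode="bilinear", align_corners=False)  # upsample to input size
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalize to [0, 1]
    return cam[0, 0]                          # (H, W) heatmap to overlay on the image
```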

Objective

  • Train for 40 epochs
  • Display 20 misclassified images
  • Display Grad-CAM outputs for the SAME 20 misclassified images
  • Apply the following transforms while training (a sketch follows this list):
    • RandomCrop(32, padding=4)
    • CutOut(16x16)
    • Rotate(±5°)
  • Must use ReduceLROnPlateau
  • Must use LayerNormalization ONLY
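
A rough Albumentations pipeline for these transforms could look like the sketch below (assuming Albumentations 1.x argument names; the exact padding mode, fill value, and normalization statistics used in the utils code may differ):

```python
import cv2
import albumentations as A
from albumentations.pytorch import ToTensorV2

# Commonly used CIFAR-10 channel statistics (assumed; check utils for the exact values).
MEAN, STD = (0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)

train_transforms = A.Compose([
    # RandomCrop(32, padding=4): pad to 40x40, then crop back to 32x32.
    A.PadIfNeeded(min_height=40, min_width=40, border_mode=cv2.BORDER_REFLECT),
    A.RandomCrop(32, 32),
    A.Rotate(limit=5),                                   # rotate within ±5 degrees
    # CutOut(16x16) via CoarseDropout: one 16x16 hole filled with the dataset mean.
    A.CoarseDropout(max_holes=1, max_height=16, max_width=16,
                    min_holes=1, min_height=16, min_width=16,
                    fill_value=[m * 255 for m in MEAN]),
    A.Normalize(mean=MEAN, std=STD),
    ToTensorV2(),
])
```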

Results

  1. Model: ResNet18
  2. Total Train Data: 50,000 | Total Test Data: 10,000
  3. Total Parameters: 11,173,962
  4. Test Accuracy: 90.03%
  5. Epochs: 40
  6. Normalization: Layer Normalization
  7. Regularization: L2 with factor 0.0001
  8. Optimizer: Adam with learning rate 0.001
  9. Loss criterion: Cross Entropy
  10. Scheduler: ReduceLROnPlateau (see the training-loop sketch after this list)
  11. Albumentations:
    1. RandomCrop(32, padding=4)
    2. CutOut(16x16)
    3. Rotate(±5°)
    4. CoarseDropout
    5. Normalization
  12. Misclassified Images: 1104 images were misclassified out of 10,000
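
Putting the regularization, optimizer, loss, and scheduler settings above together, the training setup can be sketched roughly as follows. This is illustrative only: `train` and `test` stand in for the functions defined in main.py, the model, device, and data loaders are assumed to be in scope, and the ReduceLROnPlateau factor/patience values are assumptions rather than the repository's exact settings.

```python
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
# Adam with lr=0.001; weight_decay applies the L2 factor of 0.0001.
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# Reduce the learning rate when the monitored test loss stops improving.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min",
                                                 factor=0.1, patience=2)

for epoch in range(1, 41):                 # run till 40 epochs
    train(model, device, train_loader, optimizer, criterion)
    test_loss = test(model, device, test_loader, criterion)
    scheduler.step(test_loss)              # step the plateau scheduler on test loss
```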

Code Structure

  • resnet.py: Describes the ResNet-18 architecture with Layer Normalization (a normalization-swap sketch follows this list)
    Reference: https://github.com/kuangliu/pytorch-cifar/blob/master/models/resnet.py

  • utils: The utils code contains the following components:

    1. Data Loaders
    2. Albumentations
    3. Accuracy Plots
    4. Misclassification Image Plots
    5. Seed
  • main.py: The main code contains the following functions:

    1. Train code
    2. Test code
    3. Main function for training and testing the model
  • Colab file: The Google Colab notebook contains the following steps:

    1. Cloning the Git repository
    2. Loading the data by calling the data-loader function from the utils file
    3. Model Summary
    4. Running the model by calling the main file
    5. Plotting accuracy plots
    6. Plotting 20 misclassified images (a helper for collecting them is sketched below)
    7. Plotting the Grad-CAM output for the same 20 misclassified images
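
As noted under resnet.py, the architecture follows the kuangliu ResNet-18 layout, with the required layer normalization in place of batch normalization. A common way to get layer-norm behaviour on convolutional feature maps is `nn.GroupNorm` with a single group, which normalizes each sample over (C, H, W) and is independent of the batch size. The BasicBlock below is a minimal sketch of that swap, not the exact code in resnet.py:

```python
import torch.nn as nn
import torch.nn.functional as F

def layer_norm(channels):
    # GroupNorm with a single group normalizes each sample over (C, H, W),
    # giving layer-norm behaviour for convolutional feature maps.
    return nn.GroupNorm(1, channels)

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.norm1 = layer_norm(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.norm2 = layer_norm(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion * planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1,
                          stride=stride, bias=False),
                layer_norm(self.expansion * planes),
            )

    def forward(self, x):
        out = F.relu(self.norm1(self.conv1(x)))
        out = self.norm2(self.conv2(out))
        out = out + self.shortcut(x)          # residual connection
        return F.relu(out)
```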
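For the misclassified-image and Grad-CAM plots, a small helper along the lines of the hypothetical `collect_misclassified` below can gather the first N test images the model gets wrong (the name and signature are illustrative, not the actual utils API). The same images can then be passed one at a time to the grad_cam sketch above to produce the overlays.

```python
import torch

def collect_misclassified(model, loader, device, limit=20):
    """Return up to `limit` (image, predicted_label, true_label) triples."""
    model.eval()
    wrong = []
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            for img, pred, label in zip(images, preds, labels):
                if pred != label:
                    wrong.append((img.cpu(), pred.item(), label.item()))
                    if len(wrong) >= limit:
                        return wrong
    return wrong
```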

Model Summary

image

Plots

  1. Train & Test Loss, Train & Test Accuracy
    image

  2. Misclassified Images
    image

  3. Grad-CAM Images
    image

Collaborators

Abhiram Gurijala
Arijit Ganguly
Rohin Sequeira