DeepDetection

Pneumonia Classification via Convolutional Neural Network

Pneumonia detection using a variety of Convolutional Neural Networks. Bayesian Classifiers will also be tested.

Overview

This repository contains various neural network architectures used to train pneumonia classifiers on chest X-ray images.

Code Flow

- configs.py # Centralizes all hyperparameters for models and augmentations

- DenseNet.py # Various DenseNet Model definitions
- ResNet.py # Various ResNet Model definitions
- VGG.py # Various VGG Model definitions
- VanillaCNN.py # Simple Convolutional Neural Network

- DataAug.py # Builds the train/test data generators with augmentation
- keras_tuner.py # Bayesian optimization of the optimizer and loss function (see the sketch after this list)
- Visualization.py # Various visualization functions
- EDA2.py # Exploratory inspection of the dataset
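keras_tuner.py performs the Bayesian search over training choices such as the optimizer and loss. As a rough sketch of what that kind of search can look like with the kerastuner package (the model builder, search space, and trial count here are illustrative assumptions, not the repo's actual code):

```python
import tensorflow as tf
from kerastuner.tuners import BayesianOptimization

def build_model(hp):
    # Hypothetical model builder: only the optimizer and loss are tuned here.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(224, 224, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(
        optimizer=hp.Choice('optimizer', ['adam', 'rmsprop', 'sgd']),
        loss=hp.Choice('loss', ['binary_crossentropy', 'hinge']),
        metrics=['accuracy'],
    )
    return model

tuner = BayesianOptimization(
    build_model,
    objective='val_accuracy',
    max_trials=10,            # illustrative; a wider search needs more trials
    directory='tuner_logs',
    project_name='pneumonia',
)
# train_gen / val_gen would come from the DataAug.py generators:
# tuner.search(train_gen, validation_data=val_gen, epochs=5)
```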

Data

The pneumonia dataset comes from Kaggle. It contains 5,863 chest X-ray images in two categories, Normal and Pneumonia, and is roughly 3 GB in size.

The X-ray images are from pediatric patients one to five years old, and all chest X-rays were taken in the anterior-posterior view.

The images were collected at Guangzhou Women and Children’s Medical Center in Guangzhou, China.

https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

[Example X-ray images: Pneumonia and Normal]

Exploratory Data Analysis

The original dataset has a large class imbalance in both the train and test folders, so extensive data augmentation was needed to get accurate and reliable results. The images also come in varying sizes. Of the four types of pneumonia, the dataset contains only bacterial and viral cases.

EDA2.ipynb

[Plots: training class imbalance and testing class imbalance]
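A quick way to quantify the imbalance is to count the images per class in each split. A minimal sketch, assuming the standard Kaggle folder layout (chest_xray/{train,test}/{NORMAL,PNEUMONIA}):

```python
import os

data_dir = 'chest_xray'  # assumed path to the extracted Kaggle dataset
for split in ('train', 'test'):
    for label in ('NORMAL', 'PNEUMONIA'):
        folder = os.path.join(data_dir, split, label)
        # Count the image files in each class folder to expose the imbalance
        print(f'{split}/{label}: {len(os.listdir(folder))} images')
```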

Data Augmentation

I implemented a cohort of data augmentation strategies and found it best to use all of them. Using the Keras-Tuner library, I ran both a Random Search and a Bayesian Optimization to see which augmentation strategies (if not all of them) yield the highest accuracy. The settings are listed below; a sketch of the corresponding generator configuration follows the run commands.

  • Rescale = 1./255
  • Rotation Range = 45
  • ZCA Whitening
  • Zoom Range = 0.5
  • Horizontal Flip
  • Vertical Flip
  • Width Shift Range = .15
  • Height Shift Range = .15
python3 DataAug.py
python3 Keras_Tuner.py
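A minimal sketch of how the settings above map onto a Keras ImageDataGenerator; the directory paths, image size, and batch size are assumptions and may differ from DataAug.py:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Only the training generator is augmented; the test generator just rescales.
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=45,
    zca_whitening=True,      # requires train_datagen.fit() on a sample array first
    zoom_range=0.5,
    horizontal_flip=True,
    vertical_flip=True,
    width_shift_range=0.15,
    height_shift_range=0.15,
)
test_datagen = ImageDataGenerator(rescale=1. / 255)

# Paths, target size, and batch size are illustrative assumptions.
train_gen = train_datagen.flow_from_directory(
    'chest_xray/train', target_size=(224, 224), class_mode='binary', batch_size=32)
test_gen = test_datagen.flow_from_directory(
    'chest_xray/test', target_size=(224, 224), class_mode='binary', batch_size=32)
```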

Software

  • TensorFlow 2.1
  • Keras-Tuner
  • Matplotlib
  • Scikit-learn
  • SciPy
  • mlxtend

Model

  • DenseNet
python3 DenseNet.py
  • ResNet
python3 ResNet.py
  • VGG
python3 VGG.py
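The repository defines its own DenseNet, ResNet, and VGG variants. Purely as an illustration of the overall wiring (not the repo's actual models), a stock DenseNet121 backbone from tf.keras.applications could be set up for the binary task like this; the input size and classification head are assumptions:

```python
import tensorflow as tf

# Stock DenseNet121 backbone with a single-unit sigmoid head for Normal vs. Pneumonia.
base = tf.keras.applications.DenseNet121(
    include_top=False, weights=None, input_shape=(224, 224, 3))
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
out = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(base.input, out)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# train_gen / test_gen would come from the DataAug.py generators:
# model.fit(train_gen, validation_data=test_gen, epochs=10)
```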

Evaluation

Due to compute constraints, I was not able to train all of the networks or run a wide Bayesian Optimization search. Very little training could be done on my own hardware, so I had to cycle between training locally on my laptop and training on Colab. In the future I plan to address this with a smaller proxy dataset and by doing all preprocessing and data augmentation before training, since performing them on the fly severely hampered training time.
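One way to move that preprocessing cost out of the training loop is a tf.data pipeline that caches decoded, resized images and prefetches batches. This is a sketch of the idea, not what the repo currently does; the paths, image size, and batch size are assumptions:

```python
import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE  # TF 2.1-compatible spelling

def load_image(path, label):
    # Decode and resize once; .cache() below lets later epochs skip this work.
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, (224, 224)) / 255.0
    return img, label

# `paths` and `labels` would be built by listing the dataset folders (assumed).
# dataset = tf.data.Dataset.from_tensor_slices((paths, labels))
# dataset = (dataset.map(load_image, num_parallel_calls=AUTOTUNE)
#            .cache()
#            .shuffle(1000)
#            .batch(32)
#            .prefetch(AUTOTUNE))
```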

Training Logs

python3 Visualization.py

Next Steps

  • Implementing Class Activation Maps to see which regions of the X-ray image the CNN is attending to
  • Developing an ensemble of these CNN models
  • Creating a caching method to store the dataset and speed up training; preprocessing and data augmentation during training rely heavily on the CPU, which takes away from the GPU benefits
  • Implementing a focal loss, which is better suited to imbalanced datasets (a sketch follows this list)
  • Label Smoothing (LabelSmoothingCrossEntropy())
  • Batch Normalization
  • Testing Rectified Adam, which rectifies the variance of the adaptive learning rate during the early steps of training (similar to a warm-up)
  • Cosine-annealed learning rates, which repeatedly decrease and restart the learning rate, helping training jump out of smaller local optima
  • Finding the smallest image size that still retains enough information for classification
  • Adding an attention layer would be very interesting; attention lets the network capture longer-range dependencies and focus on relevant shapes in the image
  • Implementing more data augmentation strategies, which I believe matters more than the CNN architecture or dropout, based on this paper: https://arxiv.org/abs/1806.03852v4
  • Trying the new Progressive Sprinkling augmentation technique, which reportedly outperforms CutMix, Mixup, and RICAP; it partially masks areas of the image, forcing the network to seek out more relevant regions of interest
  • Curating a smaller proxy dataset that is representative of (and more balanced than) the full dataset, for quick hyperparameter testing, based on https://arxiv.org/abs/1906.04887v1
  • Implementing EfficientNet, which uses techniques that I believe could be added to existing architectures (e.g., DenseNet) to improve their ability to detect certain classes of information
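For the focal loss item above, here is a minimal sketch of a binary focal loss in Keras; the gamma and alpha values are common defaults, not values tuned for this dataset:

```python
import tensorflow as tf

def binary_focal_loss(gamma=2.0, alpha=0.25):
    # Focal loss down-weights easy examples so training focuses on hard,
    # often minority-class, images. gamma/alpha here are common defaults.
    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, y_pred.dtype)
        eps = tf.keras.backend.epsilon()
        y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
        # p_t is the predicted probability assigned to the true class
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        return -tf.reduce_mean(alpha_t * tf.pow(1.0 - p_t, gamma) * tf.math.log(p_t))
    return loss

# Drop-in replacement for binary_crossentropy:
# model.compile(optimizer='adam', loss=binary_focal_loss(), metrics=['accuracy'])
```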