Chest X-Ray (Pneumonia) Binary Classification:
- CNN with Transfer Learning
(TensorFlow 2.0 & Keras)

1. | Introduction

  • Problem Overview
    • The goal of this notebook is to determine which samples are from patients with pneumonia.
    • The objective is to train a convolutional neural network (CNN) that classifies chest X-ray images as NORMAL or PNEUMONIA.
    • It is therefore a binary classification problem.
  • Dataset Description
    • The chest X-ray images are taken from the Kaggle dataset Chest X-ray Images.
    • The dataset provides a train folder and a test folder, each containing a NORMAL folder and a PNEUMONIA folder.
    • Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients of one to five years old from Guangzhou Women and Children's Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients' routine clinical care.
    • For the analysis of chest X-ray images, all chest radiographs were initially screened for quality control by removing all low-quality or unreadable scans.
    • The diagnoses for the images were then graded by two expert physicians before being cleared for training the AI system. In order to account for any grading errors, the evaluation set was also checked by a third expert.
  • Analysis Introduction
    • We rely on ImageNet, a dataset of more than 14 million images that have been hand-annotated to indicate what objects are pictured. Bounding boxes are also provided for at least one million of the images. ImageNet contains more than 20,000 categories, a typical category consisting of several hundred images, and pretrained models are usually trained on its 1000-class subset. This means that we can pick any CNN trained on ImageNet to get a warm start when training our own model.
    • ResNet50 is a somewhat old but still very popular CNN. Its popularity comes from the fact that the ResNet family introduced the residual concept in deep learning and won the ILSVRC 2015 image classification contest. Since it is a well-known and very solid architecture, we decided to use its ResNet50V2 variant for our transfer learning task.
    • As the original ResNet50V2 was trained on ImageNet, its last layer outputs 1000 probabilities, one for each of the 1000 ImageNet classes. Therefore, we cannot use it directly in our binary classification problem, which has only NORMAL and PNEUMONIA as classes.
    • Here we try two transfer learning approaches and compare their performance:
      • Feature extraction: freezing the pretrained ResNet50V2 weights so that the network serves as a fixed base, then adding new layers on top and training only those, without changing anything in the convolutional section of the network.
      • Fine-tuning: unfreezing the last layers of the pretrained model, adding new layers on top, and training them together so that part of the convolutional section is updated as well.
    • In the first approach, the convolutional section becomes just an image feature extractor, and the actual job of classifying the features is performed by the newly added fully connected layers.
  • Methods
    • Load a ResNet50V2 model trained on the ImageNet dataset.
    • Preprocess images with the Keras ImageDataGenerator, which
      • can rescale the pixel values
      • can apply random transformations for data augmentation on the fly.
      • We define two different generators:
      • The train_datagen includes some transformations to augment the training set.
      • The val_datagen simply rescales the validation and test sets.
      • We apply those generators to each dataset using the flow_from_dataframe method.
      • Apart from the transformations defined in each generator, the images are also resized to the target_size that is set.
    • During training, we applied the following techniques:
      • 1st approach: Transfer Learning ResNet50V2 with all Frozen Fully Connected Layers
        • using an image size of (224, 224, 3), BATCH_SIZE=128 and EPOCH=50.
        • adding a global average pooling 2D layer to reduce the spatial dimensions to 1x1 while retaining the depth.
        • adding 10% dropout on the input activations
        • utilizing the Adam optimizer
        • monitoring validation loss using EarlyStopping with patience=5
        • monitoring validation loss using ReduceLROnPlateau with patience=2
      • 2nd approach: Transfer Learning ResNet50V2 with Fine-Tuning Selected Fully Connected Layers
        • using an image size of (224, 224, 3) and EPOCH=50.
        • reducing the BATCH_SIZE from 128 to 32.
        • adding batch normalization layers, which standardize the inputs to the following layer by adjusting and scaling the activations and help reduce the problem of internal covariate shift.
        • adding a global average pooling 2D layer to reduce the spatial dimensions to 1x1 while retaining the depth.
        • adding 10% dropout on the input activations
        • applying a learning rate scheduler to the Adam optimizer
        • monitoring validation loss using EarlyStopping, with patience increased from 5 to 15.
        • monitoring validation loss using ReduceLROnPlateau, with patience increased from 2 to 5.
    • Lastly, we can use the trained ResNet50V2 to predict the class of a preprocessed image.
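
The two approaches listed under Methods can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the helper names `build_model` and `make_callbacks`, the single sigmoid output unit, the learning rate, and the number of unfrozen layers are all assumptions made for illustration. Only the image size, the global average pooling layer, the 10% dropout, the Adam optimizer, and the patience values come from the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

IMG_SHAPE = (224, 224, 3)  # image size stated in the text


def build_model(weights="imagenet", unfreeze_last=0, use_batchnorm=False):
    """Build a ResNet50V2-based binary classifier (hypothetical helper).

    unfreeze_last=0 keeps the whole pretrained base frozen (1st approach);
    unfreeze_last=N makes only the last N base layers trainable (2nd approach).
    N, the head layout and the learning rate are illustrative assumptions.
    """
    # include_top=False drops the original 1000-class ImageNet classifier.
    base = ResNet50V2(include_top=False, weights=weights, input_shape=IMG_SHAPE)
    base.trainable = unfreeze_last > 0
    if unfreeze_last > 0:
        # Freeze everything except the last `unfreeze_last` layers.
        for layer in base.layers[:-unfreeze_last]:
            layer.trainable = False

    inputs = layers.Input(shape=IMG_SHAPE)
    # training=False keeps the base's BatchNormalization layers in inference mode.
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)  # (7, 7, 2048) -> (2048,)
    if use_batchnorm:
        x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.1)(x)  # 10% dropout, as in the text
    outputs = layers.Dense(1, activation="sigmoid")(x)  # assumed output layer

    model = models.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # assumed rate
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


def make_callbacks(early_patience, plateau_patience):
    """Callbacks that monitor validation loss with the given patience values."""
    return [
        EarlyStopping(monitor="val_loss", patience=early_patience),
        ReduceLROnPlateau(monitor="val_loss", patience=plateau_patience),
    ]
```

Under these assumptions, `build_model()` with `make_callbacks(5, 2)` would correspond to the 1st approach, and `build_model(unfreeze_last=10, use_batchnorm=True)` with `make_callbacks(15, 5)` to the 2nd. The value 10 is a placeholder, since the text does not say how many layers were unfrozen.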

2. | Accuracy of Best Model

  • Transfer Learning ResNet50V2 with all Frozen Fully Connected Layers
    • Training Accuracy achieved: 93.95%
    • Validation Accuracy achieved: 95.12%
    • Test F1 Score: 94.0%
  • Transfer Learning ResNet50V2 with Fine-Tuning Selected Fully Connected Layers
    • Training Accuracy achieved: 97.94%
    • Validation Accuracy achieved: 96.65%
    • Test F1 Score: 97.6%
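
The F1 scores above combine precision and recall into a single number. As a reminder of how the metric is computed, the sketch below derives it from confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical; they are not results from this notebook.

```python
def binary_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: every positive case is found, with 10 false alarms.
precision, recall, f1 = binary_metrics(tp=90, fp=10, fn=0)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays high only when precision and recall are both high, which makes it more informative than accuracy on an unbalanced dataset.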

3. | Conclusion

  • In this study:
  • With this transferred ResNet50V2, we can perform tests using any image with 224x224 resolution.
  • The fine-tuning approach reached the best score.
  • The models trained with the 1st and the 2nd approach both generalized quite well, as the differences between their train loss and validation loss are small.
  • The 2nd approach clearly outperforms the 1st, with an F1 score of 97.6% versus 94.0%.

    This further suggests that, although the ResNet50V2 transfer learning weights provide a higher start, a greater slope and a higher asymptote, fine-tuning some layers is still needed to adapt to domain information not seen by models trained on 'imagenet' and so reach better performance.

  • The recall was close to 100%.
  • Even without expertise in the medical field, it's reasonable to assume that false negatives are more 'costly' than false positives in this case.
  • Reaching such recall with a relatively small training dataset like this one is a good indication of the model's capabilities, confirmed by the high ROC-AUC value and further confirmed by the high area under the precision-recall curve for this small and unbalanced dataset.
  • The model made correct predictions on some sample test images.
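
The ROC-AUC value mentioned above can be read as the probability that a randomly chosen PNEUMONIA image receives a higher predicted score than a randomly chosen NORMAL image. The sketch below computes it directly from that definition. The function name `roc_auc`, the labels and the scores are hypothetical illustrations, not outputs of the trained model.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = PNEUMONIA, 0 = NORMAL) and predicted probabilities.
labels = [0, 0, 1, 1, 1]
scores = [0.1, 0.6, 0.4, 0.8, 0.9]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")
```

This pairwise form is quadratic in the number of samples, so it is only meant to make the definition concrete; a library routine would be the practical choice on a full test set.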

5. | Reference