X-ray image classification (Pneumonia detection in pediatric patients) with TensorFlow

Problem description

Pneumonia is an infection that inflames the air sacs in one or both lungs, causing cough with phlegm or pus, fever, chills, and difficulty breathing. A variety of organisms, including bacteria, viruses and fungi, can cause pneumonia.

Key facts provided by the WHO

Pneumonia accounts for 14% of all deaths of children under 5 years old, killing 740,180 children in 2019.
Pneumonia can be caused by viruses, bacteria or fungi.
Pneumonia can be prevented by immunization, adequate nutrition, and by addressing environmental factors.
Pneumonia caused by bacteria can be treated with antibiotics, but only one third of children with pneumonia receive the antibiotics they need.

Accurately detecting pneumonia in pediatric patients is a matter of life and death, and acting fast can increase the chances of survival.

In this project, we use deep-learning techniques with TensorFlow to build a classification model that detects pneumonia in pediatric chest X-ray images.

Warning: A common mistake is to run the model on an adult X-ray image, which leads to wrong results. This is shown in the Demonstration section.

Data

The data were obtained from Kaggle datasets under the name: Chest X-Ray Images (Pneumonia). The folder is about 2 GB, which is too large to upload to GitHub, but it can be downloaded using the Download button in the top right corner, or by following the step-by-step guide provided here to download the data using the Kaggle keys provided in this link.

The following description of the data is taken from the Kaggle dataset page:

The data contains three folders (Train, test, val) containing subfolders for each image category (Pneumonia/Normal). There are 5,863 X-Ray images (JPEG) and 2 categories (Pneumonia/Normal).
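As a quick sanity check after downloading, the folder layout described above can be verified by counting images per split and class. This is a minimal sketch, not part of the repository: the function name count_images is hypothetical, and it assumes the layout <data_dir>/<split>/<class>/<image> with .jpeg or .jpg files.

```python
from collections import Counter
from pathlib import Path

# Assumption: the dataset stores its images with one of these extensions.
IMAGE_SUFFIXES = {".jpeg", ".jpg"}


def count_images(data_dir):
    """Count image files per (split, class) under data_dir.

    Only files located exactly at <data_dir>/<split>/<class>/<file>
    are counted; anything at another depth is ignored.
    """
    counts = Counter()
    for path in Path(data_dir).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        parts = path.relative_to(data_dir).parts
        if len(parts) == 3:
            split, label, _ = parts
            counts[(split, label)] += 1
    return dict(counts)
```

Calling count_images on the unpacked dataset folder returns a dictionary keyed by (split, class), which makes any class imbalance between Pneumonia and Normal visible before training.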

Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients of one to five years old from Guangzhou Women and Children’s Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients’ routine clinical care.

For the analysis of chest x-ray images, all chest radiographs were initially screened for quality control by removing all low quality or unreadable scans. The diagnoses for the images were then graded by two expert physicians before being cleared for training the AI system. In order to account for any grading errors, the evaluation set was also checked by a third expert.

Model

The prediction model outputs the probability that a pediatric patient's X-ray shows signs of bacterial or viral pneumonia. The threshold was set at 0.8 to have more confidence in positive cases. This might not be the ideal value, but it helps to demonstrate some aspects of the prediction.
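The thresholding step can be illustrated with a few lines of Python. This is a sketch under assumptions: the function name and the label strings are hypothetical, and whether the project treats a probability of exactly 0.8 as positive is not stated here, so the inclusive comparison is a guess.

```python
THRESHOLD = 0.8  # value stated in this README


def label_from_probability(prob, threshold=THRESHOLD):
    """Turn a pneumonia probability into a class label.

    Assumption: probabilities at or above the threshold count as positive.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {prob}")
    return "PNEUMONIA" if prob >= threshold else "NORMAL"
```

With this rule a probability of 0.6 is labeled NORMAL at the 0.8 threshold but PNEUMONIA at 0.5. A higher threshold reduces false positives at the cost of missing more true cases, which is why 0.8 may not be the ideal choice for a screening tool.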

Structure of the repository

images: Example images for the classification model (you can use the URL to run the prediction service)

Demonstration_notebook.ipynb: Notebook that runs the image classification model using the prediction service

Dockerfile: For deployment of the model in AWS as a Lambda function.

Notebook.ipynb: Notebook for exploratory data analysis, creating and exporting the model.

Pipfile and Pipfile.lock: Contain the dependencies to run the project.

pneumonia-class.tflite: The model in TensorFlow Lite format

process_data.py: Python script to process a URL with the image and return a prediction

test.py: Python script to test the prediction service using AWS.

How to run

Clone the repo
Download the data from kaggle
Install the dependencies

pipenv install

Activate the virtual environment

pipenv shell

Building the prediction model and service

Run the train.py file to obtain the best model for the training parameters as a .h5 file and convert it to a .tflite file.
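The conversion step could look roughly like the sketch below. It is not the code from train.py: the helper names are hypothetical, and it assumes a TensorFlow 2 installation where the saved .h5 file loads with tf.keras.models.load_model.

```python
from pathlib import Path


def tflite_path_for(h5_path):
    """Return the output path: same location and name, .tflite suffix."""
    return str(Path(h5_path).with_suffix(".tflite"))


def convert_h5_to_tflite(h5_path):
    """Load a Keras .h5 model and write a TensorFlow Lite version of it."""
    # Imported here so the path helper above works without TensorFlow.
    import tensorflow as tf

    model = tf.keras.models.load_model(h5_path)
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()  # serialized model as bytes

    out_path = tflite_path_for(h5_path)
    Path(out_path).write_bytes(tflite_model)
    return out_path
```

The resulting .tflite file is much lighter to load than the full Keras model, which is what makes it practical to serve from a Lambda function.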

To make it easier to run the training, you can use this Kaggle notebook, which replicates the train.py file, so you don't need to download the data.

Run the Docker container:

First build the image:

docker build -t pneumonia-model .

Run the docker image

docker run -it --rm -p 8080:8080 pneumonia-model:latest

Run the prediction service: Open a new command line (make sure the Docker container is running)

python test.py

test.py already contains an X-ray image link to return a prediction.

You can change the link to make a different prediction. Sometimes using an image link directly does not work; in that case, you can take a screenshot of the image, upload it to GitHub, and use that link.
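For readers who want to see what such a test request involves, here is a minimal sketch using only the standard library. It is not the contents of test.py. It assumes the Dockerfile builds on an AWS Lambda base image, whose runtime interface emulator accepts invocations at the path below, and it assumes the service expects a JSON body with a "url" key, which is a hypothetical field name.

```python
import json
import urllib.request

# Invocation path of the AWS Lambda runtime interface emulator, reachable
# on the port published by "docker run -p 8080:8080".
LOCAL_ENDPOINT = "http://localhost:8080/2015-03-31/functions/function/invocations"


def build_request(image_url, endpoint=LOCAL_ENDPOINT):
    """Build a POST request carrying the image link as JSON.

    Assumption: the handler reads the link from a "url" field.
    """
    payload = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(image_url, endpoint=LOCAL_ENDPOINT, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(image_url, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Passing a different endpoint argument lets the same helper target any other URL that serves the model, so only the image link and the endpoint change between runs.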

Deployment

Cloud deployment

AWS

Prerequisite: the AWS CLI, the command-line tool for interacting with AWS, must be installed (I use Windows with WSL, so I installed the CLI using the Linux command).

Elastic Container Registry:

A place to store your container image

Create the repository and open View push commands

Go to security credentials and find the access key to configure your AWS

Run aws configure in your command line and enter the credentials from the step above

run:

Create the repo to store the image

aws ecr create-repository --repository-name pneumonia-class-images

Obtain the URI of the repository:

xxxxxx2.dkr.ecr.us-west-2.amazonaws.com/pneumonia-class-images

Set at the command line

$(aws ecr get-login --no-include-email)

ACCOUNT=xxxxxxx

REGION=us-west-2

REGISTRY=pneumonia-class-images

PREFIX=${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/${REGISTRY}

TAG=pneumonia-class-model-v1-001

REMOTE_URI=${PREFIX}:${TAG}

Push the docker image to AWS

docker tag pneumonia-model:latest ${REMOTE_URI}
docker push ${REMOTE_URI}

Create the lambda function

Browse the image

Deep learning tasks need a longer response timeout and more memory than the Lambda defaults.

Go to Configuration -> General configuration and change the timeout to 30 seconds and the memory to 1024 MB.
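The same two settings can also be applied from code instead of the console. The sketch below uses the boto3 Lambda client and its update_function_configuration call. The helper names are hypothetical, the function name you pass must be the one you chose when creating the Lambda function, and the us-west-2 default simply mirrors the region used earlier in this README.

```python
def lambda_config(function_name, timeout_seconds=30, memory_mb=1024):
    """Build the keyword arguments for update_function_configuration."""
    return {
        "FunctionName": function_name,
        "Timeout": timeout_seconds,  # seconds
        "MemorySize": memory_mb,     # megabytes
    }


def apply_lambda_config(function_name, region="us-west-2"):
    """Apply the timeout and memory settings to an existing function.

    Requires boto3 and AWS credentials configured with "aws configure".
    """
    # Imported here so lambda_config can be used without boto3 installed.
    import boto3

    client = boto3.client("lambda", region_name=region)
    return client.update_function_configuration(**lambda_config(function_name))
```

Keeping the values in one helper makes it easy to raise the timeout or memory later if predictions turn out to be slower than expected.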

Create the method and post

Use API Gateway
Select the POST method
Integration type: lambda
Select the lambda function

Deploy the endpoint

Go to actions and click on deploy

Now we just need to obtain the URL and add predict at the end:

Demonstration

Let's use some examples to demonstrate how the AWS Lambda function service works:

The Demonstration_notebook.ipynb shows how to run the prediction service.

The demonstration images come from independent articles that were not part of the training or testing data, but show patients with characteristics similar to those used in training.

First demonstration

Link to the image

Second demonstration