We are a group of data scientists attempting to predict pneumonia based on X-ray scans. We hope that the charity Stop Pneumonia will adopt our model and deploy it to assist doctors and help reduce misdiagnosis of pneumonia. This will in turn improve treatment and health outcomes.
Pneumonia is a bacterial, viral or fungal infection of the lungs. It is a very serious condition in which the air sacs fill with liquid. The disease accounts for 14% of all deaths of children under 5 years old. It is infectious and has a high mortality rate: it claimed 2.5 million lives in 2019, including 672,000 children under the age of five. That’s one person dying every 13 seconds. These figures underscore a critical need to diagnose patients as early as possible so that intervention and treatment can be effective.
The data was acquired from Mendeley. It contained over 5,000 chest X-ray images of children aged 1-5. The images were screened prior to inclusion, and poor-quality X-rays were removed. The images showed anterior and posterior X-rays of the chest cavity. The data was organized into train and test sets. https://data.mendeley.com/datasets/rscbjbr9sj/3
We used a convolutional neural network (CNN) model to create predictions based on X-rays of patients' lungs. We iterated through several activation functions, including relu, softmax, and tanh, with different numbers of hidden nodes to optimize our model. Our best model used the 'relu' activation function.
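For readers unfamiliar with the three activation functions named above, here is a minimal sketch of each in plain Python. It is only an illustration of the math, not the code from our notebooks, and the sample `scores` are made-up values.

```python
import math

def relu(x):
    """Rectified linear unit: keeps positive values, maps negatives to 0."""
    return max(0.0, x)

def tanh(x):
    """Hyperbolic tangent: squashes any input into the range (-1, 1)."""
    return math.tanh(x)

def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pre-activation values, chosen only for illustration
scores = [-2.0, 0.0, 3.0]

relu_out = [relu(s) for s in scores]
tanh_out = [tanh(s) for s in scores]
softmax_out = softmax(scores)
```

Note that relu and tanh act on each value independently, while softmax normalizes across the whole list.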
We evaluated our models on accuracy and recall. Our models were somewhat prone to overfitting, but the relu model showed the least overfitting while also giving us the highest metrics.
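As a rough sketch of how these two metrics are computed, the snippet below derives accuracy and recall from true and predicted labels. The function name `accuracy_and_recall` and the label lists are hypothetical and for illustration only; they are not results from our models.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall) for binary labels.

    Accuracy is the share of all predictions that are correct.
    Recall is the share of actual positive cases that were caught,
    so a low recall means many false negatives.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall

# Made-up labels: 1 = pneumonia, 0 = normal
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

Recall matters here because a missed pneumonia case (a false negative) is the costliest kind of error for a diagnostic aid.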
Here we display a model in which the gap between training and testing loss is minimized and which makes predictions on unseen data with 87.2% accuracy. While these results are consistent, the data scientists working on this project were limited by their PCs' computing power. Future work would increase the image sizes for greater resolution and therefore more accurate predictions. A larger data set would also help the model achieve greater accuracy. An accuracy rate that is as high as possible is ideal for high-stakes predictive models such as this.
If adopted, our model will help the charity Stop Pneumonia reduce the occurrence of false negatives in pneumonia diagnosis and improve health outcomes. With greater computational power and more data, this model can be further optimized to increase its accuracy. We also recommend increasing funding to charities fighting pneumonia. Finally, efforts should continue to pursue technological solutions to misdiagnosis.
https://github.com/meowkaiser/Pneumonia-Image-Recognition/blob/main/Slides.pdf
├── README.md <- The top-level README for reviewers of this project
├── Pneumonia_Imaging_tanh_models.ipynb <- Tanh was our first model, but not effective
├── Pneumonia_Imaging_softmax_model.ipynb <- Softmax was an improvement
├── RELU Model.ipynb <- RELU was our best model
├── Final_model_validation.ipynb <- Concise summary of the project with all data science steps
├── Slides.pdf <- PDF version of project presentation
├── Data <- Both sourced externally and generated from code, includes exploratory notebooks
└── Images <- Both sourced externally and generated from code