- Background
- Objective
- Tools and Packages
- Data Visualization
- Results
- References
- Challenges and Future Work
Machine learning is a subset of artificial intelligence that uses mathematical and statistical methods to identify patterns in data automatically. Many aspects of clinical practice lend themselves to computational tools that assess disease pathology, identify anomalies, triage critical patients, and perform various other tasks. The scope of this article, however, is limited to supervised learning, both to constrain the discussion with concrete examples and because supervised learning represents the majority of clinical machine learning research.
In the context of supervised machine learning, models are fit to data, thereby learning relationships between input features and output targets. Input data represent digital encodings of, for example, X-rays, lab tests, electrocardiograms, or various other clinical data streams. The output could be a diagnostic label, a region of interest, length of stay, etc. For pedagogical ease, throughout this article, the classification of lung nodules will be used as a reference example.
The inputs to this nodule classifier are computed tomography (CT) images, but other modalities could have been used (e.g., X-ray or ultrasound). Each input image is associated with a two-class binary label (i.e., 0 or 1, indicating the absence or presence of calcified nodules, respectively). There is nothing special about the binary label; in other clinical applications, the label could represent several discrete classes (e.g., different types of lung nodules or disease stages) or be a continuous output as in regression (e.g., length of hospital stay, lab tests with continuous ranges).
Once CT images and associated labels are sourced and validated, a model is trained to learn relations between the image features (e.g., edges, contours) and their binary class (i.e., a positive or negative finding). However, this trained model may also have learned idiosyncratic features specific to the provided image and label pairs, features which do not hold for other data from the same modality (in this case, CT images). This generalization brittleness occurs for many reasons, including equipment with different noise sources (across different manufacturers), out-of-calibration effects, selection bias, population differences, and many others. Building generalizable models is paramount in clinical research: after all, the radiologist who developed the training data and labels can go to another hospital and provide the same expertise, whereas a model that works at one medical center can fail at another. It therefore becomes key to understand the issues that might arise during the model training, validation, and testing processes.
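A standard way to probe for this kind of brittleness is to hold out validation and test sets that the model never sees during training. A minimal sketch with scikit-learn is shown below; the `images` and `labels` arrays are synthetic stand-ins, not the project's actual data.

```python
# Hypothetical sketch: splitting image data into train/validation/test sets
# so generalization can be measured on data the model never saw during fitting.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
images = rng.random((100, 64, 64))   # stand-in for 100 CT slices
labels = rng.integers(0, 2, 100)     # binary nodule labels (0 or 1)

# 70% train, then split the remaining 30% evenly into validation and test,
# stratifying so the class balance is preserved in every split
X_train, X_tmp, y_train, y_tmp = train_test_split(
    images, labels, test_size=0.3, random_state=42, stratify=labels)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=42, stratify=y_tmp)

print(len(X_train), len(X_val), len(X_test))  # 70 15 15
```

Performance reported on `X_test` (never touched during training or tuning) is a far better proxy for how the model will behave at another medical center than training accuracy alone, though it still cannot capture cross-site shifts such as different scanner manufacturers.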
- To build disease classification models using deep neural networks and a Random Forest classifier
- To preprocess images using OpenCV (cv2) and improve model performance
- To integrate the trained models into a web app using Flask
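The Flask integration step can be sketched as follows. The `/predict` route, the commented-out `model.h5` path, and the hard-coded response are illustrative placeholders, not the project's actual endpoints or files.

```python
# Minimal sketch of serving a trained classifier behind a Flask endpoint.
from flask import Flask, jsonify, request

app = Flask(__name__)

# In the real app a trained model would be loaded once at startup, e.g.:
# model = tensorflow.keras.models.load_model("model.h5")

@app.route("/predict", methods=["POST"])
def predict():
    # An uploaded image would be read from `request`, preprocessed, and
    # passed to model.predict(...); a fixed response stands in here.
    return jsonify({"disease": "pneumonia", "probability": 0.83})

if __name__ == "__main__":
    app.run(debug=True)
```

Loading the model once at startup (rather than per request) keeps response latency low, since deserializing a saved network is far slower than a single forward pass.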
Task | Technique | Tools/Packages Used |
---|---|---|
Data Pre-processing and EDA | Image normalization, noise removal, COVID data set creation | cv2, shutil, sklearn, pandas, numpy |
Model Development | Feature selection, model selection, model construction, optimization, neural network tuning, performance evaluation | TensorFlow, xgboost, sklearn |
Data Visualization | Multi-attribute plots, heatmaps, correlation plots | matplotlib, seaborn |
Environments & Platforms | — | MS Excel, Jupyter Notebook, TensorFlow, PyCharm |
Output
Disease | Classifier Type | Accuracy |
---|---|---|
Pneumonia | CNN | 83.17% |
Heart Disease | XGBoost | 86.96% |
Diabetes | Random Forest | 89.8% |
Alzheimer's | CNN | 83.54% |
Breast Cancer | Random Forest | 91.81% |
Brain Tumor | CNN, VGG16 | 96.5% |
COVID-19 | CNN | 93.5% |
Created seven disease classification models with TensorFlow, Random Forest, and XGBoost to analyse patients' medical records, achieving accuracies between 83% and 96.5%. Improved the accuracy of the deep neural networks by 30% with image data augmentation and transfer learning.
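The transfer-learning setup credited with those gains (a VGG16 base plus augmentation, as in the brain-tumor model) might look like the sketch below. The input size, layer widths, and `weights=None` (used here to avoid downloading ImageNet weights; `weights="imagenet"` would be used in practice) are assumptions.

```python
# Sketch of transfer learning with a frozen VGG16 base and data augmentation.
import tensorflow as tf

base = tf.keras.applications.VGG16(
    include_top=False, weights=None,  # weights="imagenet" in practice
    input_shape=(128, 128, 3))
base.trainable = False  # freeze the pretrained convolutional features

model = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # augmentation (train-time only)
    tf.keras.layers.RandomRotation(0.1),
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary disease label
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Freezing the base means only the small classification head is trained, which is what makes transfer learning effective on the modest data sets typical of medical imaging; augmentation layers are active only during training and pass inputs through unchanged at inference.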
Challenges: identifying a suitable package for tweet scraping and recognizing its extraction limits; long execution times and runtime errors caused by memory limitations during parts of the data modeling. Medical data are also difficult to come by; if more such databases were made public, researchers would have access to additional information for future work.