/Portfolio_Project_54

Data science model deployment project aimed at developing classification models for estimating lung cancer probabilities and serving a web application interface using Python and Streamlit.

Primary Language: Jupyter Notebook

This project develops a web application that makes a classification model accessible and efficient to use: given various clinical symptoms and behavioral indicators for a test case, it computes the risk index, estimates the lung cancer probability, and predicts the risk category.

To enable plotting of the logistic probability curve, model development used Logistic Regression either as an independent learner or as the meta-learner of a stacking ensemble whose base learners were Decision Tree, Random Forest, and Support Vector Machine classifiers. The process evaluated hyperparameter combinations (using K-Fold Cross Validation), addressed class imbalance (using Class Weights, upsampling with the Synthetic Minority Oversampling Technique (SMOTE), and downsampling with Condensed Nearest Neighbors (CNN)), constrained model coefficient updates (using Least Absolute Shrinkage and Selection Operator (LASSO) and Ridge regularization), and checked predictive accuracy on new, unseen data (using performance evaluation on independent validation and test sets).

Creating the prototype required cloning the repository and uploading it to Streamlit Community Cloud. The repository contains two application codes: a Model Prediction Code that computes risk indices, estimates lung cancer probabilities, and predicts risk categories; and a User Interface Code that processes the study population data as the baseline, gathers the user input as the test case, renders all user selections in the visualization charts, executes all computations, estimations, and predictions, marks the test case prediction on the logistic curve plot, and displays the prediction results summary. The final lung cancer prediction model was deployed as a Streamlit web application.
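The relationship between the risk index, the estimated probability, and the risk category in a logistic regression model can be sketched as below. This is a minimal illustration, not the project's code: the feature names, the coefficient values, the intercept, the 0.5 threshold, and the category labels are all hypothetical placeholders chosen for the example.

```python
import math

# Hypothetical values for illustration only; these are NOT the project's
# fitted coefficients, features, or classification threshold.
INTERCEPT = -2.0
COEFFICIENTS = {
    "smoking": 1.2,
    "coughing": 0.9,
    "wheezing": 0.7,
}
THRESHOLD = 0.5  # assumed probability cutoff between the two risk categories


def risk_index(features):
    """Linear predictor (log-odds): intercept plus weighted sum of 0/1 indicators."""
    return INTERCEPT + sum(
        weight * features.get(name, 0) for name, weight in COEFFICIENTS.items()
    )


def probability(index):
    """Logistic (sigmoid) function mapping a risk index to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-index))


def risk_category(prob, threshold=THRESHOLD):
    """Assign a category label by comparing the probability to the threshold."""
    return "High-Risk" if prob >= threshold else "Low-Risk"


def predict(features):
    """Return (risk index, probability, risk category) for one test case."""
    index = risk_index(features)
    prob = probability(index)
    return index, prob, risk_category(prob)


if __name__ == "__main__":
    test_case = {"smoking": 1, "coughing": 1, "wheezing": 1}
    index, prob, category = predict(test_case)
    print(f"risk index={index:.3f}, probability={prob:.3f}, category={category}")
```

Plotting `probability` over a range of risk index values gives the S-shaped logistic curve, and a single test case appears as one point on that curve at its own risk index.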