Diabetes_Prediction

Predicts whether a patient is diabetic by training classification models with both logistic regression and an artificial neural network.

Primary language: Jupyter Notebook. License: MIT.

Diabetes_Prediction_using_LogisticRegression_and_NeuralNetworks

This project builds a classification model that predicts whether a patient is diabetic. A logistic regression model is first trained on the Pima Indians Diabetes Database, and an artificial neural network (ANN) is then built on the same dataset to classify the patients. The two models differ little in test accuracy: the ANN reaches 79%, only one percentage point above the 78% obtained with logistic regression.

Overall, the logistic regression model, trained with gradient descent and a carefully chosen data preprocessing strategy, reaches a total accuracy of 78%. The validation dataset helped in assessing the classification threshold accurately, and mean imputation gave more significance to more than 50% of the features in the dataset, which in turn helped in building an efficient model. The neural network model built afterwards produces a total test accuracy of 79%, slightly higher than that of the logistic regression model. This is mainly because of the regularization techniques and optimizer used, along with flexible hyper-parameter tuning, while building the neural network model.
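
The logistic regression steps described above (mean imputation, training with gradient descent, and choosing the decision threshold on a validation set) can be sketched in NumPy as follows. This is a minimal illustration, not the notebook's actual code: the function names, the use of NaN to mark missing values, the learning rate, the iteration count, the threshold grid, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def impute_with_means(X, means):
    """Replace NaN entries with the given per-column means (assumed missing-value marker: NaN)."""
    X = np.array(X, dtype=float)  # np.array copies, so the caller's data is untouched
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = means[cols]
    return X


def sigmoid(z):
    # Clip to avoid overflow in exp for extreme inputs.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def fit_logistic_gd(X, y, lr=0.1, n_iters=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / n  # gradient with respect to the weights
        b -= lr * np.mean(p - y)       # gradient with respect to the bias
    return w, b


def best_threshold(probs, y, candidates=None):
    """Return the candidate threshold with the highest accuracy on (probs, y)."""
    if candidates is None:
        candidates = np.linspace(0.1, 0.9, 17)  # hypothetical grid, step 0.05
    accs = [np.mean((probs >= t) == y) for t in candidates]
    return float(candidates[int(np.argmax(accs))])


# Synthetic stand-in data (NOT the Pima Indians Diabetes Database).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(float)
X[rng.random(X.shape) < 0.1] = np.nan  # knock out roughly 10% of the values

X_train, y_train = X[:300], y[:300]
X_val, y_val = X[300:], y[300:]

# Column means come from the training split only and are reused for validation.
means = np.nanmean(X_train, axis=0)
X_train = impute_with_means(X_train, means)
X_val = impute_with_means(X_val, means)

w, b = fit_logistic_gd(X_train, y_train)
val_probs = sigmoid(X_val @ w + b)
threshold = best_threshold(val_probs, y_val)
val_accuracy = np.mean((val_probs >= threshold) == y_val)
print(f"threshold={threshold:.2f}, validation accuracy={val_accuracy:.2f}")
```

Computing the imputation means on the training split alone keeps information from the validation data out of the fitted model. Because the threshold is tuned on the validation set, the final accuracy should be reported on a separate test set.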