
Data science (data preprocessing) and machine learning in R: predicting diabetes in patients with kNN, Naïve Bayes, and Random Forest classifiers, using a dataset from the National Institute of Diabetes and Digestive and Kidney Diseases.

Primary Language: R
License: MIT

Data_Science_R-Lang_ML

Data science (data preprocessing) and machine learning classifier models in R.

The dataset comes from Kaggle and originates from the National Institute of Diabetes and Digestive and Kidney Diseases. Its objective is to predict diagnostically whether a patient has diabetes, based on the diagnostic measurements included in the dataset. All patients are females of Pima Indian heritage, at least 21 years old.

Dataset source: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database
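The preprocessing step could look roughly like the sketch below. It is not taken from this repository's scripts. It assumes the Kaggle file is saved locally as `diabetes.csv` with that file's column names (`Glucose`, `BloodPressure`, `SkinThickness`, `Insulin`, `BMI`, `Outcome`), and it assumes that zeros in those measurement columns stand for missing values and are replaced with the column median. The function name `preprocess_pima` and the tiny `toy` data frame are hypothetical and exist only for illustration.

```r
# Columns where a zero is physiologically implausible; treating these zeros
# as missing values is an assumption of this sketch, not something the repo states.
zero_as_missing <- c("Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI")

# Hypothetical helper: mark zeros as NA, impute with the column median,
# and turn the 0/1 Outcome column into a factor for classification.
preprocess_pima <- function(df) {
  for (col in intersect(zero_as_missing, names(df))) {
    df[[col]][which(df[[col]] == 0)] <- NA
    df[[col]][is.na(df[[col]])] <- median(df[[col]], na.rm = TRUE)
  }
  df$Outcome <- factor(df$Outcome, levels = c(0, 1), labels = c("neg", "pos"))
  df
}

# Made-up rows, used only to show what the function does.
# Real use (assumed file name): preprocess_pima(read.csv("diabetes.csv"))
toy <- data.frame(
  Glucose = c(148, 0, 183),
  BMI     = c(33.6, 26.6, 0),
  Age     = c(50, 31, 32),
  Outcome = c(1, 0, 1)
)
clean <- preprocess_pima(toy)
print(clean)
```

Median imputation is only one option; dropping incomplete rows or using a model-based imputation would fit the same place in the pipeline.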

Predicting diabetes with kNN, Naïve Bayes, and Random Forest classifiers in R.

For each classifier, a confusion matrix was generated by comparing the predicted class values for the test dataset against the reference (actual) values of that test data. ROC curves were then generated for all three classifiers (kNN, Naïve Bayes, and Random Forest) and compared.
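As a rough illustration of the evaluation described above, the sketch below fits a kNN classifier, builds a confusion matrix against the reference labels, and computes an ROC curve with its AUC. It is a sketch under several assumptions, not the repository's code. It uses the `Pima.tr` and `Pima.te` data frames shipped with the `MASS` package (a smaller subset of the same Pima data) as a stand-in for the Kaggle CSV. It covers kNN only, with an arbitrary `k = 15`. It also computes the ROC points in base R instead of using a dedicated ROC package. Naïve Bayes and Random Forest predictions could be passed through the same confusion-matrix and ROC steps.

```r
library(MASS)   # provides Pima.tr (training) and Pima.te (test)
library(class)  # provides knn()

set.seed(1)     # knn() breaks ties at random
train <- Pima.tr
test  <- Pima.te
features <- setdiff(names(train), "type")   # "type" is the No/Yes diabetes label

# Standardise features using training-set statistics only.
mu  <- colMeans(train[, features])
sdv <- apply(train[, features], 2, sd)
x_train <- scale(train[, features], center = mu, scale = sdv)
x_test  <- scale(test[, features],  center = mu, scale = sdv)

# kNN predictions; prob = TRUE attaches the vote share of the winning class.
pred <- knn(x_train, x_test, cl = train$type, k = 15, prob = TRUE)

# Confusion matrix: predicted class versus reference class on the test data.
conf_mat <- table(Predicted = pred, Reference = test$type)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)

# Convert the winning-class vote share into a score for the "Yes" class.
win   <- attr(pred, "prob")
score <- ifelse(pred == "Yes", win, 1 - win)

# ROC points: true and false positive rates at each distinct score threshold.
is_pos <- test$type == "Yes"
thr <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(thr, function(t) mean(score[is_pos] >= t))
fpr <- sapply(thr, function(t) mean(score[!is_pos] >= t))

# AUC via the rank-sum (Mann-Whitney) identity.
n_pos <- sum(is_pos)
n_neg <- sum(!is_pos)
auc <- (sum(rank(score)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

plot(c(0, fpr), c(0, tpr), type = "l",
     xlab = "False positive rate", ylab = "True positive rate",
     main = "kNN ROC curve (sketch)")
abline(0, 1, lty = 2)   # chance line for reference
cat("Accuracy:", accuracy, " AUC:", auc, "\n")
```

To compare classifiers, the curves for each model would be drawn on the same axes (for example with `lines()`) and their AUC values set side by side.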