Drug_rating_predictions

Drug ratings and reviews from patients: predicting the overall rating with machine learning models for the numerical ratings and natural language processing methods for the review text.

Note: the drug training and testing data sets can be downloaded from the UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets/Drug+Review+Dataset+%28Drugs.com%29

Abstract: This project uses a drug review dataset gathered from online pharmaceutical websites. On the numerical data, machine learning methods are used to determine whether the patient ratings for efficacy and side effects have a linear relationship with the overall rating, and whether univariate and multivariate logistic regression can accurately predict whether a patient's overall rating of a prescription drug will be positive or negative. On the text review fields, preprocessing, tf-idf vectorization, ridge classification, and data visualization are used to assess whether the review text can accurately predict a positive or negative patient review.
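As a rough illustration of the two modelling approaches named in the abstract, here is a minimal scikit-learn sketch. It is not the notebook's code: the toy ratings, reviews, and labels are invented for the example, the 1-5 rating scale is an assumption, and the rule used to turn an overall rating into a positive or negative label is not specified here, so the labels are simply given.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, NOT taken from the real dataset.
# Each row: [effectiveness rating, side-effect severity rating], assumed 1-5 scale.
numeric_ratings = [[5, 1], [4, 2], [5, 2], [1, 5], [2, 4], [1, 4]]

# Hypothetical free-text reviews, one per row above.
reviews = [
    "worked great and relieved my pain",
    "very effective with no problems, great results",
    "great medication, I feel much better",
    "terrible side effects and severe nausea",
    "did not help and made my headaches worse",
    "awful drug, caused severe dizziness and nausea",
]

# 1 = positive overall rating, 0 = negative overall rating (assumed labelling).
labels = [1, 1, 1, 0, 0, 0]

# Multivariate logistic regression on the two numeric sub-ratings.
rating_model = LogisticRegression().fit(numeric_ratings, labels)

# tf-idf features fed into a ridge classifier for the review text.
review_model = make_pipeline(TfidfVectorizer(), RidgeClassifier()).fit(reviews, labels)

# Predict the label for unseen numeric ratings and unseen review text.
print(rating_model.predict([[5, 1], [1, 5]]))
print(review_model.predict(["great results, feel better", "severe nausea, terrible"]))
```

Wrapping the vectorizer and classifier in one pipeline keeps the tf-idf vocabulary fitted on the training reviews only, which matters once a separate test set is scored.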

All data ETL and processing steps, along with an explanation of the outcomes, can be found in the drugs final.ipynb notebook.