Drugs-Classification

Introduction

Cardiovascular diseases (CVDs) are disorders of the heart and blood vessels. Examples include coronary heart disease, cerebrovascular disease, rheumatic heart disease and other conditions.

Machine learning is the practice of building systems that learn from data without being explicitly programmed. Such methods are used to analyze patterns in high-dimensional, diverse data sets such as heart disease records.

Objective

The aim of this report is to diagnose the disease at an early stage. With machine learning algorithms, the disease can be detected early, which supports treatment alongside conventional diagnosis.

Machine Learning Model

To develop the model, machine learning algorithms have been used. The focus is on developing a system that helps medical professionals evaluate a patient's risk of heart disease from the patient's clinical data. The prediction uses the following parameters: Age, Sex, Blood pressure, Cholesterol level and Na-K concentration. Logistic regression models the probability of a discrete outcome given the input variables.
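Before a model can use these parameters, each patient record has to become a numeric feature vector. The following is a minimal sketch of one way to do that. The category labels ("F"/"M", "LOW"/"NORMAL"/"HIGH"), the ordinal codes and the function name `encode_patient` are assumptions for illustration, not taken from the project's code.

```python
# Hypothetical category codes; the actual labels in the dataset may differ.
SEX_CODES = {"F": 0.0, "M": 1.0}
LEVEL_CODES = {"LOW": 0.0, "NORMAL": 1.0, "HIGH": 2.0}


def encode_patient(age, sex, blood_pressure, cholesterol, na_to_k):
    """Turn one patient record into a numeric feature vector.

    Feature order: Age, Sex, Blood pressure, Cholesterol, Na-K concentration.
    """
    return [
        float(age),
        SEX_CODES[sex],
        LEVEL_CODES[blood_pressure],
        LEVEL_CODES[cholesterol],
        float(na_to_k),
    ]


# Made-up example record, not a row from the dataset.
features = encode_patient(47, "M", "LOW", "HIGH", 13.093)
print(features)  # [47.0, 1.0, 0.0, 2.0, 13.093]
```

Ordinal codes keep the sketch short. One-hot encoding is a common alternative when the categories should not be treated as ordered.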

Why Logistic Regression?

Mathematically, a logistic regression model predicts P(Y=1) as a function of X. It is one of the simplest ML algorithms and can be used for a variety of classification problems. Logistic regression (LR) has the advantages of easy implementation, good interpretability and easy extension. In medical data, logistic regression has two main uses: finding the main influencing factors and predicting incidence.
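To make "P(Y=1) as a function of X" concrete, the sketch below computes the standard logistic form P(Y=1 | x) = 1 / (1 + exp(-(b + w·x))). The weights, bias and input values are hypothetical numbers chosen for illustration, not coefficients fitted on the project's data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    # Branch on the sign of z so that exp() never receives a large
    # positive argument, which would overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(weights, bias, x):
    """Return P(Y=1 | x) for a binary logistic regression model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


# Hypothetical coefficients and input, for illustration only.
weights = [0.8, -0.5]
bias = 0.1
p = predict_probability(weights, bias, [1.0, 2.0])
print(round(p, 3))
```

The sign and size of each weight show how strongly a feature pushes the probability up or down, which is why the method is useful for finding the main influencing factors.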

Block-Diagram

Data-set

The data was collected from a drug company and uploaded to Kaggle. It is a labelled data set of drugs and the parameters that affect the choice of drug, so the drug is recommended on the basis of the disease and the type of patient. The parameters are age, sex, blood pressure, cholesterol level and Na-K concentration. The dataset contains 200 rows and 6 columns. The categories (dependent variable) are 5 drugs (Captopril, Digoxin, Atropine, Theophylline, Isoprenaline).
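Because the target has five drug classes rather than two, the binary form P(Y=1) has to be extended. The text does not say how this is done. One common option, shown here purely as an assumption, is multinomial (softmax) logistic regression, where each class gets a score and the scores are normalised into probabilities. The scores below are hypothetical, not model output.

```python
import math

# Drug classes as listed in the data set description.
DRUGS = ["Captopril", "Digoxin", "Atropine", "Theophylline", "Isoprenaline"]


def softmax(scores):
    """Convert per-class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_drug(scores):
    """Return the most probable drug and the full probability list."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DRUGS[best], probs


# Hypothetical per-class scores for one patient.
label, probs = predict_drug([0.2, 1.5, -0.3, 0.0, 0.7])
print(label)  # Digoxin
```

An alternative is one-vs-rest, which trains a separate binary logistic regression model for each drug.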