In this paper, we present a statistical analysis of stroke-related data. Our goal is to predict whether a patient is likely to have a stroke based on input parameters such as gender, age, various diseases, and smoking status. Each row in the data provides relevant information about a patient. After some data cleaning, we plotted selected variables to gain insight into the content and structure of the data. We then proceeded to the analysis.