/Exploratory-Data-Analysis-Haberman-s-Cancer-Survival-Dataset

Haberman’s data set contains data from a study conducted at the University of Chicago’s Billings Hospital between 1958 and 1970 on patients who had undergone surgery for breast cancer. Source: https://www.kaggle.com/gilsousa/habermans-survival-data-set

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

What is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. EDA is for seeing what the data can tell us beyond the formal modelling or hypothesis testing task. It is always a good idea to explore a data set with multiple exploratory techniques, especially when they can be done together for comparison. The goal of exploratory data analysis is to obtain confidence in your data to a point where you’re ready to engage a machine learning algorithm. Another side benefit of EDA is to refine your selection of feature variables that will be used later for machine learning.

Understanding the dataset

Title: Haberman’s Survival Data

Description: The dataset contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago’s Billings Hospital on the survival of patients who had undergone surgery for breast cancer.

Attribute Information:
- Age of patient at the time of operation (numerical)
- Patient’s year of operation (year - 1900, numerical)
- Number of positive axillary nodes detected (numerical)
- Survival status (class attribute): 1 = the patient survived 5 years or longer, 2 = the patient died within 5 years
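As a first exploratory step, the four attributes described above can be loaded into a table and summarized per survival class. The following is a minimal sketch, not the notebook's actual code. It assumes the CSV has no header row and stores the columns in the order listed above. The column names (`age`, `year`, `nodes`, `status`) and the helpers `load_haberman` and `summarize` are hypothetical names chosen for illustration. The sample rows are made up in the dataset's format so the example runs without the real file.

```python
import io

import pandas as pd

# Column names are our own choice; the source only describes the attributes.
COLUMNS = ["age", "year", "nodes", "status"]


def load_haberman(source):
    """Read a headerless four-column CSV in the attribute order described above."""
    df = pd.read_csv(source, header=None, names=COLUMNS)
    # Map class codes to labels: 1 = survived 5 years or longer, 2 = died within 5 years.
    df["status"] = df["status"].map({1: "survived", 2: "died"})
    return df


def summarize(df):
    """Return class counts and per-class descriptive statistics."""
    counts = df["status"].value_counts()
    stats = df.groupby("status")[["age", "year", "nodes"]].describe()
    return counts, stats


# Illustrative rows only (made up in the dataset's format, not real records).
sample = io.StringIO("30,64,1,1\n34,59,0,2\n45,66,4,1\n52,60,12,2\n61,63,0,1\n")

df = load_haberman(sample)
counts, stats = summarize(df)

print(df.shape)  # rows and columns
print(counts)    # how balanced the two classes are
print(stats)     # count, mean, std, quartiles per class and attribute
```

With the real data, a file path could be passed to `load_haberman` in place of the in-memory sample. Checking the class counts early is useful because an imbalanced class attribute affects how later plots and statistics should be read.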