Breast_Cancer

This project trains classification models on a breast cancer dataset to help doctors predict whether a patient's breast cancer is likely to recur.

Introduction

Breast cancer is a disease in which abnormal breast cells grow out of control and form tumours. If left unchecked, the tumours can spread throughout the body and become fatal.
Breast cancer cells begin inside the milk ducts and/or the milk-producing lobules of the breast. The earliest form (in situ) is not life-threatening. Cancer cells can spread into nearby breast tissue (invasion). This creates tumours that cause lumps or thickening.
Invasive cancers can spread to nearby lymph nodes or other organs (metastasize). Metastasis can be fatal.
Treatment is based on the person, the type of cancer and its spread. Treatment combines surgery, radiation therapy and medications.

Consider the data present in the Breast Cancer Dataset file

Attribute information follows. This data set includes 201 instances of one class and 85 instances of another class. The instances are described by 9 attributes, some of which are linear and some of which are nominal.

Age
Menopause
inv-nodes
node-caps
deg-malig
breast
breast-quad
irradiat
Outcome (no-recurrence-events, recurrence-events)
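The class counts above mean the data is imbalanced, which is why accuracy alone can mislead. The short calculation below is an illustrative sketch only: it assumes the 201-instance class is no-recurrence-events and the 85-instance class is recurrence-events, and it uses a hypothetical baseline that always predicts the majority class.

```python
# Class counts as stated in the dataset description.
# Assumption: the larger class is "no-recurrence-events".
n_no_recurrence = 201
n_recurrence = 85
total = n_no_recurrence + n_recurrence

# Hypothetical baseline: always predict "no-recurrence-events".
true_positives = 0               # no recurrence case is ever flagged
false_negatives = n_recurrence   # every recurrence case is missed

baseline_accuracy = n_no_recurrence / total
baseline_recall = true_positives / (true_positives + false_negatives)

print(f"instances: {total}")                                   # 286
print(f"baseline accuracy: {baseline_accuracy:.3f}")           # 0.703
print(f"baseline recall (recurrence): {baseline_recall:.3f}")  # 0.000
```

A model therefore has to beat roughly 70% accuracy to be better than this trivial baseline, and recall on the recurrence class shows whether it finds any of the cases that matter.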

Problem Statement

To predict, based on the diagnostic measurements included in the dataset, whether a patient's breast cancer recurs (Outcome: no-recurrence-events or recurrence-events).

Steps Followed for the Project

Importing Necessary Libraries
Performing Exploratory Data Analysis
Data Preprocessing
Converting categorical data to numerical values (Label Encoder)
Creating X and Y
Splitting the data into train and test sets
Training various models: Logistic Regression, Decision Tree, Random Forest, Extra_Tree_Classifier, SVC, KNeighbors
Tuning the above models
Applying SMOTE and rerunning all of the above models to improve accuracy and the low recall value
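The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the rows are synthetic stand-ins, the column values, class counts, split ratio and random seeds are assumptions, only Logistic Regression is shown, and smote_like_oversample is a hypothetical hand-written helper that approximates the SMOTE idea (the imbalanced-learn package offers a full implementation as imblearn.over_sampling.SMOTE).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in data (NOT the real dataset); values are assumptions.
rng = np.random.default_rng(0)
n_rows = 200
columns = {
    "menopause": rng.choice(["premeno", "ge40", "lt40"], size=n_rows),
    "node-caps": rng.choice(["yes", "no"], size=n_rows),
    "irradiat": rng.choice(["yes", "no"], size=n_rows),
}
outcome = np.array(["no-recurrence-events"] * 140 + ["recurrence-events"] * 60)

# Convert categorical data to numbers, one LabelEncoder per column.
X = np.column_stack([LabelEncoder().fit_transform(v) for v in columns.values()])
target_encoder = LabelEncoder()
y = target_encoder.fit_transform(outcome)  # classes are sorted, so recurrence -> 1

# Split into train and test, keeping the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Train one model and evaluate it on the held-out test set.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)  # recall for the recurrence class (label 1)


def smote_like_oversample(X_min, n_new, k=5, rng=None):
    """Make n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = np.random.default_rng() if rng is None else rng
    X_min = np.asarray(X_min, dtype=float)
    k = min(k, len(X_min) - 1)
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a row is never its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # position along the line between the rows
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Oversample the minority class in the training split only.
n_new = int((y_train == 0).sum() - (y_train == 1).sum())
X_syn = smote_like_oversample(X_train[y_train == 1], n_new, rng=rng)
X_bal = np.vstack([X_train, X_syn])
y_bal = np.concatenate([y_train, np.ones(n_new, dtype=y_train.dtype)])

# Rerun the model on the balanced training data and compare the metrics.
model_bal = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
y_pred_bal = model_bal.predict(X_test)
accuracy_bal = accuracy_score(y_test, y_pred_bal)
recall_bal = recall_score(y_test, y_pred_bal)

print(f"before oversampling: accuracy={accuracy:.3f} recall={recall:.3f}")
print(f"after oversampling:  accuracy={accuracy_bal:.3f} recall={recall_bal:.3f}")
```

Because the features here are random, the printed numbers carry no meaning; the sketch only shows the order of the steps. The oversampling is applied to the training split alone so that the test set still reflects the original class ratio. Interpolating label-encoded categories produces fractional codes, so results on real categorical data should be read with that caveat in mind.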

Conclusion