/tumor-classification-pca-svm

Principal Component Analysis and Support Vector Machine for tumor classification in patients from the original Wisconsin breast cancer database.


Principal Component Analysis and Support Vector Machine for tumor classification

Analysis of the Breast Cancer Wisconsin (Original) Dataset

Original data in the UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/15/breast+cancer+wisconsin+original

Kaggle notebook: https://www.kaggle.com/raulalmuzara/pca-and-svm-for-tumor-classification

The dataset covers 699 patients, each with either a benign tumor (Class = 2) or a malignant tumor (Class = 4). In addition to a Sample Code Number, there are 9 biological features rated with integer values from 1 to 10: Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, and Mitoses. We first reduce these 9 features to 2 variables with Principal Component Analysis, and then classify the patients with a Support Vector Machine. The objective is to predict the tumor class (benign or malignant) of a patient.
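The workflow described above (9 features, reduced to 2 principal components, then classified with a Support Vector Machine) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It assumes scikit-learn and NumPy are installed, and it runs on synthetic stand-in data generated by a hypothetical helper, `make_synthetic_data`, rather than on the real UCI file. The standardization step, the RBF kernel, the 75/25 split, and the random seeds are also assumptions, since the description does not specify them.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# The 9 feature names listed in the dataset description.
FEATURES = [
    "Clump Thickness", "Uniformity of Cell Size", "Uniformity of Cell Shape",
    "Marginal Adhesion", "Single Epithelial Cell Size", "Bare Nuclei",
    "Bland Chromatin", "Normal Nucleoli", "Mitoses",
]


def make_synthetic_data(n_samples=699, seed=0):
    """Hypothetical stand-in for the real data: integer ratings from 1 to 10.

    Benign rows (Class = 2) draw low ratings and malignant rows (Class = 4)
    draw higher ones. The values are synthetic, not taken from the UCI file.
    """
    rng = np.random.default_rng(seed)
    y = rng.choice([2, 4], size=n_samples)
    low = rng.integers(1, 6, size=(n_samples, len(FEATURES)))    # ratings 1..5
    high = rng.integers(4, 11, size=(n_samples, len(FEATURES)))  # ratings 4..10
    X = np.where((y == 4)[:, None], high, low).astype(float)
    return X, y


X, y = make_synthetic_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed pipeline: standardize, project onto 2 principal components, fit an SVM.
model = make_pipeline(StandardScaler(), PCA(n_components=2), SVC(kernel="rbf"))
model.fit(X_train, y_train)

# Apply every step except the classifier to obtain the 2-D projection.
X_test_2d = model[:-1].transform(X_test)
predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)

print("Projected test shape:", X_test_2d.shape)
print("Test accuracy on synthetic data:", round(accuracy, 3))
```

To try this on the real data, `make_synthetic_data` would be replaced by code that loads the UCI file, drops the Sample Code Number column, and handles any missing entries before fitting. Wrapping the scaler, PCA, and SVM in one pipeline means the projection is learned from the training split only, which avoids leaking test information into the components.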