In this project I evaluate the use of several common machine learning algorithms to classify tumors as malignant or benign, using data from the Breast Cancer Wisconsin dataset. I also explore different feature extraction techniques and combine them with various ML methods to obtain the highest possible accuracy.
The dataset consists of ten real-valued features obtained from images of a fine needle aspirate (FNA) of a breast mass:
- radius (mean of distances from center to points on the perimeter)
- texture (standard deviation of gray-scale values)
- perimeter
- area
- smoothness (local variation in radius lengths)
- compactness (perimeter^2 / area - 1.0)
- concavity (severity of concave portions of the contour)
- concave points (number of concave portions of the contour)
- symmetry
- fractal dimension ("coastline approximation" - 1)
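
To make a few of the shape definitions in the list above concrete, here is a minimal sketch that computes radius, perimeter, area, and compactness for a closed contour given as an ordered list of (x, y) vertices. The function name `contour_features`, the polygon representation, and the choice of the vertex average as the "center" are assumptions made for illustration; this is not the procedure used to build the dataset, whose features are already provided as numbers.

```python
import math


def contour_features(points):
    """Compute a few shape features for a closed contour.

    `points` is a list of (x, y) vertices in order around the boundary
    (a hypothetical input format chosen for this sketch).
    """
    n = len(points)

    # Center: plain average of the boundary vertices (an assumption).
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Radius: mean of distances from the center to points on the perimeter.
    radius = sum(math.hypot(x - cx, y - cy) for x, y in points) / n

    perimeter = 0.0
    twice_signed_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the contour
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_signed_area += x0 * y1 - x1 * y0  # shoelace formula term
    area = abs(twice_signed_area) / 2.0

    # Compactness as defined in the list: perimeter^2 / area - 1.0
    compactness = perimeter ** 2 / area - 1.0

    return {
        "radius": radius,
        "perimeter": perimeter,
        "area": area,
        "compactness": compactness,
    }


# Usage: a unit square as a toy contour.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = contour_features(square)
print(features)
```

For the unit square the perimeter is 4 and the area is 1, so the compactness is 4^2 / 1 - 1 = 15. Rounder shapes give smaller values, and irregular outlines give larger ones.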