Breast Cancer Detection

Breast Cancer Detection Using Machine Learning

What is Breast Cancer?

Cancer occurs when changes called mutations take place in genes that regulate cell growth. The mutations let the cells divide and multiply in an uncontrolled, chaotic way. The cells keep on proliferating, producing copies that get progressively more abnormal. In most cases, the cell copies eventually end up forming a tumor.

Breast cancer occurs when a malignant (cancerous) tumor originates in the breast. As breast cancer tumors mature, they may metastasize (spread) to other parts of the body. The primary route of metastasis is the lymphatic system which, ironically enough, is also the body's primary system for producing and transporting white blood cells and other cancer-fighting immune system cells throughout the body. Metastasized cancer cells that aren't destroyed by the lymphatic system's white blood cells move through the lymphatic vessels and settle in remote body locations, forming new tumors and perpetuating the disease process.

Breast cancer is not just a woman's disease. Men can also get breast cancer, although it occurs less frequently in men than in women. Our discussion focuses primarily on breast cancer as it relates to women, but much of the information also applies to men.

Facts And Figures

Breast cancer is the most commonly occurring cancer in women and the second most common cancer overall. There were over 2 million new cases in 2018.

Prevalence

  1. Asia

    Percentage of world population: 59
    Percentage of new breast cancer cases: 39
    Percentage of breast cancer deaths: 44

  2. Africa

    Percentage of world population: 15
    Percentage of new breast cancer cases: 8
    Percentage of breast cancer deaths: 12

  3. U.S. and Canada

    Percentage of world population: 5
    Percentage of new breast cancer cases: 15
    Percentage of breast cancer deaths: 9

(Data from Global Cancer Facts and Figures, 3rd Edition, page 37)

Incidence rates per 100,000 women

  1. Countries with highest incidence:

    The Netherlands: 95.3
    France: 94.6
    U.S. (white people only; other races have lower incidence): 90.6

  2. Countries with lowest incidence:

    Thailand: 25.6
    Algeria: 29.8
    India: 30.9

(Data from Global Cancer Facts and Figures, 3rd Edition, page 42)

The American Cancer Society's estimates for breast cancer in the United States for 2019 are:

  • About 268,600 new cases of invasive breast cancer will be diagnosed in women.

  • About 62,930 new cases of carcinoma in situ (CIS) will be diagnosed (CIS is non-invasive and is the earliest form of breast cancer).

  • About 41,760 women will die from breast cancer.

Role Of Machine Learning In Detection Of Breast Cancer

A mammogram is an x-ray picture of the breast. It can be used to check for breast cancer in women who have no signs or symptoms of the disease. It can also be used if you have a lump or other sign of breast cancer.

Screening mammography is the type of mammogram that checks you when you have no symptoms. It can help reduce the number of deaths from breast cancer among women ages 40 to 70. But it can also have drawbacks. Mammograms can sometimes find something that looks abnormal but isn't cancer. This leads to further testing and can cause you anxiety. Sometimes mammograms can miss cancer when it is there. It also exposes you to radiation. You should talk to your doctor about the benefits and drawbacks of mammograms. Together, you can decide when to start and how often to have a mammogram.

It can be difficult for physicians to tell from X-ray images alone whether a tumor is malignant or benign, so a machine learning model trained to classify tumors can be of great help.

About The Dataset:

Data Set Characteristics:

:Number of Instances: 569

:Number of Attributes: 30 numeric, predictive attributes and the class

:Attribute Information:
    - radius (mean of distances from center to points on the perimeter)
    - texture (standard deviation of gray-scale values)
    - perimeter
    - area
    - smoothness (local variation in radius lengths)
    - compactness (perimeter^2 / area - 1.0)
    - concavity (severity of concave portions of the contour)
    - concave points (number of concave portions of the contour)
    - symmetry
    - fractal dimension ("coastline approximation" - 1)

    The mean, standard error, and "worst" or largest (mean of the three
    worst/largest values) of these features were computed for each image,
    resulting in 30 features.  For instance, field 0 is Mean Radius, field
    10 is Radius SE, field 20 is Worst Radius.

    - class:
            - WDBC-Malignant (cancerous) - Malignant tumors can grow rapidly, invade and destroy nearby normal tissues, and spread throughout the body.
            - WDBC-Benign (non-cancerous) - Benign tumors tend to grow slowly and do not spread
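The field layout described above can be checked against scikit-learn's copy of the dataset. The following is a minimal sketch, assuming scikit-learn is installed; the variable names (`mean_features`, `se_features`, `worst_features`) are illustrative and not taken from this project's code.

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
names = data.feature_names.tolist()

# The 30 features come as three consecutive blocks of ten:
# means (fields 0-9), standard errors (10-19), and "worst" values (20-29).
mean_features = names[0:10]
se_features = names[10:20]
worst_features = names[20:30]

print(data.data.shape)   # (569, 30)
print(names[0])          # mean radius
print(names[10])         # radius error
print(names[20])         # worst radius
```

Note that scikit-learn labels the standard-error fields with the word "error" (for example "radius error") rather than "SE".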

Project Description

The project is inspired by the original publication of:

1) Assoc. Prof. Dr. Ahmet Mert, Faculty of Engineering and Natural Sciences, Department of Mechatronics Engineering

2) Dr. Erdem Bilgili, Piri Reis University

3) Dr. Aydin Akan, Izmir Katip Celebi University, Izmir, Turkey

This project features detection of breast cancer using machine learning. Of the several machine learning models that have been tested for this task, the Support Vector Machine (SVM) is reported to have the highest accuracy (approximately 97%) in detecting breast cancer.

The dataset used in this project is the Breast Cancer Wisconsin (Diagnostic) Data Set. It can be accessed directly from the scikit-learn library's collection of datasets as...

sklearn.datasets.load_breast_cancer

...a CSV file of the data is also included in the repo :)
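As a rough illustration of the scikit-learn route, the sketch below loads the data and counts the samples per class. It assumes scikit-learn is installed; `class_counts` is an illustrative name, and the snippet does not read the CSV file in the repo.

```python
from sklearn.datasets import load_breast_cancer

dataset = load_breast_cancer()
X, y = dataset.data, dataset.target

# In scikit-learn's copy of the data, label 0 is "malignant" and 1 is "benign".
label_names = dataset.target_names.tolist()
class_counts = {name: int((y == i).sum()) for i, name in enumerate(label_names)}

print(X.shape, y.shape)  # (569, 30) (569,)
print(class_counts)      # {'malignant': 212, 'benign': 357}
```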

RESULTS

An accuracy of 96% was achieved using the SVM model.
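A result of this kind could be reproduced along the following lines. This is only a sketch under stated assumptions, not the project's actual training code: the 80/20 stratified split, the fixed random seed, the feature scaling step, and the RBF kernel with default hyperparameters are all choices made here for illustration, so the accuracy it prints may differ from the figure reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Assumed settings: hold out 20% of the samples, stratified by class,
# with a fixed seed so the split is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale inside the pipeline so the scaler is fitted on training data only;
# SVMs are sensitive to features measured on very different scales.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

A single train/test split gives a noisy estimate on a dataset of 569 samples; cross-validation (for example with `sklearn.model_selection.cross_val_score`) would give a steadier figure.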