Breast-cancer-detected-ann-model

Breast cancer detection is a challenging problem for machine learning (ML). Machine learning gives better results on linear data, and previous research concludes that it performs poorly when the data is in the form of images. To address this limitation, a deep learning technique, an artificial neural network (ANN), is used.

Primary Language: Jupyter Notebook

Breast Cancer Detection Deep Learning ANN Model

Introduction-:

Cancer arises when abnormal cells in the body begin to divide, come into contact with normal cells, and make them malignant. Breast cancer is one of the most frequently occurring and harmful diseases in the world. It is classified as either invasive or non-invasive. Invasive breast cancer is malignant and spreads to other organs. Non-invasive breast cancer is pre-cancerous and remains in its original location, but it can eventually develop into invasive breast cancer. Breast cancer forms in the glands and in the milk ducts that carry milk. It frequently spreads to other organs, including through the bloodstream, and makes them malignant. Breast cancer has many types, and they grow at different rates. According to the WHO, 627,000 women died from breast cancer in 2018. Breast cancer is a major problem throughout the world, but it is found most often in the United States of America.

There are four types of breast cancer. The first is Ductal Carcinoma in Situ, which is found in the lining of the breast milk ducts and is a pre-stage of breast cancer. The second is the most common type and accounts for up to 70-80% of diagnoses. The third is inflammatory breast cancer, an aggressive and fast-growing cancer in which cells penetrate the skin and lymph vessels of the breast. The fourth is metastatic breast cancer, which has spread to other parts of the body.

Python Libraries Used in This Project-:

NumPy, Pandas, Matplotlib, Seaborn, Sklearn (scikit-learn), etc. For versions, see the requirements.txt file.

Data Source-:

Kaggle Dataset

Data Analysis and Feature Scaling with EDA

The data was analysed thoroughly and the unimportant features were dropped. Each feature's correlation with the target is very important for the model. Different types of charts were plotted for different types of features. The data was scaled with a standard scaler, which gave better accuracy and decreased the loss.
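To make the scaling and correlation steps concrete, here is a minimal sketch in plain Python. The column name `radius` and the toy values are hypothetical and are not taken from the Kaggle dataset. The notebook presumably relies on scikit-learn's `StandardScaler` and a Pandas correlation matrix, which is an assumption; the arithmetic below is what those tools compute.

```python
import math

def standardize(column):
    """Z-score a column: subtract the mean, divide by the standard deviation."""
    mean = sum(column) / len(column)
    # Population variance (divide by n), which is also what StandardScaler uses.
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(variance)
    return [(x - mean) / std for x in column]

def pearson(xs, ys):
    """Pearson correlation between a feature column and the target."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

# Hypothetical toy data: one feature column and a 0/1 target.
radius = [10.0, 12.0, 14.0, 16.0, 18.0]
target = [0, 0, 0, 1, 1]

scaled_radius = standardize(radius)          # mean 0, standard deviation 1
corr_with_target = pearson(radius, target)   # close to +1: strongly related to the target
```

Features whose correlation with the target is close to zero are candidates for dropping, although that decision should be checked against the model's validation results.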

Model Training with Different Types of Machine Learning Algorithms-:

Logistic Regression (LR)-: This is a supervised learning approach based on a statistical model with a binary dependent variable and one or more independent variables. The model produces a continuous probability between 0 and 1, which is then thresholded to give a binary result.
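The step from a continuous score to a binary result can be sketched as follows. The weights, bias, and sample values are hypothetical; in practice they would be learned from the training data, for example with scikit-learn's `LogisticRegression`.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a binary result."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical, hand-picked parameters (not learned from the project data).
weights = [0.8, -0.5]
bias = -0.2
sample = [1.5, 0.4]

probability = predict_proba(sample, weights, bias)
label = predict_label(sample, weights, bias)
```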

K-Nearest Neighbor (KNN)-: This approach is used to recognise patterns and is a useful method for predicting breast cancer. Every class is treated in the same way when identifying the pattern. K-Nearest Neighbour uses feature similarity: it finds the most similar (nearest) data points in the training dataset and assigns the class that is most common among them.
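A minimal sketch of the nearest-neighbour vote, assuming Euclidean distance as the similarity measure and hypothetical two-feature toy data:

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data with two features per sample.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

prediction_a = knn_predict(train_points, train_labels, (1.1, 1.0))
prediction_b = knn_predict(train_points, train_labels, (3.0, 3.0))
```

Because the vote depends on raw distances, KNN is sensitive to feature scale, which is one reason the standard scaler step matters.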

Decision Tree (DT)-: Decision trees can be built for both classification and regression. The dataset is split repeatedly into smaller subsets, and these smaller subsets can be used to make accurate predictions.
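How a tree chooses one split can be sketched with Gini impurity on a single feature. This is a simplified, one-level illustration with hypothetical values; a full tree, such as scikit-learn's `DecisionTreeClassifier`, repeats the same search on each resulting subset.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the subset is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)  # fraction of class 1
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def best_split(values, labels):
    """Find the threshold that gives the lowest weighted impurity."""
    best_threshold, best_impurity = None, float("inf")
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2  # midpoint between neighbouring values
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity

# Hypothetical toy feature and 0/1 target.
values = [10.0, 12.0, 14.0, 16.0, 18.0]
labels = [0, 0, 0, 1, 1]

threshold, impurity = best_split(values, labels)  # splits between 14.0 and 16.0
```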

Naive Bayes Algorithm (NB)-: This method uses Bayes' theorem to estimate the probability of each class, under the assumption that the features are independent of one another given the class. It can make use of a large training dataset and copes reasonably well with noisy data. A new tuple is classified by scoring it against the per-class statistics learned from the training tuples.
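The Bayesian scoring can be sketched with a small Gaussian Naive Bayes classifier. The Gaussian likelihood is an assumption made for this illustration (the README does not say which variant was used), and the data is a hypothetical toy set.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Learn a prior plus a per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # Small constant avoids division by zero for constant features.
            variance = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, variance))
        model[label] = (prior, stats)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, variance) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * variance)
            score -= (x - mean) ** 2 / (2 * variance)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: two features, classes 0 and 1.
rows = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 4.0), (3.2, 3.8), (2.8, 4.2)]
labels = [0, 0, 0, 1, 1, 1]

nb_model = fit_gaussian_nb(rows, labels)
nb_pred_low = predict_gaussian_nb(nb_model, (1.1, 2.1))
nb_pred_high = predict_gaussian_nb(nb_model, (3.1, 3.9))
```

Summing log probabilities per feature is exactly where the "naive" independence assumption enters.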

Support Vector Machine (SVM)-: This supervised learning method addresses both classification and regression problems. It looks for the boundary that separates the classes with the largest margin: a line in 2D, a plane in 3D, and a hyperplane in higher dimensions. It employs mathematical (kernel) functions to handle data that a straight boundary cannot separate. It is a potent machine learning technique that can deliver high accuracy.
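For a linear SVM, the decision rule and the hinge loss that penalises points inside the margin can be sketched as below. The weights and sample points are hypothetical; training (for example with scikit-learn's `SVC`) would choose the weights that minimise this loss together with a margin term.

```python
def decision_value(features, weights, bias):
    """Signed score: positive on one side of the boundary, negative on the other."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def hinge_loss(features, label, weights, bias):
    """Hinge loss for one sample; `label` must be -1 or +1."""
    return max(0.0, 1.0 - label * decision_value(features, weights, bias))

# Hypothetical boundary x1 - x2 = 0 in 2D.
weights = [1.0, -1.0]
bias = 0.0

loss_outside_margin = hinge_loss((3.0, 1.0), +1, weights, bias)  # correct and far away
loss_inside_margin = hinge_loss((1.5, 1.0), +1, weights, bias)   # correct but too close
loss_misclassified = hinge_loss((1.0, 2.0), +1, weights, bias)   # wrong side
```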

Random Forest (RF)-: Classification and regression problems can be handled using the Random Forest algorithm, which is based on supervised learning. It builds many decision trees from historical (training) data and combines their predictions to classify new data.
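The idea of combining many trees can be sketched with bootstrap samples and a majority vote. This is a deliberately simplified illustration: each "tree" is a single-split stump, and the random feature selection used by real implementations such as scikit-learn's `RandomForestClassifier` is left out. All names and data are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    return best[1:]

def train_forest(rows, labels, n_trees=7, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        indices = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in indices],
                                  [labels[i] for i in indices]))
    return forest

def forest_predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[feature] <= threshold else right
             for feature, threshold, left, right in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: two features, classes 0 and 1.
rows = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (1.1, 2.1),
        (3.0, 4.0), (3.2, 3.8), (2.8, 4.2), (3.1, 4.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = train_forest(rows, labels)
rf_pred_low = forest_predict(forest, (0.5, 1.5))
rf_pred_high = forest_predict(forest, (4.0, 5.0))
```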

K-Means Algorithm-: K-means is a clustering technique that sorts data into a manageable number of groups. It measures how similar different data points are to one another and assigns every data point to the most suitable (nearest) cluster, which helps when analysing a sizable dataset.

XGBoost-: XGBoost solves classification and regression problems and can provide better accuracy in machine learning models.
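The K-means loop of assigning points to the nearest centroid and then recomputing the centroids can be sketched as follows. The starting centroids are fixed here so the result is reproducible; that choice and the toy points are assumptions for the illustration, whereas library implementations such as scikit-learn's `KMeans` pick the starting centroids for you.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means: alternate the assignment and update steps."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid  # keep an empty cluster's centroid in place
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical toy points forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
initial_centroids = [(0.0, 0.0), (6.0, 6.0)]

final_centroids, final_clusters = kmeans(points, initial_centroids)
```

Note that K-means is unsupervised: it groups samples without using the diagnosis labels, so its clusters must be compared with the labels afterwards.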

Different Types of Methods for Testing or Model Analysis-:

Confusion matrix, classification report, precision and recall scores, ROC curve, model.summary, etc.
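How the confusion matrix relates to the precision and recall scores can be sketched in plain Python. The label lists are hypothetical; scikit-learn's `confusion_matrix`, `precision_score`, `recall_score`, and `classification_report` report the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn

# Hypothetical labels: 1 = malignant, 0 = benign.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)   # of the predicted malignant cases, how many were right
recall = tp / (tp + fn)      # of the actual malignant cases, how many were found
accuracy = (tp + tn) / len(y_true)
```

For cancer detection, recall deserves particular attention, because a false negative means a malignant case was missed.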

Conclusion-:

Breast cancer detection is a challenging problem because breast cancer is one of the most common and harmful diseases. The number of cases grows every year, and the chances of recovery can be low. Both machine learning and deep learning techniques are used to detect breast cancer. Previous research concludes that machine learning techniques give good results in their own field. That research applied many machine learning techniques, with some enhancement and augmentation of the dataset for better performance. However, it also concludes that machine learning gives better results on linear data and performs poorly when the data is in the form of images. To address this limitation, a deep learning technique, an artificial neural network (ANN), is used.

Project End