100 Days of Machine Learning - Day 13

Support Vector Machines (SVM)

In this project, I used a Support Vector Machine (SVM) for a binary classification task. I implemented an SVM classifier with scikit-learn, visualized the results, and wrote a unit test to check that the classifier reaches a minimum accuracy.

Prerequisites
Dataset
Step by Step Guide
Unit Test
Conclusion

Prerequisites

Python 3.x
scikit-learn
matplotlib
numpy

Dataset

I used the Breast Cancer Wisconsin (Diagnostic) dataset from the UCI Machine Learning Repository. Its features are computed from digitized images of fine needle aspirates (FNA) of breast masses, and the goal is to predict whether a mass is malignant or benign. The dataset ships with scikit-learn and can be loaded with sklearn.datasets.load_breast_cancer, so no separate download is needed.
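As a quick orientation, here is a minimal sketch of loading the dataset through scikit-learn and inspecting its shape and class labels. The variable names are illustrative and may differ from the ones used in the notebook.

```python
from sklearn.datasets import load_breast_cancer

# Load the bundled copy of the dataset as a Bunch object.
data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)            # (569, 30): 569 samples, 30 numeric features
print(data.target_names)  # ['malignant' 'benign'], so label 0 is malignant and 1 is benign
```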

Step by Step Guide

Load the dataset
Preprocess the data (split into train and test sets)
Train the SVM classifier
Visualize the results
Perform a unit test
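
The first three steps above can be sketched roughly as follows. This is a sketch under stated assumptions, not necessarily the notebook's exact code: the 80/20 split, the random seed, the RBF kernel, and the feature scaling step are all choices made here for illustration. Scaling is added because SVMs are sensitive to feature magnitudes; the steps listed above mention only the train/test split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Step 1: load the dataset as feature matrix X and label vector y.
X, y = load_breast_cancer(return_X_y=True)

# Step 2: hold out 20% of the samples for testing (assumed split ratio and seed).
# stratify=y keeps the malignant/benign ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 3: standardize the features, then fit an SVM (assumed RBF kernel with default-like settings).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

# Evaluate on the held-out test set.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

Putting the scaler and the classifier in one pipeline means the scaling statistics are learned from the training set only, which avoids leaking information from the test set.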

You can find the complete code and detailed explanations in the Jupyter Notebook provided in this repository.

Unit Test

The notebook includes a simple unit test that checks the classifier's accuracy on the test set. If the accuracy is not above 0.8, the test raises an AssertionError with a descriptive message.
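
A test of this kind might look like the sketch below. The function name test_svm_accuracy, the split settings, and the model configuration are assumptions for illustration; only the 0.8 threshold and the AssertionError behavior come from the description above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def test_svm_accuracy(threshold=0.8):
    """Train an SVM and assert that its test accuracy exceeds the threshold."""
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # Hypothetical model setup: scaled features and an RBF-kernel SVM.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, clf.predict(X_test))
    assert accuracy > threshold, (
        f"Accuracy {accuracy:.3f} is not above the threshold {threshold}"
    )
    return accuracy


# Run the check directly, as one would in a notebook cell.
test_svm_accuracy()
```

Because the function name starts with test_, a runner such as pytest would also pick it up if the code were moved into a standalone file.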

Conclusion

By following this tutorial, you should have a solid understanding of Support Vector Machines and how to implement them in Python using scikit-learn. Keep practicing with different datasets and SVM parameters to gain a deeper understanding of the algorithm and how to fine-tune it for specific problems.

Happy coding, and don't forget to share your progress using the hashtag #100DaysofML!

nadinejackson1/support-vector-machines