In this project, I used a Support Vector Machine (SVM) for a binary classification task. I trained an SVM classifier, visualized the results, and wrote a unit test to check that the classifier works as expected.
- Prerequisites
- Dataset
- Step by Step Guide
- Unit Test
- Conclusion
- Python 3.x
- scikit-learn
- matplotlib
- numpy
I used the Breast Cancer Wisconsin (Diagnostic) dataset from the UCI Machine Learning Repository. The dataset contains features computed from digitized images of fine needle aspirates (FNA) of breast masses, and the goal is to predict whether a mass is malignant or benign. The dataset ships with scikit-learn and can be loaded with `sklearn.datasets.load_breast_cancer`, so no separate download is needed.
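As a quick illustration of what the dataset looks like, here is a minimal sketch that loads it through scikit-learn and prints its basic shape and labels. The variable names are my own choice for this example and are not necessarily the ones used in the notebook.

```python
from sklearn.datasets import load_breast_cancer

# Load the copy of the dataset bundled with scikit-learn
data = load_breast_cancer()
X, y = data.data, data.target  # feature matrix and 0/1 class labels

print(X.shape)               # (569, 30): 569 samples, 30 numeric features
print(data.target_names)     # ['malignant' 'benign'], so 0 = malignant, 1 = benign
print(data.feature_names[:3])
```

Note that in scikit-learn's encoding the label 0 means malignant and 1 means benign, which is worth keeping in mind when reading metrics later.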
- Load the dataset
- Preprocess the data (split into train and test sets)
- Train the SVM classifier
- Visualize the results
- Perform a unit test
You can find the complete code and detailed explanations in the Jupyter Notebook provided in this repository.
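For readers who want a rough idea before opening the notebook, the steps listed above can be sketched as follows. This is only an illustration under my own assumptions, not a copy of the notebook: the 80/20 split, `random_state=42`, the RBF kernel, the added feature scaling step, and the confusion matrix plot saved as `confusion_matrix.png` are all choices made for this example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 1. Load the dataset
data = load_breast_cancer()
X, y = data.data, data.target

# 2. Split into train and test sets (assumed 80/20, stratified by class)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 3. Train the SVM classifier.
#    Scaling is an assumption added here: SVMs are sensitive to feature scale.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)

# 4. Evaluate on the held-out test set
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")

# 5. Visualize the results as a confusion matrix heatmap
fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(cm, cmap="Blues")
ax.set_xticks([0, 1])
ax.set_xticklabels(data.target_names)
ax.set_yticks([0, 1])
ax.set_yticklabels(data.target_names)
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center")
fig.colorbar(im, ax=ax)
fig.tight_layout()
fig.savefig("confusion_matrix.png")  # hypothetical output filename
```

Wrapping the scaler and the classifier in one pipeline means the scaler is fitted on the training split only, which avoids leaking test-set statistics into training.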
The notebook includes a simple unit test that checks the classifier's accuracy on the test set. If the accuracy is not above 0.8, the test raises an AssertionError with a descriptive message.
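A test of this kind might look roughly like the sketch below. Only the 0.8 threshold and the AssertionError behaviour come from the description above. The function name `test_svm_accuracy`, the split settings, and the scaled RBF pipeline are assumptions made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

ACCURACY_THRESHOLD = 0.8  # threshold stated in this README


def test_svm_accuracy(threshold=ACCURACY_THRESHOLD):
    """Train an SVM and assert that its test accuracy is above `threshold`."""
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    # Assumed model setup: feature scaling followed by an RBF-kernel SVM
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_train, y_train)

    accuracy = clf.score(X_test, y_test)  # mean accuracy on the test split
    assert accuracy > threshold, (
        f"Accuracy {accuracy:.3f} is not above the threshold {threshold}"
    )
    return accuracy


test_svm_accuracy()
print("Unit test passed")
```

Because the function name starts with `test_`, the same function could also be collected by a test runner such as pytest if it were moved from the notebook into a module.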
After working through this tutorial, you should understand the basics of Support Vector Machines and how to train and evaluate one in Python using scikit-learn. Keep practicing with different datasets and SVM parameters, such as the kernel and the regularization strength `C`, to get a better feel for the algorithm and how to tune it for specific problems.
Happy coding, and don't forget to share your progress using the hashtag #100DaysofML!