Spam Detection using Machine Learning

This repository contains a Jupyter notebook that detects spam messages using machine learning. The notebook works from a CSV dataset of SMS messages in which each message is labeled as spam or not spam.

Dataset

The dataset used in this notebook is the SMS Spam Collection Dataset, which consists of 5,572 English SMS messages, each labeled as either ham (legitimate) or spam. It is included in the data subdirectory of this repository as a CSV file named spam.csv.
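As a rough illustration of how a labeled CSV like this could be read outside the notebook, here is a minimal sketch using only the Python standard library. The column names `label` and `text`, the UTF-8 encoding, and the helper names `load_sms` and `label_counts` are assumptions made for this example; check the header row and encoding of spam.csv and adjust them accordingly.

```python
import csv
from collections import Counter


def load_sms(path, label_col="label", text_col="text", encoding="utf-8"):
    """Read a CSV file with a header row and return (labels, texts).

    label_col, text_col and encoding are assumptions; the real file
    may use different column names or a different encoding.
    """
    labels, texts = [], []
    with open(path, newline="", encoding=encoding) as handle:
        for row in csv.DictReader(handle):
            # Normalize labels so "Ham" and "ham " count as the same class.
            labels.append(row[label_col].strip().lower())
            texts.append(row[text_col])
    return labels, texts


def label_counts(labels):
    """Count how many messages carry each label (e.g. ham vs. spam)."""
    return Counter(labels)
```

For example, `labels, texts = load_sms("data/spam.csv")` followed by `label_counts(labels)` would show how the messages are split between ham and spam, provided the column names match.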

Notebook

The notebook is written in Python and uses the scikit-learn library for its machine learning tasks. It is hosted on Google Colab and can be opened at the following link: https://colab.research.google.com/drive/1JIjaKCv_dUGemBYpE5H2OFgzxFjTN8hw?usp=sharing

The notebook includes the following sections:

  • Data Loading and Exploration
  • Data Preprocessing
  • Feature Extraction
  • Model Training
  • Model Evaluation

The notebook provides detailed explanations of each step, as well as code snippets for implementing them.
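To give a feel for how the sections listed above fit together, the following is a minimal scikit-learn sketch. It is not the notebook's actual code: the choice of TF-IDF features and a multinomial Naive Bayes classifier is an assumption, and the handful of messages is invented purely for illustration in place of spam.csv.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Data loading: toy messages invented for this example only.
messages = [
    "Are we still meeting for lunch today?",
    "Congratulations, you have won a free prize, call now",
    "Can you send me the report by tonight?",
    "URGENT: claim your free cash reward now",
    "I'll be home late, don't wait up",
    "Win a free phone, text WIN to claim",
    "Thanks for the help yesterday",
    "Free entry to our prize draw, reply now",
    "See you at the gym at six",
    "You have been selected for a cash prize, call now",
    "Happy birthday, hope you have a great day",
    "Claim your free ringtone now, text back",
]
labels = ["ham", "spam"] * 6

# Hold out part of the data for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, stratify=labels, random_state=0
)

# Preprocessing and feature extraction (lowercasing, TF-IDF weights)
# chained with the classifier so both are fitted on training data only.
model = make_pipeline(TfidfVectorizer(lowercase=True), MultinomialNB())

# Model training.
model.fit(X_train, y_train)

# Model evaluation on the held-out messages.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Held-out accuracy on the toy data: {accuracy:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary from being learned on test messages. The accuracy on a dozen invented messages says nothing about performance on the real dataset.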

How to Use

To use this notebook, open it in Google Colab through the link above. Before running anything, upload the spam.csv file to your Colab session's files. Then run the cells in order to reproduce the results presented in the notebook.

Credits

This notebook was created by Drhorhi Omar, and is licensed under the MIT License. The dataset used in this notebook is from the UCI Machine Learning Repository.