SMS-Spam-Classifier




This repository contains the code for an SMS spam classifier built with the Multinomial Naive Bayes algorithm and NLTK (the Natural Language Toolkit). The classifier is trained with supervised learning on labeled messages and classifies each SMS as either spam or ham (legitimate).
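To make the algorithm concrete, here is a minimal standard-library sketch of Multinomial Naive Bayes with Laplace smoothing. It is not the code in this repository: the function names `train_nb` and `predict_nb`, the whitespace tokenizer, and the toy messages are all assumptions made for illustration. scikit-learn offers this algorithm as `MultinomialNB`.

```python
import math
from collections import Counter, defaultdict


def train_nb(messages, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized messages."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total = len(labels)
    model = {
        "alpha": alpha,
        "vocab": vocab,
        "word_counts": word_counts,
        "log_prior": {},
        "totals": {},
    }
    for label, n in class_counts.items():
        model["log_prior"][label] = math.log(n / total)
        model["totals"][label] = sum(word_counts[label].values())
    return model


def predict_nb(model, text):
    """Return the label with the highest log posterior for one message."""
    alpha, vocab = model["alpha"], model["vocab"]
    scores = {}
    for label, log_prior in model["log_prior"].items():
        # Laplace smoothing: add alpha to every word count in the vocabulary.
        denom = model["totals"][label] + alpha * len(vocab)
        score = log_prior
        for token in text.lower().split():
            if token in vocab:  # ignore words never seen in training
                count = model["word_counts"][label][token]
                score += math.log((count + alpha) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data invented for illustration; not taken from the SMS Spam Collection.
train_texts = [
    "win a free prize now",
    "free cash claim now",
    "are we meeting for lunch",
    "see you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_nb(train_texts, train_labels)
```

Calling `predict_nb(model, "claim your free prize")` scores the message under each class and returns the more likely label. A real pipeline would typically add stopword removal and a better tokenizer before counting words.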

Requirements

  • Python 3.6 or higher
  • NLTK
  • Scikit-learn
  • Streamlit

Installation

Clone the repository:

git clone https://github.com/tejred213/SMS-Spam-Classifier.git

Install the required dependencies:

pip install -r requirements.txt

Usage

Make sure the NLTK stopwords corpus and the punkt tokenizer data are downloaded. If not, open a Python shell and run the following commands:

import nltk
nltk.download('stopwords')
nltk.download('punkt')

To start the Streamlit app:

streamlit run app.py

Dataset

The dataset used in this application was downloaded from Kaggle: https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset

The SMS Spam Collection is a set of tagged SMS messages collected for SMS spam research. It contains a single set of 5,574 English SMS messages, each tagged as either ham (legitimate) or spam.
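As a rough sketch of how the labeled messages might be read, the snippet below parses a CSV with one label column and one text column and counts the labels. The column names `v1` and `v2` are an assumption about the Kaggle download, the helper `load_sms` is hypothetical, and the sample rows are invented so the example runs without the real file.

```python
import csv
import io
from collections import Counter


def load_sms(fileobj, label_col="v1", text_col="v2"):
    """Read (label, text) pairs from a CSV file object with a header row."""
    reader = csv.DictReader(fileobj)
    return [(row[label_col].strip().lower(), row[text_col]) for row in reader]


# Tiny in-memory stand-in for the real file; these rows are invented.
sample = io.StringIO(
    "v1,v2\n"
    "ham,See you at lunch\n"
    'spam,"WINNER! Claim your prize, call now"\n'
    "ham,Running late\n"
)
rows = load_sms(sample)

# Count how many messages carry each label.
label_counts = Counter(label for label, _ in rows)
```

For the downloaded file, the same function could be called on an opened file object, for example `open("spam.csv", encoding="latin-1", newline="")`. Both the file name and the encoding are assumptions to check against your copy of the data.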