Spam-classifier

An Email/SMS spam classifier that achieves 97.19% accuracy and 100% precision using Naive Bayes.

Primary language: Jupyter Notebook

The model is deployed with Streamlit on the Heroku platform.


In this project I built a model that classifies an Email/SMS message as Spam or Ham from its text, using standard classifiers.


What it does :


Live Demo :


How it works :

Extract the text and the target class from the dataset. Convert the text into input features with a TF-IDF vectorizer. Because the classes are skewed (far more ham than spam), split the data into shuffled train and test sets with the stratified shuffle split from the sklearn library, so both sets keep the same class ratio. Train standard classifiers to label each message as spam or ham.
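The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy messages, the `stop_words` setting, the 25% test size, and the choice of `MultinomialNB` as the Naive Bayes variant are all assumptions made for the example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.naive_bayes import MultinomialNB

# Toy stand-in for the real dataset (hypothetical messages, skewed toward ham).
texts = [
    "Are we still meeting for lunch today",
    "Can you send me the report by tonight",
    "I will call you when I get home",
    "Happy birthday, hope you have a great day",
    "Do not forget to buy milk on the way back",
    "The meeting is moved to three in the afternoon",
    "Thanks for the help with the assignment",
    "See you at the station at six",
    "WINNER you have won a free prize call now",
    "Free entry to win cash, text WIN to claim",
    "Urgent, claim your free reward now",
    "Congratulations, you won a free mobile, call now",
]
labels = np.array([0] * 8 + [1] * 4)  # 0 = ham, 1 = spam

# Step 1: turn the raw text into TF-IDF features.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)

# Step 2: one shuffled, stratified split so train and test keep the class ratio.
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_idx, test_idx = next(splitter.split(X, labels))

# Step 3: train a standard classifier (here Multinomial Naive Bayes) and predict.
clf = MultinomialNB()
clf.fit(X[train_idx], labels[train_idx])
pred = clf.predict(X[test_idx])

print("accuracy :", accuracy_score(labels[test_idx], pred))
print("precision:", precision_score(labels[test_idx], pred, zero_division=0))
```

The sketch follows the order described above and vectorizes before splitting. A stricter variant would split the raw text first and fit the vectorizer on the training part only, so no information from the test messages reaches the features.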



Prerequisites :

  • Python
  • scikit-learn/sklearn
  • Pandas
  • NumPy
  • nltk
  • Matplotlib
  • Jupyter/Spyder/Pycharm

Dataset :

You can download the raw dataset from here. The file contains one message per line. Each line consists of two columns:

  • Class (v1) - contains the label (ham or spam)
  • Message (v2) - contains the raw text.
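A small sketch of reading data in this two-column layout with pandas is shown below. The in-memory sample, the renamed column names (`label`, `text`), and the numeric `target` column are assumptions for illustration; with the real file you would pass its path to `pd.read_csv` instead, and it may need an explicit `encoding` argument.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same v1/v2 layout as the dataset.
sample_csv = io.StringIO(
    "v1,v2\n"
    "ham,Are we still meeting at 5 today?\n"
    'spam,"WINNER! Claim your free prize now, call today"\n'
    "ham,I'll call you later\n"
)

df = pd.read_csv(sample_csv)

# Give the columns readable names and encode the label as 0 (ham) / 1 (spam).
df = df.rename(columns={"v1": "label", "v2": "text"})
df["target"] = df["label"].map({"ham": 0, "spam": 1})

print(df)
```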

Model Pipeline :


Accuracy Result :

The classifiers were compared on their overall accuracy and precision.

Naive Bayes (NB) had the best accuracy and precision, so it was chosen as the final model.
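To illustrate why precision is considered alongside accuracy, here is a small sketch with made-up predictions from two hypothetical models (`model_a`, `model_b`). These numbers are not the project's results. Both models get the same accuracy, but only one never flags a ham message as spam.

```python
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical ground truth: 6 ham (0) and 4 spam (1).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

# Made-up predictions for two hypothetical models.
predictions = {
    "model_a": [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],  # misses one spam, never flags ham
    "model_b": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],  # catches all spam, flags one ham
}

scores = {}
for name, y_pred in predictions.items():
    scores[name] = {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
    }
    print(name, scores[name])

# Rank by precision first, then accuracy: a false spam flag hides a real message.
best = max(scores, key=lambda n: (scores[n]["precision"], scores[n]["accuracy"]))
print("selected:", best)
```

Ranking by precision first is a design choice for this example: in a spam filter, a ham message wrongly marked as spam is usually more costly than a spam message that slips through.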