
A Powerful Spam Mail Identifier.

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Inbox Defender

  • A web application that detects whether a mail or message entered by the user is spam.
  • Data preprocessing is done with NLTK, and the user's input is classified with classification algorithms (Random Forest classifier, Logistic Regression, Support Vector Machine).

CREATED BY

DATASET

Thanks to Kaggle, you can find my dataset here.

The SMS Spam Collection is a set of tagged SMS messages that have been collected for SMS spam research. It contains one set of 5,574 SMS messages in English, each tagged as either ham (legitimate) or spam.

STEPS

  • Understanding data with EDA using Pandas and Matplotlib
  • Preprocessing using NLTK:
    • Remove punctuation and special symbols
    • Convert to lowercase
    • Tokenize into words
    • Remove stopwords
  • Create word clouds for visual representation of words in Spam and Ham messages
  • Convert messages into numerical representation with TF-IDF
  • Train several machine learning models (Naive Bayes, SVM, Random Forest, Logistic Regression)
  • Evaluate models based on accuracy, precision, recall, and F1-score
  • Determine the best-performing model, use it to classify new messages, and deploy it with Django
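
The four preprocessing sub-steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's notebook code: the function name preprocess and the tiny STOPWORDS set are assumptions made for the example, and the whitespace split stands in for NLTK's tokenizer. In the project itself, NLTK supplies the tokenizer and the English stopword list.

```python
# Hypothetical stand-in for NLTK's English stopword list (assumption for this sketch).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "you", "have", "and", "of", "in", "for", "now"}


def preprocess(message: str) -> list:
    """Turn a raw message into a list of cleaned tokens."""
    # 1. Remove punctuation and special symbols: keep letters, digits, and whitespace,
    #    replacing everything else with a space so adjacent words do not merge.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in message)
    # 2. Convert to lowercase.
    cleaned = cleaned.lower()
    # 3. Tokenize into words (plain whitespace split, standing in for an NLTK tokenizer).
    tokens = cleaned.split()
    # 4. Remove stopwords.
    return [token for token in tokens if token not in STOPWORDS]


print(preprocess("WINNER!! You have won a FREE prize, call now."))
# ['winner', 'won', 'free', 'prize', 'call']
```

The cleaned tokens can then be joined back into a string and passed to a TF-IDF vectorizer.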

TOOLS

  • Python
  • Jupyter Notebook : model building
  • Backend : Django
  • Frontend : HTML/CSS

ALGORITHMS

  • Logistic Regression
  • Random Forest classifier
  • Support Vector Machine (SVM)
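
As a rough sketch of how these three algorithms could be compared on TF-IDF features, the example below assumes scikit-learn, which this README does not name explicitly. The toy messages, the function name evaluate_models, the hyperparameters, and the choice of LinearSVC as the SVM variant are all assumptions for illustration. The project itself trains on the SMS Spam Collection.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data invented for this sketch; replace with the real train/test split.
train_texts = [
    "win a free prize now", "free cash call now", "claim your free reward today",
    "urgent winner claim prize", "are we meeting for lunch", "see you at home tonight",
    "can you call me later", "thanks for the help yesterday",
]
train_labels = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]
test_texts = ["free prize winner call now", "see you at lunch later"]
test_labels = ["spam", "ham"]


def evaluate_models() -> dict:
    """Fit each candidate on TF-IDF features and collect the four metrics."""
    candidates = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "SVM": LinearSVC(),
    }
    results = {}
    for name, classifier in candidates.items():
        # The pipeline fits the vectorizer on training text only, then the classifier.
        model = make_pipeline(TfidfVectorizer(), classifier)
        model.fit(train_texts, train_labels)
        predicted = model.predict(test_texts)
        precision, recall, f1, _ = precision_recall_fscore_support(
            test_labels, predicted, pos_label="spam", average="binary", zero_division=0
        )
        results[name] = {
            "accuracy": accuracy_score(test_labels, predicted),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results


results = evaluate_models()
for name, metrics in results.items():
    print(name, metrics)
```

Wrapping the vectorizer and classifier in one pipeline means the same fitted object can later be saved and called from the Django backend on a single raw message.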

WEB PAGE

image

OUTPUT

image