/Text_Messgae_Spam_Classifier



📌 Introduction :-

This project applies machine learning to SMS data to predict whether a message is spam or ham. It compares the accuracy of several classifiers: MultinomialNB, LogisticRegression, SVC, DecisionTreeClassifier, RandomForestClassifier, KNeighborsClassifier, AdaBoostClassifier, BaggingClassifier, ExtraTreesClassifier, GradientBoostingClassifier, and XGBClassifier. The text is cleaned and vectorized with PorterStemmer, CountVectorizer, and TfidfVectorizer. The final model uses MultinomialNB and reaches an accuracy of 97.09%.
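As a rough illustration of the preprocessing and model named above, the following minimal sketch stems a few messages with PorterStemmer, vectorizes them with TfidfVectorizer, and fits MultinomialNB. The toy messages, the `clean_text` helper, and the stop-word removal step are assumptions made for this example and are not taken from the notebook.

```python
import re

from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

stemmer = PorterStemmer()


def clean_text(message: str) -> str:
    """Lowercase, keep alphanumeric tokens, drop stop words, stem the rest.

    Hypothetical helper: the notebook's exact cleaning steps may differ.
    """
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    return " ".join(stemmer.stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS)


# Toy messages for illustration only; not taken from the real dataset.
messages = [
    "WINNER!! Claim your free prize now",
    "Free entry to win cash, text WIN to claim",
    "Are we still meeting for lunch today?",
    "I will call you when I get home",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham

# Clean first, then turn the cleaned strings into TF-IDF features.
cleaned = [clean_text(m) for m in messages]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(cleaned)

model = MultinomialNB()
model.fit(X, labels)

# New messages must go through the same cleaning and the same fitted vectorizer.
new_message = clean_text("Claim your free cash prize")
prediction = model.predict(vectorizer.transform([new_message]))[0]
print("spam" if prediction == 1 else "ham")
```

The vectorizer is fitted once on the training messages and only `transform` is called at prediction time, so the feature columns stay aligned with what the model saw during training.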

✔❌Accuracy :-

| Text Preprocessing Type | GaussianNB | MultinomialNB | BernoulliNB |
|---|---|---|---|
| TFIDF Vectorizer + PorterStemmer | 86.94% | 97.09% | 98.35% |
| CountVectorizer + PorterStemmer | 88.00% | 96.42% | 97.00% |
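A comparison like the one in the table could be produced with a loop over vectorizers and Naive Bayes variants. The sketch below is a hedged example on made-up toy messages (stemming is omitted for brevity), so its scores are not comparable to the figures above; the `results` dictionary and the sample texts are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Toy data for illustration only; 1 = spam, 0 = ham.
train_texts = [
    "win a free prize now",
    "free cash claim your reward",
    "urgent offer win money today",
    "see you at lunch tomorrow",
    "can you send me the notes",
    "running late will be home soon",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["claim your free money", "are you home for lunch"]
test_labels = [1, 0]

vectorizers = {"TFIDF Vectorizer": TfidfVectorizer, "CountVectorizer": CountVectorizer}
models = {"GaussianNB": GaussianNB, "MultinomialNB": MultinomialNB, "BernoulliNB": BernoulliNB}

results = {}
for vec_name, vec_cls in vectorizers.items():
    vec = vec_cls()
    # GaussianNB does not accept sparse matrices, so dense arrays are used for all models.
    X_train = vec.fit_transform(train_texts).toarray()
    X_test = vec.transform(test_texts).toarray()
    for model_name, model_cls in models.items():
        model = model_cls().fit(X_train, train_labels)
        accuracy = accuracy_score(test_labels, model.predict(X_test))
        results[(vec_name, model_name)] = accuracy

for (vec_name, model_name), accuracy in results.items():
    print(f"{vec_name:18s} {model_name:14s} {accuracy:.2%}")
```

Converting to dense arrays is fine for a small corpus; on a large vocabulary it can use a lot of memory, which is one practical reason to prefer MultinomialNB or BernoulliNB on sparse input.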

🏁 Datasets Used:-

  • The dataset used is the SMS Spam Collection dataset published by UCI Machine Learning. It is also available on Kaggle, where it can be downloaded.

📎 Workflow :-

  • Loading Data
  • Data Cleaning
  • EDA
  • Data Preprocessing
  • Classification Model Building
  • Classification Model Testing
  • Model Results
  • Performance Evaluation
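The workflow steps above can be sketched end to end as follows. This is a minimal outline under stated assumptions: the inline `RAW_CSV` sample and its `label`/`text` column names are hypothetical stand-ins for the real dataset file, and the cleaning step shown (dropping empty and duplicate messages) is only one plausible choice.

```python
import csv
import io

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Loading data: inline stand-in for the real CSV (column names are assumed).
RAW_CSV = """label,text
ham,See you at the station at six
ham,Can you send me the report tomorrow
ham,Happy birthday hope you have a great day
ham,I am running late for dinner
ham,Thanks for the lift this morning
ham,Did you finish the homework
spam,Win a free phone today reply WIN
spam,Urgent you have won a cash prize call now
spam,Free ringtones text TONE to claim
spam,Congratulations you won a free holiday
spam,Claim your free prize reply YES now
spam,Cheap loans approved text LOAN now
spam,Win a free phone today reply WIN
"""
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# Data cleaning: drop empty and duplicate messages, encode labels as 0/1.
seen, texts, labels = set(), [], []
for row in rows:
    text = row["text"].strip()
    if not text or text in seen:
        continue
    seen.add(text)
    texts.append(text)
    labels.append(1 if row["label"] == "spam" else 0)

# EDA: check the class balance before modelling.
spam_share = sum(labels) / len(labels)
print(f"messages: {len(texts)}, spam share: {spam_share:.0%}")

# Data preprocessing and model building on a stratified train/test split.
train_txt, test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_txt)
X_test = vectorizer.transform(test_txt)
model = MultinomialNB().fit(X_train, y_train)

# Model testing and performance evaluation.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(f"accuracy: {accuracy:.2%}, precision: {precision:.2%}")
print(matrix)
```

Precision is reported next to accuracy because, for a spam filter, wrongly flagging a legitimate message is usually the more costly mistake.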

LICENSE

MIT