!pip install pandas
!pip install scikit-learn
Open a command prompt and run the command below (the leading ! is only needed inside a notebook cell):
!python SpamFilter.py
It will ask you to enter a message.
Please enter your Message :- $message
It will show whether the message is:
Spam/Not Spam
A spam filter can be built in Python with many ML and deep learning frameworks. This app uses the SVM (Support Vector Machine) algorithm, which reaches 98% accuracy on the given data.
The data comes from http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/. I downloaded it in .txt format.
Step 1: Data Preprocessing
Because the data was a .txt file, I used string slicing to separate the label and the feature, which are Type and Message respectively. I then loaded them into a Pandas DataFrame with 2 columns [ Message , Type ]. Message contains the message text and Type is the category of the message, either spam or ham.
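The separation step could look roughly like the sketch below. It assumes each line of the .txt file has the layout "<label><TAB><message>"; the sample lines and the helper name lines_to_frame are made up for illustration and are not taken from SpamFilter.py.

```python
import pandas as pd

# Hypothetical sample lines in the assumed "<label>\t<message>" layout.
sample_lines = [
    "ham\tOk lar... Joking wif u oni...",
    "spam\tFree entry in 2 a wkly comp to win FA Cup final tkts",
    "ham\tI'll call you later",
]


def lines_to_frame(lines):
    """Split each line at the first tab into a (Message, Type) row."""
    rows = []
    for line in lines:
        # partition() cuts at the first tab only, so tabs inside the
        # message text (if any) stay part of the message.
        label, _, message = line.rstrip("\n").partition("\t")
        rows.append({"Message": message, "Type": label})
    return pd.DataFrame(rows, columns=["Message", "Type"])


df = lines_to_frame(sample_lines)
print(df)
```

For the real file, the same helper could be fed the lines read from the downloaded .txt file instead of sample_lines.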
Step 2: Data Splitting and Model Selection
The data is divided into a training set and a testing set. With the data in hand, an ML or DL model has to be trained on it to produce predictions. This app uses the Scikit-learn and Pandas libraries.
train_test_split is a function that divides your data into training and testing sets using the test ratio you define. I used a 20% test ratio, which gives 2 random sets of data: 80% for training and 20% for testing. I train the SVM model on the training set and then evaluate it on the test set.
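A minimal sketch of the split and training step is shown below. The toy messages are invented, and the CountVectorizer step and the linear kernel are assumptions: the text has to be turned into numeric features before an SVM can use it, but this README does not say how SpamFilter.py does that.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Invented toy data: 5 spam and 5 ham messages.
messages = [
    "Win a free prize now",
    "Free cash offer claim now",
    "Claim your free ticket today",
    "Urgent prize waiting call now",
    "Free entry win cash today",
    "Are we still meeting for lunch",
    "I will call you later tonight",
    "Can you send me the notes",
    "See you at home tonight",
    "Thanks for the help yesterday",
]
labels = ["spam"] * 5 + ["ham"] * 5

# test_size=0.2 keeps 20% of the rows for testing and 80% for training.
# random_state and stratify are added here only to make the toy run repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.2, random_state=0, stratify=labels
)

# Assumed feature step: word counts feeding a linear-kernel SVM.
model = make_pipeline(CountVectorizer(), SVC(kernel="linear"))
model.fit(X_train, y_train)  # train on the 80% split only

print(len(X_train), "training messages,", len(X_test), "testing messages")
```

Putting the vectorizer and the classifier in one pipeline means the vocabulary is learned from the training messages only, so the test set stays unseen until evaluation.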
Step 3: Model Accuracy (Model Score)
After training the model, you can evaluate it on the remaining 20% of the data using model.score(X_test, y_test).
This model is 98% accurate using the SVM (Support Vector Machine) algorithm. We can also try other approaches, such as the classifiers in the NLTK module.
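To illustrate what model.score reports, the sketch below compares it with a hand-computed fraction of correct predictions. For scikit-learn classifiers, score returns the mean accuracy on the data passed in. The tiny train and test lists and the CountVectorizer step are assumptions made for the example, not the app's data or code.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Invented training and testing messages, already separated.
X_train = [
    "Win a free prize now",
    "Free cash offer claim now",
    "Urgent prize waiting call now",
    "Are we still meeting for lunch",
    "I will call you later tonight",
    "Can you send me the notes",
]
y_train = ["spam", "spam", "spam", "ham", "ham", "ham"]
X_test = ["Claim your free prize", "See you at lunch"]
y_test = ["spam", "ham"]

model = make_pipeline(CountVectorizer(), SVC(kernel="linear"))
model.fit(X_train, y_train)

# score() runs predict() on X_test and returns the share of correct labels.
score = model.score(X_test, y_test)

# The same number computed by hand.
predictions = model.predict(X_test)
correct = sum(p == t for p, t in zip(predictions, y_test))
manual_accuracy = correct / len(y_test)

print("model.score:", score)
print("manual accuracy:", manual_accuracy)
```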
Step 4: Final Output
The app checks the score the model gives the input message: if it is more than 0.7, it prints SPAM; otherwise it prints NOT SPAM.
As next steps, I can try more algorithms. I can also add each input message and its Type to the current dataset, which increases the amount of data and should lead to more training and a more accurate model.
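The threshold rule can be sketched as a small function. It assumes the score is a number between 0 and 1 where higher means more likely spam; how SpamFilter.py actually computes that score is not described here, and the name label_from_score is hypothetical.

```python
SPAM_THRESHOLD = 0.7  # cut-off stated in this README


def label_from_score(score, threshold=SPAM_THRESHOLD):
    """Return "SPAM" when the score is strictly above the threshold."""
    return "SPAM" if score > threshold else "NOT SPAM"


# Example scores, invented for illustration.
for example_score in (0.95, 0.7, 0.2):
    print(example_score, "->", label_from_score(example_score))
```

Because the rule is "more than 0.7", a score of exactly 0.7 is treated as NOT SPAM in this sketch.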