My first Python program: an email spam classifier.
A quick-and-dirty, one-day implementation to get started.
To run the application: python spam_classifier.py
First run, with higher cost (fewer iterations) - 76% accuracy: 1544 correct, 462 wrong
Second run, with lower cost - 88% accuracy: 1785 correct, 221 wrong
With extra words added as features - 90% accuracy: 1817 correct, 189 wrong
The runs above forgot to add the bias term when testing (sigh)
With bias added - 93% accuracy: 1866 correct, 140 wrong
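
The last improvement came only from applying the bias at test time. Here is a minimal sketch of why that matters. It assumes a linear classifier such as logistic regression; the notes mention cost and iterations but do not name the model. The names `predict` and `accuracy`, and the weights, bias and feature values, are hypothetical and not taken from spam_classifier.py.

```python
def predict(weights, bias, features):
    """Return 1 (spam) or 0 (not spam) for one feature vector.

    Assumes a linear model: thresholding the score z at 0 is the same as
    thresholding a logistic (sigmoid) output at 0.5.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if z >= 0 else 0


def accuracy(correct, wrong):
    """Fraction of test emails classified correctly."""
    return correct / (correct + wrong)


# Hypothetical values, for illustration only.
weights = [1.5, -0.5]
bias = -2.0
email = [1, 0]

with_bias = predict(weights, bias, email)    # z = -0.5, so not spam
without_bias = predict(weights, 0.0, email)  # z = 1.5, so spam

# Accuracy recomputed from the counts reported for the final run.
final_accuracy = accuracy(1866, 140)
```

If the bias is learned during training but dropped at test time, every score shifts by the same amount, so emails near the decision boundary can flip class, as the hypothetical email does here.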
PS: The Python code needs a lot of improvement.
The data set for email spam classification is taken from Kaggle.
The first ~3000 emails are used for training the model and the last ~2000 for testing it. Some of the features are spam-trigger words taken from the internet; the rest are the most frequently occurring words in the training data set.
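
The split and the feature list described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in spam_classifier.py: the function names, the toy emails, the trigger words and the use of 0/1 word-presence features are all hypothetical.

```python
from collections import Counter


def split_train_test(emails, n_train):
    """First n_train emails for training, the rest for testing."""
    return emails[:n_train], emails[n_train:]


def build_vocabulary(train_texts, trigger_words, n_frequent):
    """Trigger words first, then the n_frequent most common training words."""
    vocabulary = list(dict.fromkeys(w.lower() for w in trigger_words))
    n_target = len(vocabulary) + n_frequent
    counts = Counter(w for text in train_texts for w in text.lower().split())
    for word, _ in counts.most_common():
        if len(vocabulary) >= n_target:
            break
        if word not in vocabulary:  # skip words already in the trigger list
            vocabulary.append(word)
    return vocabulary


def to_features(text, vocabulary):
    """1 if the vocabulary word appears in the email, else 0 (assumed encoding)."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in vocabulary]


# Toy data, for illustration only.
emails = [
    "win free money now",
    "meeting at noon",
    "free money free prize",
    "lunch at noon",
]
train, test = split_train_test(emails, 3)
vocabulary = build_vocabulary(train, ["free", "winner"], n_frequent=1)
train_features = [to_features(text, vocabulary) for text in train]
test_features = [to_features(text, vocabulary) for text in test]
```

Taking the frequent words from the training emails only keeps the test emails from influencing the feature list.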