Bernoulli-Document-Model_Based-Naive-Bayes-SMS-Spam-Classification

This code is for Naive Bayes Spam Classification on the SMS Spam Collection Data Set from the UCI Machine Learning Repository.

Primary language: Python

This version of Naive Bayes is based on the Bernoulli document model. Word probabilities were computed by maximum a posteriori (MAP) parameter estimation, with a Beta(2,1) distribution as the prior.
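As a rough illustration of the estimation step, the sketch below is not the repository's code. It assumes each message is a set of words and that the first Beta parameter counts "word present". Under those assumptions, if a word appears in n_w of the N messages of a class, the posterior is Beta(n_w + 2, N - n_w + 1), and its mode gives the MAP estimate (n_w + 1) / (N + 1). The function names, the toy messages, and the eps clipping (added to avoid log(0) when an estimate equals 1) are all hypothetical.

```python
import math

def map_word_probs(docs, vocab, a=2.0, b=1.0):
    """MAP estimate of P(word present | class) under a Beta(a, b) prior.

    docs is a list of word sets for one class. The estimate is the mode of
    the Beta posterior: (n_w + a - 1) / (N + a + b - 2).
    """
    n_docs = len(docs)
    probs = {}
    for word in vocab:
        n_w = sum(1 for doc in docs if word in doc)
        probs[word] = (n_w + a - 1) / (n_docs + a + b - 2)
    return probs

def log_likelihood(doc, probs, eps=1e-9):
    """Bernoulli document model: every vocabulary word contributes,
    whether it is present in the message or absent from it."""
    total = 0.0
    for word, p in probs.items():
        p = min(max(p, eps), 1.0 - eps)  # assumption: clip to avoid log(0)
        total += math.log(p) if word in doc else math.log(1.0 - p)
    return total

def train(docs_by_class):
    """Build {label: (log_prior, word_probs)} from {label: [word sets]}."""
    vocab = set().union(*(d for docs in docs_by_class.values() for d in docs))
    n_total = sum(len(docs) for docs in docs_by_class.values())
    return {
        label: (math.log(len(docs) / n_total), map_word_probs(docs, vocab))
        for label, docs in docs_by_class.items()
    }

def classify(doc, model):
    """Return the label with the highest log posterior score."""
    return max(model, key=lambda c: model[c][0] + log_likelihood(doc, model[c][1]))

# Toy data, invented for illustration only.
toy = {
    "spam": [{"free", "prize"}, {"free", "call"}],
    "ham": [{"call", "me"}, {"see", "you"}, {"lunch", "call"}],
}
model = train(toy)
spam_label = classify({"free", "prize"}, model)
ham_label = classify({"call", "me"}, model)
```

With Beta(2,1), a word that appears in every message of a class gets an estimate of exactly 1, which is why the sketch clips probabilities before taking logarithms. Whether the original code handles that case the same way is not stated here.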

The preprocessing involved the following steps:
1) Removal of trailing spaces
2) Removal of non-words
3) Removal of stop words
4) Lemmatization
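The four steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's implementation. The stop-word list is a tiny hypothetical sample, lowercasing is an added assumption, "non-words" is interpreted as anything that is not a run of letters, and the lemmatizer is passed in as a function that defaults to the identity so the sketch runs without extra data.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "you", "for"}

def preprocess(message, stop_words=STOP_WORDS, lemmatize=lambda word: word):
    """Turn a raw SMS into the set of words used by a Bernoulli model."""
    text = message.rstrip()                        # 1) remove trailing spaces
    tokens = re.findall(r"[a-z]+", text.lower())   # 2) keep alphabetic tokens only
    tokens = [t for t in tokens if t not in stop_words]  # 3) drop stop words
    # 4) lemmatize; a set is returned because only word presence matters
    return {lemmatize(t) for t in tokens}

words = preprocess("  WINNER!! You have won a FREE prize 4 u  ")
```

A real lemmatizer can be supplied through the lemmatize argument, for example the lemmatize method of NLTK's WordNetLemmatizer, which requires the WordNet data to be downloaded first. Whether the original code uses NLTK is an assumption.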