Udemy Course

Primary Language: Python

Spam-Dectctor

• Explored the corpus of SMS messages and visualized the data with the Seaborn and Matplotlib libraries.

• Tokenized the messages, removed punctuation and stop words with the nltk library, and returned the cleaned list of words.

• Vectorized the messages with a bag-of-words model, converting each message into a numeric vector that machine learning models can work with. Obtained the resulting sparse matrix and transformed the entire bag-of-words corpus into a TF-IDF corpus with the scikit-learn library.

• Trained spam/ham classifiers with the Naïve Bayes and Random Forest algorithms and compared them. Made predictions and created classification reports.

• Created a data pipeline that bundles all the transformations for future use.
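
The steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `clean_text` and `build_pipeline` helpers, the toy messages, and the small hard-coded `STOP_WORDS` set (standing in for nltk's English stop-word list so the sketch runs without a corpus download) are all assumptions made for the example.

```python
import string

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Assumption: a tiny hard-coded set stands in for nltk's English
# stop-word list so the sketch needs no corpus download.
STOP_WORDS = {"a", "an", "the", "to", "you", "your", "is", "are",
              "at", "for", "i", "me", "now", "and"}


def clean_text(message):
    """Strip punctuation, split on whitespace, lowercase, drop stop words."""
    no_punct = "".join(ch for ch in message if ch not in string.punctuation)
    return [w.lower() for w in no_punct.split() if w.lower() not in STOP_WORDS]


def build_pipeline():
    """Chain bag-of-words counts, TF-IDF weighting, and a Naive Bayes classifier."""
    return Pipeline([
        ("bow", CountVectorizer(analyzer=clean_text)),  # tokens -> sparse count matrix
        ("tfidf", TfidfTransformer()),                  # counts -> TF-IDF weights
        ("classifier", MultinomialNB()),                # spam/ham classifier
    ])


# Hypothetical toy data; the real project uses an SMS corpus.
messages = [
    "WINNER! Claim your free prize now",
    "Free cash prize, call now to claim",
    "Are you coming to lunch at noon?",
    "See you at the meeting tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]

spam_pipeline = build_pipeline()
spam_pipeline.fit(messages, labels)
prediction = spam_pipeline.predict(["Claim a free prize"])
print(prediction[0])
```

To compare models in this setup, the final step could be replaced with `RandomForestClassifier` from `sklearn.ensemble`, and `classification_report` from `sklearn.metrics` could summarize predictions on a held-out test split.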