/Spam-Classification-a-comparison-of-4-embedding-methods

Comparing 4 word embedding techniques (TF-IDF, Word2vec, Doc2vec, RNN) in a Spam classification example.

Primary language: Jupyter Notebook

Spam Classification: comparing 4 embedding methods

Four word embedding techniques are compared on a spam text classification example. The first three (TF-IDF, word2vec, doc2vec) were each paired with a simple Random Forest classifier; the RNN classifies messages directly.
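As a rough illustration of what "an embedding paired with a simple Random Forest" can look like, here is a minimal sketch using scikit-learn. The toy messages, the labels, and the hyperparameters (`n_estimators=100`, `random_state=0`) are assumptions made for this example; they are not the dataset or settings used in the notebook.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy messages invented for illustration only (1 = spam, 0 = ham).
texts = [
    "Win a free prize now, click here",
    "Free entry in a weekly cash competition",
    "Urgent! You have won a free holiday",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
    "Can you send me the report tomorrow?",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF turns each message into a sparse weighted bag-of-words vector;
# the Random Forest is then trained on those vectors.
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(texts, labels)

new_messages = ["Claim your free prize now", "See you at lunch"]
predictions = model.predict(new_messages)
for message, label in zip(new_messages, predictions):
    print(f"{label}  {message}")
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary fitted on training data only, which matters once a real train/test split is used.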

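| Method   | Precision | Recall | Accuracy |
|----------|-----------|--------|----------|
| TF-IDF   | 100%      | 83.1%  | 97.8%    |
| word2vec | 60%       | 22.3%  | 87.7%    |
| doc2vec  | 86.8%     | 44.6%  | 91.7%    |
| RNN      | 98.7%     | 96.9%  | 93.3%    |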
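For readers less familiar with the three metrics, the sketch below shows how precision, recall, and accuracy are computed from predictions. It assumes spam is the positive class (label `1`), which the README does not state explicitly; the `classification_metrics` helper and the label lists are hypothetical and chosen only to show how a model can reach perfect precision while still missing some spam.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, accuracy) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # ham flagged as spam
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # ham passed through

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Hypothetical labels: 4 spam and 6 ham messages; one spam message is missed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

precision, recall, accuracy = classification_metrics(y_true, y_pred)
print(f"precision={precision:.1%} recall={recall:.1%} accuracy={accuracy:.1%}")
# prints: precision=100.0% recall=75.0% accuracy=90.0%
```

Because spam is usually the minority class, accuracy can stay high even when recall is low, so the three columns are best read together.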