Comment-Verification

DigikalaNext contest - 2019

A comment verification system implemented with scikit-learn's CountVectorizer, a Naive Bayes classifier, and a dataset of comments from the Digikala website.

Scikit-learn's CountVectorizer

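Transforms a given text into a vector of word counts: each entry corresponds to one word in the vocabulary learned from the whole collection of texts, and its value is the number of times that word occurs in the given text.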
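A minimal sketch of what CountVectorizer produces, assuming scikit-learn is installed. The two example comments are made up for illustration and are not taken from the Digikala dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy comments (not from the Digikala dataset).
comments = ["great phone great price", "bad phone"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(comments)  # sparse document-term matrix

# vocabulary_ maps each word to its column index; sort words by that index.
vocabulary = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
matrix = counts.toarray().tolist()

print(vocabulary)  # column order of the matrix
print(matrix)      # one row of word counts per comment
```

Each row is one comment and each column is one vocabulary word, so the word "great" appearing twice in the first comment shows up as a 2 in that row.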

Bayes' theorem

The “prior” P(A) and the “evidence” P(B) are the probabilities of observing A and B on their own, whereas the “posterior” P(A|B) and the “likelihood” P(B|A) are the conditional probabilities of observing A given B and B given A, respectively.
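For reference, the standard statement of Bayes' theorem that ties these four terms together is:

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\]
```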

In this project, the quantity we want to find is the posterior probability that a comment is spam given its contents, P(spam | x), where x is a feature vector built from the words in the given comment.
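Using Bayes' theorem, the probability that a comment with feature vector x is spam can be written as follows. This is a sketch in standard notation; the label name "spam" is taken from the wording of this README, and the exact form used in the project is an assumption.

```latex
\[
P(\text{spam} \mid x) = \frac{P(x \mid \text{spam})\, P(\text{spam})}{P(x)}
\]
```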

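The “Naive” assumption that the Naive Bayes classifier makes is that, given the class, the probability of observing a word is independent of the other words. Therefore, the probability of observing a comment given that it is spam is the product of the probabilities of seeing each of its words in a spam comment. Multiplying this product by the prior probability of spam gives a score proportional to the probability that the comment is spam.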
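The product described above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: the training comments and the "ham" label are hypothetical, add-one smoothing is added so that unseen word and class pairs do not produce a zero probability, and logarithms are summed instead of multiplying raw probabilities to avoid numerical underflow.

```python
import math
from collections import Counter

# Hypothetical toy training data (not the Digikala dataset).
training = [
    ("cheap followers cheap likes", "spam"),
    ("buy cheap followers now", "spam"),
    ("great phone and great battery", "ham"),
    ("the camera is great", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in training:
    word_counts[label].update(text.split())
    class_counts[label] += 1

vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])


def log_score(comment, label, alpha=1.0):
    """Return log P(label) + sum of log P(word | label) over the comment's words."""
    total_words = sum(word_counts[label].values())
    score = math.log(class_counts[label] / sum(class_counts.values()))
    for word in comment.split():
        if word not in vocabulary:
            continue  # skip words never seen during training
        # Add-alpha smoothing (an assumption, not stated in the source).
        p_word = (word_counts[label][word] + alpha) / (
            total_words + alpha * len(vocabulary)
        )
        score += math.log(p_word)  # sum of logs == log of the product
    return score


def classify(comment):
    """Pick the label with the highest naive Bayes score."""
    return max(("spam", "ham"), key=lambda label: log_score(comment, label))


print(classify("cheap followers"))
print(classify("great camera"))
```

scikit-learn's MultinomialNB performs a smoothed computation of this kind on the count vectors produced by CountVectorizer, so in practice the two can be combined instead of hand-writing the loop.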

parastooAflaki/Comment-Verification
