Show the probability that an SMS is spam or ham
This project is written with Flask
and is based on machine learning algorithms such as Logistic Regression.
A .sql file is attached in the Jupyter notebook.
The dataset has two columns:
- Category
- Content
First, the content needs to be preprocessed. I did as much as I could,
but if you know another solution, please tell me.
Algorithms
- MultinomialNB
- LogisticRegression
- SGDClassifier
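
As a rough illustration of how the three classifiers listed above can each produce a spam/ham probability, here is a minimal sketch using scikit-learn. The toy messages, the helper names `train_models` and `class_probabilities`, and the choice of `TfidfVectorizer` are assumptions made for this example, not taken from the project. `SGDClassifier` only exposes `predict_proba` for some losses; `modified_huber` is used here as one that does, and the project may use a different setting.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy data, made up for illustration only (not the project's dataset).
TEXTS = [
    "win a free prize now",
    "free cash claim your prize",
    "urgent call now to claim free cash",
    "are we meeting for lunch today",
    "see you at home tonight",
    "can you call me after the meeting",
]
LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]


def train_models(texts, labels):
    """Fit one vectorizer + classifier pipeline per algorithm."""
    classifiers = (
        MultinomialNB(),
        LogisticRegression(),
        # modified_huber is assumed here because it supports predict_proba.
        SGDClassifier(loss="modified_huber", random_state=0),
    )
    models = {}
    for clf in classifiers:
        model = make_pipeline(TfidfVectorizer(), clf)
        model.fit(texts, labels)
        models[type(clf).__name__] = model
    return models


def class_probabilities(models, message):
    """Return {algorithm name: {label: probability}} for one message."""
    results = {}
    for name, model in models.items():
        probs = model.predict_proba([message])[0]
        # model.classes_ gives the label order of the probability columns.
        results[name] = {
            str(label): float(p) for label, p in zip(model.classes_, probs)
        }
    return results


if __name__ == "__main__":
    trained = train_models(TEXTS, LABELS)
    for name, probs in class_probabilities(trained, "free prize now").items():
        print(name, probs)
```

Fitting once and reusing the trained pipelines keeps prediction cheap, which matters when the models sit behind a web endpoint.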
Preprocessing
- expand contractions
- convert to lowercase
- remove numbers
- remove punctuation
- remove accented characters
- remove extra whitespaces and tabs
- remove stop words
- text stemming
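
The steps above can be sketched with the Python standard library alone. This is only an approximation under stated assumptions: the contraction table, the stop word set, and the suffix-stripping `naive_stem` helper are tiny hypothetical stand-ins, and a real project would more likely rely on a library stemmer and a full stop word list. The function names are also invented for this example.

```python
import re
import string
import unicodedata

# Tiny illustrative tables (assumptions); real lists are much longer.
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "i'm": "i am",
    "you're": "you are",
    "it's": "it is",
}
STOP_WORDS = {"a", "an", "the", "is", "are", "am", "to", "of", "and",
              "you", "i", "it", "not", "do", "will"}
SUFFIXES = ("ing", "ed", "ly", "es", "s")

_CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)


def expand_contractions(text):
    """Replace known contractions, matching case-insensitively."""
    return _CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def strip_accents(text):
    """Decompose accented characters and drop the non-ASCII remainder."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Apply the listed steps in order and return the cleaned text."""
    text = expand_contractions(text)
    text = text.lower()
    text = re.sub(r"\d+", "", text)                                    # numbers
    text = text.translate(str.maketrans("", "", string.punctuation))  # punctuation
    text = strip_accents(text)
    tokens = text.split()  # splitting also collapses extra spaces and tabs
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(naive_stem(t) for t in tokens)


if __name__ == "__main__":
    print(preprocess("You're  WINNING\t2 free tickets, caf\u00e9!"))
```

Contractions are expanded before punctuation is removed, because the apostrophe is what identifies them.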
Flask application
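
A minimal sketch of how a Flask endpoint could return spam and ham probabilities for a message. The `/predict` route, the `message` JSON field, and the toy training data are all assumptions made for this example; the actual application may expose a different interface and load a model trained on the real dataset.

```python
from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data, made up for illustration only.
TEXTS = [
    "win a free prize now",
    "free cash claim your prize",
    "are we meeting for lunch today",
    "see you at home tonight",
]
LABELS = ["spam", "spam", "ham", "ham"]

# Train once at startup so each request only runs a prediction.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(TEXTS, LABELS)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    payload = request.get_json(silent=True) or {}
    message = payload.get("message", "")  # hypothetical field name
    if not message:
        return jsonify(error="field 'message' is required"), 400
    probabilities = model.predict_proba([message])[0]
    # model.classes_ gives the label order of the probability columns.
    result = {
        str(label): float(p) for label, p in zip(model.classes_, probabilities)
    }
    return jsonify(result)


if __name__ == "__main__":
    app.run(debug=True)
```

With the server running, a POST request to `/predict` with a JSON body such as `{"message": "free prize"}` returns an object with one probability per label.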
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.