Sentiment Analysis
Models Implemented:
- SGD - Stochastic Gradient Descent
- MNB - Multinomial Naive Bayes
- BNB - Bernoulli Naive Bayes
- KMC - K-Means Clustering
- MLP - Multilayer Perceptron
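Because the data arrives in batches, models such as MNB have to be updated incrementally rather than fit once on the full dataset. The following is a minimal, standalone sketch of that idea in plain Python. It is not the code in the models folder: the class name `StreamingMultinomialNB`, the `partial_fit`/`predict` methods, the whitespace tokenizer, the 1/0 labels, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class StreamingMultinomialNB:
    """Hypothetical multinomial Naive Bayes updated one mini-batch at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing strength
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # per-class token counts
        self.vocab = set()                       # all tokens seen so far

    def partial_fit(self, texts, labels):
        """Fold one batch of (text, label) pairs into the running counts."""
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        """Return the class with the highest log posterior for one text."""
        tokens = text.lower().split()
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy batches standing in for the stream (assumed labels: 1 = positive, 0 = negative).
model = StreamingMultinomialNB()
batches = [
    (["great movie loved it", "terrible plot hated it"], [1, 0]),
    (["loved the acting", "hated the ending terrible"], [1, 0]),
]
for texts, labels in batches:
    model.partial_fit(texts, labels)

print(model.predict("loved it great"))
print(model.predict("terrible hated"))
```

The model keeps only running counts, so each batch can be discarded after it has been processed, which is what makes this style of training suitable for a stream.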
The folders are as follows:
- models: contains the files for all the models listed above
- test: contains 2 test files, one for KMC and the other for all the models
- plotting: contains an ipynb file that plots results comparing all models
Instructions for running and testing models:
- To train, run stream.py and the appropriate script file for the model:
    python3 stream.py -f sentiment -b batchsize
    $SPARK_HOME/bin/spark-submit script.py > out.text
- To test, run stream.py (uncomment test) and run testfile.py with the appropriate batch size