- Install pytext (requires Python version > 3.6).
$ python3 -m venv pytext_venv
$ source pytext_venv/bin/activate
(pytext_venv) $ cd pytext
(pytext_venv) $ pip install .
- Download the Spanish GloVe embeddings to the data/embeddings directory and unzip the file. You can find the embeddings here.
- Run the main file. It takes a JSON file as input and assumes that 'text_m' is the field containing the tweet text. For each tweet, it adds a JSON object called 'extension' with the predicted sentiment and sentiment scores.
$ source pytext_venv/bin/activate
(pytext_venv) $ python3 main.py --input <input json file> --out <output json file>
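
The input and output shape described above can be sketched in Python. This is only an illustration, not code from this repository: the one-JSON-object-per-line layout and the helper names `write_input` and `read_extensions` are assumptions, and the keys inside 'extension' are not specified here, so the sketch treats that object as opaque. Only the 'text_m' and 'extension' field names come from the description above.

```python
import json


def write_input(tweets, path):
    """Write tweets to `path`, one JSON object per line (assumed layout).

    Each object stores the tweet text under 'text_m', the field main.py reads.
    """
    with open(path, "w", encoding="utf-8") as f:
        for text in tweets:
            f.write(json.dumps({"text_m": text}, ensure_ascii=False) + "\n")


def read_extensions(path):
    """Yield (text, extension) pairs from a file in the same assumed layout.

    'extension' is returned as-is; it is None when the field is absent,
    for example when the file has not been processed by main.py yet.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                tweet = json.loads(line)
                yield tweet["text_m"], tweet.get("extension")
```

If main.py expects a different layout, such as a single JSON array, only the reading and writing loops would need to change; the field names stay the same.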
- data: Contains the iSOL lexicon (for the logistic regression classifier), the TASS dataset (to train the Spanish sentiment analyzer) and the Spanish GloVe embeddings.
- logistic_regression: Code for the logistic regression classifier for Spanish sentiment analysis.
- pretrained_models: Pretrained models (logistic regression and CNN) for Spanish sentiment analysis.
- pytext: Code for the CNN classifier for Spanish sentiment analysis.
- tweetment: Code for the SVM classifier for English sentiment analysis.
- main.py: Code for the multilingual sentiment analyzer (English and Spanish).