Multilingual Sentiment Analyzer

Steps to run the code:

  $ python3 -m venv pytext_venv
  $ source pytext_venv/bin/activate
  (pytext_venv) $ cd pytext
  (pytext_venv) $ pip install .

Download the Spanish GloVe embeddings to the data/embeddings directory and unzip the file. You can find the embeddings here.
Run the main file. It takes a JSON file as input and assumes that 'text_m' is the field containing the tweet text. For each tweet, it adds a JSON object called 'extension' with the predicted sentiment and the sentiment scores.

  $ source pytext_venv/bin/activate
  (pytext_venv) $ python3 main.py --input <input json file> --out <output json file>
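
To illustrate the input and output shape described above, here is a minimal sketch, not the actual code in main.py. It assumes the input is a list of tweet objects; the helper names (annotate, dummy_classify) and the keys inside 'extension' ('sentiment', 'scores') are hypothetical, and the real analyzer may use a different layout.

```python
import json


def dummy_classify(text):
    # Placeholder standing in for the real English/Spanish models (assumption).
    # Returns a label and a per-class score dictionary.
    return "neutral", {"positive": 0.0, "negative": 0.0, "neutral": 1.0}


def annotate(tweets, classify):
    """Add an 'extension' object to every tweet that has a 'text_m' field."""
    for tweet in tweets:
        text = tweet.get("text_m")
        if text is None:
            continue  # leave tweets without text untouched
        label, scores = classify(text)
        # Key names inside 'extension' are illustrative only.
        tweet["extension"] = {"sentiment": label, "scores": scores}
    return tweets


if __name__ == "__main__":
    sample = [{"text_m": "Me encanta este lugar"}, {"id": 2}]
    print(json.dumps(annotate(sample, dummy_classify), ensure_ascii=False, indent=2))
```

Passing the classifier in as a function keeps the sketch independent of any particular model, so the same loop works for either language.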

data: Contains the iSOL lexicon (for the logistic regression classifier), the TASS dataset (to train the Spanish sentiment analyzer), and the Spanish GloVe embeddings.
logistic_regression: Code for the logistic regression classifier for Spanish sentiment analysis.
pretrained_models: Pretrained models (logistic regression and CNN) for Spanish sentiment analysis.
pytext: Code for the CNN classifier for Spanish sentiment analysis.
tweetment: Code for the SVM classifier for English sentiment analysis.
main.py: Code for the multilingual sentiment analyzer (English and Spanish).