
This project uses Natural Language Processing (NLP) for sentiment classification, built with pandas, numpy, scikit-learn and NLTK. Logistic regression, a supervised model, is the main classifier. The pickle library is used to save the trained model so it can be loaded and run anywhere.
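As a rough illustration of that workflow, here is a minimal sketch that trains a logistic regression classifier on text and round-trips it through pickle. The toy sentences, the variable names and the choice of `TfidfVectorizer` are assumptions for illustration only; the notebooks may vectorize and preprocess the text differently (for example with NLTK).

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for a Kaggle CSV loaded with pandas.
texts = [
    "I love this phone, it works great",
    "What a wonderful and happy day",
    "This is the worst service ever",
    "I hate waiting, terrible experience",
    "The package arrived on Tuesday",
    "The meeting is at noon today",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

# Assumption: TF-IDF features feeding a logistic regression classifier.
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Serialize the whole pipeline (vectorizer + classifier) to bytes.
# pickle.dump(model, file) would write the same data to a file instead.
blob = pickle.dumps(model)

# Later, or on another machine with a compatible scikit-learn version:
restored = pickle.loads(blob)
prediction = restored.predict(["I love this wonderful day"])[0]
print(prediction)
```

Pickling the full pipeline rather than only the classifier keeps the fitted vocabulary together with the model, so new text can be passed in directly after loading.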

Primary language: Python. License: MIT.


I have applied sentiment analysis to 4 different datasets built from tweets, ratings or reviews. They fall into three groups:

  1. Twitter Dataset
  2. Chat Dataset
  3. Drugs review dataset

Three types of sentiment are covered - Positive, Negative and Neutral.

Models Used

  1. Logistic Regression
  2. Multiclass Logistic Regression
  3. One-vs-Rest Logistic Regression
  4. Naive Bayes (Gaussian and Multinomial)
  5. SVM with linear or RBF kernels
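The model families listed above can be tried side by side on the same features. The following is only a sketch under assumptions: the toy sentences, the TF-IDF features and the default hyperparameters are illustrative and not taken from the notebooks, and the score is training accuracy on six sentences, so it says nothing about real performance.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.svm import SVC

# Hypothetical toy data with the three sentiment classes.
texts = [
    "I love this phone, it works great",
    "What a wonderful and happy day",
    "This is the worst service ever",
    "I hate waiting, terrible experience",
    "The package arrived on Tuesday",
    "The meeting is at noon today",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

# Assumption: TF-IDF features; fit_transform returns a sparse matrix.
X = TfidfVectorizer().fit_transform(texts)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "one-vs-rest LR": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    "multinomial NB": MultinomialNB(),
    "gaussian NB": GaussianNB(),
    "SVM (linear)": SVC(kernel="linear"),
    "SVM (rbf)": SVC(kernel="rbf"),
}

train_accuracy = {}
for name, clf in candidates.items():
    # GaussianNB does not accept sparse input, so densify for it only.
    features = X.toarray() if isinstance(clf, GaussianNB) else X
    clf.fit(features, labels)
    train_accuracy[name] = clf.score(features, labels)

for name, acc in train_accuracy.items():
    print(f"{name}: {acc:.2f}")
```

On a real dataset the comparison would use a held-out split (for example `train_test_split`) instead of scoring on the training data.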

Datasets used

  1. https://www.kaggle.com/datasets/abhi8923shriv/sentiment-analysis-dataset
  2. https://www.kaggle.com/datasets/kazanova/sentiment140
  3. https://www.kaggle.com/datasets/nursyahrina/chat-sentiment-dataset
  4. https://www.kaggle.com/datasets/mohamedabdelwahabali/drugreview

Colab Links

  1. Twitter model: https://colab.research.google.com/drive/1-IA0xgwLEJ1JcgpTRtjcTftv83ZggA-t?usp=sharing
  2. Twitter model 2: https://colab.research.google.com/drive/1H--X9_GQy2D-59URXNHzCCQ7OJBsIqpX?usp=sharing
  3. Chat model: https://colab.research.google.com/drive/1ndrjVt2CIc77pVsHQ1wa5hSrYTwkx8mC?usp=sharing
  4. Drug review model: https://colab.research.google.com/drive/1hWVuJFKAOkGlYLWJR4KrdpVoS9Zkbr-i?usp=sharing

Made by Sagardeep Das