Sentiment Analysis on Airline Tweets

Summary

  • Web application that classifies tweets about US airlines into Negative, Positive, or Neutral classes.
  • Uses NLP techniques (tokenization, n-grams, stemming, stopword removal) and machine learning algorithms (Bayesian networks, neural networks, SVM, lexical bag of words) in R and Python.
  • Compares the prediction accuracy of the supervised classifiers used.
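
As a rough illustration of the preprocessing steps listed above, the following minimal sketch tokenizes a tweet, removes stopwords, and builds bigrams using only the Python standard library. The stopword list, the function names, and the sample tweet are hypothetical and are not taken from this project; stemming is left out because it would need an external stemmer.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "was", "my", "to", "and", "of", "on", "in"}


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def remove_stopwords(tokens):
    """Keep only the tokens that are not in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n=2):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up sample tweet, not from the project's data set.
tweet = "@united the flight was delayed and my bag is lost http://t.co/x"
tokens = remove_stopwords(tokenize(tweet))
bigrams = ngrams(tokens, 2)
print(tokens)   # ['flight', 'delayed', 'bag', 'lost']
print(bigrams)  # [('flight', 'delayed'), ('delayed', 'bag'), ('bag', 'lost')]
```

The resulting tokens and n-grams are the kind of features a bag-of-words classifier would typically count.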

Built with

  1. Python
  2. R

Steps

  1. Python
    Install the following modules

    • tweepy - a Python wrapper for the Twitter API. Documentation, Streaming_how_to
    • textblob - library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
  2. R
    Packages:

    • tm package: a framework for text mining applications within R. It handles text cleaning (stemming, stopword removal, etc.) and transforms texts into a document-term matrix (DTM).
    • wordcloud package: draws word clouds from word frequencies.
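
To show how textblob could map a tweet onto the three classes named in the summary, here is a minimal sketch. It relies on textblob's built-in `sentiment.polarity` score (a value from -1 to 1); the 0.1 threshold and the function names are assumptions made for illustration, and this default lexicon-based scorer is not necessarily one of the classifiers the project compares.

```python
def label_from_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1, 1] to a class label.

    The threshold is an assumed cut-off, not a value from the project.
    """
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


def classify_tweet(text):
    """Score a tweet with textblob's default analyzer and label it."""
    # Imported here so the labelling helper above works without textblob installed.
    from textblob import TextBlob

    return label_from_polarity(TextBlob(text).sentiment.polarity)
```

After installing textblob, calling `classify_tweet("some tweet text")` returns one of the three labels; the exact label depends on textblob's lexicon and on the chosen threshold.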