Text mining and sentiment analysis of Twitter data using R and Tableau

Avis

Text Mining and Sentiment Analysis

Harvest tweets about the Samsung Galaxy and the Apple iPhone from four cities (London, Los Angeles, Delhi and New York), score the opinion expressed in each tweet, and draw statistical inferences.

R

  • Using OAuth for the R-Twitter handshake
  • Scraping Twitter using searchTwitter() in the twitteR package
  • Obtaining the text content of the tweets from the status class
  • Text cleaning using regular expressions with gsub() (removing URLs, punctuation, control characters, digits and hashes)
  • Matching each word against an opinion/sentiment lexicon
  • Scoring each hit with the Bing Liu sentiment lexicon: +1 for positive and -1 for negative
  • Identifying tweets that are copies of earlier ones and creating a data frame
  • Formatting the data frame and exporting it as a CSV
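
The cleaning, scoring and export steps above can be sketched in base R as follows. This is a minimal illustration and not the repository's actual script: the function names (clean_tweet, score_tweet), the four-word lexicons and the sample tweets are hypothetical stand-ins, and "copied" tweets are assumed here to mean exact duplicates of an earlier tweet's text. A real run would take the text from the status objects returned by searchTwitter() and load the full Hu and Liu word lists.

```r
# Hypothetical mini-lexicons; the real Hu and Liu lists hold thousands of words.
pos_words <- c("good", "great", "love", "awesome")
neg_words <- c("bad", "terrible", "hate", "slow")

# Strip URLs, control characters, punctuation (including '#') and digits,
# then lower-case and collapse runs of whitespace.
clean_tweet <- function(text) {
  text <- gsub("http\\S+", " ", text, perl = TRUE)
  text <- gsub("[[:cntrl:]]", " ", text)
  text <- gsub("[[:punct:]]", " ", text)
  text <- gsub("[[:digit:]]", " ", text)
  text <- tolower(text)
  trimws(gsub("[[:space:]]+", " ", text))
}

# +1 for every word found in the positive list, -1 for every negative one.
score_tweet <- function(text, pos = pos_words, neg = neg_words) {
  words <- strsplit(clean_tweet(text), " ", fixed = TRUE)[[1]]
  sum(words %in% pos) - sum(words %in% neg)
}

# Made-up sample tweets; the third repeats the first.
tweets <- c(
  "I love my new iPhone, great camera! http://t.co/abc123 #apple",
  "Galaxy S5 is slow and the battery is terrible :(",
  "I love my new iPhone, great camera! http://t.co/abc123 #apple"
)

scored <- data.frame(
  text = tweets,
  score = vapply(tweets, score_tweet, integer(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Flag tweets whose text repeats an earlier row.
scored$duplicate <- duplicated(scored$text)

# Write a CSV that Tableau can import (to a temp directory in this sketch).
out_file <- file.path(tempdir(), "scored_tweets.csv")
write.csv(scored, out_file, row.names = FALSE)
```

URLs are removed before punctuation so that the pattern can still recognise them; stripping punctuation first would break a link into ordinary-looking word fragments.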

Tableau

  • Import the CSV into Tableau
  • Visualize (Screenshots in "Images")


Issues with the current implementation
  • The dataset is limited because Twitter only allows scraping of the last 14 days, so the sample may be biased.
  • The text cleaning process may have removed keywords.
  • Questions and cynical or sarcastic tweets are not scored accurately.
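
The last issue follows directly from counting words without regard to context. The sketch below uses a hypothetical two-word-per-side lexicon and made-up tweets (the name naive_score is also illustrative, not taken from the repository) to show how a sarcastic tweet and a neutral question both end up with a positive score.

```r
# Hypothetical mini-lexicon; the real lists are far larger.
pos_words <- c("great", "love")
neg_words <- c("bad", "hate")

# Word-count scorer: +1 per positive word, -1 per negative word.
naive_score <- function(text) {
  cleaned <- tolower(gsub("[^[:alpha:]]+", " ", text))
  words <- strsplit(cleaned, " ", fixed = TRUE)[[1]]
  sum(words %in% pos_words) - sum(words %in% neg_words)
}

# Sarcasm: the writer is annoyed, but "great" and "love" both count as positive.
sarcastic <- naive_score("Oh great, my phone died again. Love that.")

# Question: no opinion is stated, but "great" still counts as positive.
question <- naive_score("Is the iPhone really that great?")

print(c(sarcastic = sarcastic, question = question))
```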

Sentiment lexicon via: Minqing Hu and Bing Liu. "Mining and Summarizing Customer Reviews." Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), August 22-25, 2004, Seattle, Washington, USA.