Text Mining and Sentiment Analysis
Harvest tweets about the Samsung Galaxy and Apple iPhone from four cities: London, Los Angeles, Delhi and New York. Perform opinion scoring and draw statistical inferences.
- Using OAuth for the handshake between R and Twitter
- Scraping Twitter using searchTwitter() from the twitteR package
- Obtaining the text content of the tweets from the status class
- Text cleaning using regular expressions with gsub() (removing URLs, punctuation, control characters, digits and hashes)
- Opinion/sentiment lexicon against which each word is matched
- Bing Liu's sentiment lexicon for scoring each hit: +1 for a positive word and -1 for a negative word
- Identify tweets that are copies of earlier tweets and create a data frame
- Format the data frame and export it as a CSV file
- Import the CSV file into Tableau
- Visualize (Screenshots in "Images")
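
The cleaning and scoring steps above can be sketched as follows. The project itself does this in R with gsub(); the snippet below is only a minimal Python illustration of the same idea, not the project's code. The function names clean_tweet and score_tweet, the exact regular expressions, and the tiny word lists standing in for the Bing Liu lexicon are all assumptions made for the example.

```python
import re

# Tiny stand-in word lists (assumption); the real project matches words
# against the Bing Liu opinion lexicon instead.
POSITIVE = {"love", "great", "amazing"}
NEGATIVE = {"hate", "terrible", "slow"}


def clean_tweet(text):
    """Remove URLs, punctuation (including '#'), control characters and digits."""
    text = re.sub(r"https?://\S+", " ", text)      # URLs
    text = re.sub(r"[^\w\s]|_", " ", text)         # punctuation and hashes
    text = re.sub(r"[\x00-\x1f\x7f]", " ", text)   # control characters
    text = re.sub(r"\d+", " ", text)               # digits
    # Lower-case and collapse the runs of spaces left by the substitutions.
    return " ".join(text.lower().split())


def score_tweet(text):
    """Score a tweet: +1 per positive word, -1 per negative word."""
    words = clean_tweet(text).split()
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


if __name__ == "__main__":
    tweet = "I LOVE my #iPhone 6! http://t.co/abc123"
    print(clean_tweet(tweet))
    print(score_tweet(tweet))
```

Each tweet's score is simply the count of positive hits minus the count of negative hits, so a tweet with no lexicon words scores 0.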
Issues with the current implementation
- The dataset is limited because Twitter's policy only allows scraping the last 14 days, so the sample may be biased.
- The text cleaning process may have removed keywords.
- It doesn't accurately score questions or cynical and sarcastic tweets.
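
The last two issues can be illustrated with a small sketch. It is written in Python for illustration only (the project itself is in R), and the names naive_score and drop_digits, the two-word lexicons, and the sample tweets are hypothetical. It shows how word-by-word matching gives a sarcastic tweet a positive score and a neutral question a negative one, and how removing digits discards model numbers.

```python
import re

# Hypothetical two-word lexicons, standing in for the Bing Liu lists.
POSITIVE = {"great", "love"}
NEGATIVE = {"hate", "bad"}


def naive_score(text):
    """Word-by-word lexicon score: +1 per positive word, -1 per negative word."""
    words = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def drop_digits(text):
    """Remove digits, as the cleaning step does."""
    return re.sub(r"\d+", "", text).strip()


if __name__ == "__main__":
    # Sarcasm: the only lexicon hit is "great", so the complaint scores positive.
    print(naive_score("Oh great, my phone froze again"))
    # Question: "bad" is counted even though no opinion is stated.
    print(naive_score("Is the battery really that bad?"))
    # Digit removal: the model numbers are lost.
    print(drop_digits("Galaxy S6 vs iPhone 6"))
```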