Overview

This code is an extension to the Sanders Analytics Twitter sentiment corpus, originally designed for training and testing Twitter sentiment analysis algorithms.

Updated

Updated to be Python 3 compatible and run against the current Twitter API.

Running this script will generate ~5K hand-classified tweets. For more information, please refer to 'readme.pdf'.

Installation

  1. Install the tweepy library:
     pip install tweepy
  2. Create a Twitter app, and update the global authentication properties in 'install.py'.
  3. Run python install.py. If it fails with an SSL error, run it as a superuser: sudo python install.py.
  4. Hit enter three times to accept the defaults (make sure the rawdata folder exists), or set your own paths.
  5. Wait until completion (~1.5h). You should have a new file called full-corpus.csv with the entire labeled dataset.
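
Two of the steps above can be checked from Python: making sure the rawdata folder exists before the run, and sanity-checking full-corpus.csv afterwards. The following is only a minimal sketch and is not part of install.py. The helper names `ensure_rawdata_dir` and `count_labels` are hypothetical, and the `Sentiment` column name is an assumption about the CSV header, so check the first line of your own file and adjust it if needed.

```python
import csv
import os
from collections import Counter


def ensure_rawdata_dir(path="rawdata"):
    """Create the raw-data folder if it is missing and return its path."""
    os.makedirs(path, exist_ok=True)
    return path


def count_labels(csv_path="full-corpus.csv", label_column="Sentiment"):
    """Count rows per label value.

    The default column name is an assumption, not taken from install.py.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if not reader.fieldnames or label_column not in reader.fieldnames:
            raise KeyError(
                f"column {label_column!r} not found; header is {reader.fieldnames}"
            )
        return Counter(row[label_column] for row in reader)


if __name__ == "__main__":
    # Before running install.py: make sure the default folder exists.
    ensure_rawdata_dir()
    # After install.py completes: summarize the labeled dataset, if present.
    if os.path.exists("full-corpus.csv"):
        for label, count in count_labels().most_common():
            print(label, count)
```

If the total comes out well below ~5K, some tweets were most likely no longer retrievable when the script ran.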

Enjoy data-mining :)!