This code extends the Sanders Analytics Twitter sentiment corpus, which was originally designed for training and testing Twitter sentiment analysis algorithms.
See also guyz / twitter-sentiment-dataset.
Running this script will generate ~5K hand-classified tweets. For more information, please refer to 'readme.pdf'.
- Install the tweepy library: `pip install tweepy`
- Create a Twitter app, and update the global authentication properties in 'install.py'.
- Run `python install.py`. If it fails with an SSL error, run it as a superuser: `sudo python install.py`.
- Press Enter three times to accept the defaults (make sure the rawdata folder exists), or set your own paths.
- Wait until completion (~1.5h). You should then have a new file called `full-corpus.csv` containing the entire labeled dataset.
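
Two of the steps above are easy to get wrong: the rawdata folder must exist before the run, and the output is worth a quick check afterwards. The following is a minimal Python 3 sketch, not part of 'install.py'. The helper names `ensure_rawdata_dir` and `count_labels` are hypothetical, and the code assumes that `full-corpus.csv` has a header row with a column named `Sentiment`; if the actual header differs, pass the right column name.

```python
import csv
import os
from collections import Counter


def ensure_rawdata_dir(path="rawdata"):
    """Create the raw-data folder if it is missing; do nothing if it exists."""
    os.makedirs(path, exist_ok=True)
    return path


def count_labels(csv_path, label_column="Sentiment"):
    """Count rows per label in the generated corpus.

    Assumption: the file has a header row and `label_column` names the
    sentiment column. Adjust `label_column` to match the real header.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        return Counter(row[label_column] for row in reader)
```

Call `ensure_rawdata_dir()` before starting the download, and `count_labels("full-corpus.csv")` once it finishes. A total well below ~5K may simply mean that some tweets are no longer available.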