Twitter-Search-and-Stream

This repo contains the code to scrape tweets using various APIs provided by Twitter.

Dependencies

Python: Python 2
Package: Twython

File structure

Helper.py: Contains the key functionality for scraping data from Twitter by calling Twitter's API.
search-Andrew.py: Calls the functions in Helper.py to perform the two main tasks in this project:
- (1) Search for all tweets that contain a specific keyword within a certain timeframe (the hashtag mode)
- (2) Download all past tweets in a user's timeline (the timeline mode)
stream.py: (Currently not used in the project.) Running this script can fetch the most recent posts of users as live streams.
siftuser.py: This file is not directly related to searching and downloading tweets. This

The pipeline

Prepare the list of keywords and save it in targetfile (as described below in the parameter section).
Run search-Andrew.py in hashtag mode.
Sift out users based on their profiles and save the user list to another targetfile.
Run search-Andrew.py in user mode.
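
As a rough illustration of the two scraping steps in the pipeline, the sketch below shows what a keyword search and a full timeline download could look like with Twython. The function names search_keyword and fetch_timeline are hypothetical and are not taken from Helper.py; the only Twython calls assumed are client.search and client.get_user_timeline, and the client is expected to be an authenticated Twython instance. The real code in this repo may be organized differently.

```python
def search_keyword(client, keyword, count=100):
    """Hashtag mode (sketch): return one page of tweets matching a keyword."""
    response = client.search(q=keyword, count=count)
    return response.get("statuses", [])


def fetch_timeline(client, screen_name, page_size=200, max_pages=16):
    """Timeline mode (sketch): page backwards through a user's timeline."""
    tweets = []
    max_id = None
    for _ in range(max_pages):
        params = {"screen_name": screen_name, "count": page_size}
        if max_id is not None:
            params["max_id"] = max_id
        page = client.get_user_timeline(**params)
        if not page:
            break  # no older tweets are available
        tweets.extend(page)
        # Next request: only tweets strictly older than the oldest one seen.
        max_id = min(tweet["id"] for tweet in page) - 1
    return tweets
```

Paging with max_id avoids fetching the same tweets twice, since each request only asks for tweets older than those already collected. The max_pages cap is an arbitrary safety limit chosen for this sketch.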

Credentials

A Twitter developer credential is needed to run the code.
My personal credentials are stored in credential1.txt and credential2.txt. They can be used to test the code.
However, I strongly recommend applying for new ones for future research.
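
For readers setting up their own credentials, here is a minimal sketch of loading a credential file and building a Twython client from it. The file layout is an assumption made for illustration only (four non-empty lines: app key, app secret, OAuth token, OAuth token secret); the actual format of credential1.txt and credential2.txt may differ, and load_credentials and make_client are hypothetical helpers, not functions from this repo.

```python
def load_credentials(path):
    """Read a credential file (ASSUMED layout: four non-empty lines)."""
    with open(path) as handle:
        values = [line.strip() for line in handle if line.strip()]
    if len(values) != 4:
        raise ValueError("expected 4 credential lines, found %d" % len(values))
    keys = ("app_key", "app_secret", "oauth_token", "oauth_token_secret")
    return dict(zip(keys, values))


def make_client(path):
    """Build an authenticated Twython client from a credential file."""
    # Imported here so load_credentials works without Twython installed.
    from twython import Twython
    return Twython(**load_credentials(path))
```

With a file in that layout, make_client("credential1.txt") would return a client that can be passed to the scraping code.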

The parameters of the code

To make search-Andrew.py callable from the command line, I added argument parsing to the program. The parameters and their meanings are as follows. Please see the documentation of Python's argparse package for detailed instructions.