Usage

Get Twitter API keys
- Create a file called APIKeys.json and store your API keys in it. You can use APIkeyexample.txt as a reference.
- Note that APIKeys.json is excluded by the .gitignore, so it will not be pushed to git unless you change the .gitignore.
Generate tweets for a user or set of users
- Navigate to the src directory
- Run python main.py --names <NAME1> <NAME2> ... where each <NAMEi> is replaced with a Twitter handle.
- The script pulls the tweets for each handle and saves them to the data directory.
- It also prints the generated tweets to the console.
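
As a rough illustration of how a --names option that accepts several handles can be parsed, here is a minimal sketch using argparse. The function name parse_args and the help text are assumptions made for this example; the actual main.py may define its options differently.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser for illustration only; not taken from main.py.
    parser = argparse.ArgumentParser(
        description="Generate tweets for one or more Twitter handles."
    )
    # nargs="+" collects one or more values into a list.
    parser.add_argument(
        "--names",
        nargs="+",
        required=True,
        metavar="NAME",
        help="one or more Twitter handles",
    )
    return parser.parse_args(argv)


# Equivalent to: python main.py --names Harvard MIT
args = parse_args(["--names", "Harvard", "MIT"])
print(args.names)
```

With nargs="+", every handle after --names ends up in a single list, so the rest of the program can loop over args.names.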
Determine sentence similarity
- Navigate to the src directory
- Run python model_test.py <tweet_file> <K>, where <tweet_file> is the relative path to a file in the data folder (for example, ../data/Harvard.csv) and <K> is the size of the K-mers to use. K must be at least 2.
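
To make the role of K concrete, the sketch below shows one plausible way to split a sentence into K-mers, assuming a K-mer here means K consecutive whitespace-separated words. The function name extract_kmers is hypothetical and is not taken from the repository.

```python
def extract_kmers(sentence, k):
    # Assumption: a K-mer is a run of k consecutive words.
    # With k < 2 there would be no preceding words to condition on.
    if k < 2:
        raise ValueError("K must be at least 2")
    words = sentence.split()
    # Slide a window of k words across the sentence; a sentence shorter
    # than k words yields an empty list.
    return [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]


kmers = extract_kmers("the quick brown fox", 3)
print(kmers)  # [('the', 'quick', 'brown'), ('quick', 'brown', 'fox')]
```

Under this reading, a larger K gives the model more context per prediction but fewer K-mers per tweet.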

Important files

main.py: Contains code to generate sentences given a list of Twitter handles at the command line.
model_generator.py: Contains functions to generate the Markov model for a user. This includes getting tweets from a file, extracting K-mers, forming the model, and determining next words given the current K-1 words.
model_test.py: Contains functions to generate sentences from a model and test their similarity to the original tweets. Note that when run as the driver program, this file defaults to determining sentence accuracy.
twitter_extractor.py: Contains functions to connect to the Twitter API and extract tweets for one or more users.
comparison.py: Contains functions to compare words/sentences for quantitative analysis.
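
The model-building steps described for model_generator.py (extracting K-mers, forming the model, and choosing the next word from the current K-1 words) can be sketched roughly as follows. This is a generic word-level Markov chain written under stated assumptions, not the repository's code; build_model and next_word are hypothetical names, and sampling in proportion to observed counts is an assumed design choice.

```python
import random
from collections import Counter, defaultdict


def build_model(sentences, k):
    # Map each (k-1)-word prefix to a count of the words that follow it.
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - k + 1):
            prefix = tuple(words[i:i + k - 1])
            model[prefix][words[i + k - 1]] += 1
    return model


def next_word(model, prefix, rng=random):
    # Look up the prefix without creating a new entry in the defaultdict.
    counts = model.get(tuple(prefix))
    if not counts:
        return None  # prefix never seen in the training tweets
    candidates = list(counts)
    weights = [counts[w] for w in candidates]
    # Sample a follower in proportion to how often it was observed.
    return rng.choices(candidates, weights=weights, k=1)[0]


tweets = ["go team go", "go team win"]
model = build_model(tweets, 3)
print(dict(model[("go", "team")]))      # {'go': 1, 'win': 1}
print(next_word(model, ("go", "team")))  # either 'go' or 'win'
```

A sentence generator would call next_word repeatedly, sliding the prefix forward by one word each time until it returns None or a length limit is reached.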