Project Description

Provide all R code and solutions by knitting your final RStudio file into a single file.

Using the tweets.csv data available on the GitHub site, provide code to do the following:

Identify all tweets with the word ‘flight’ in them
How many tweets end in a question mark?
How many tweets have airport codes in them? (Assume any three consecutive capital letters are an airport code.)
Identify all tweets with URLs in them
Replace all instances of repeated exclamation points with a single exclamation point
Replace consecutive exclamation points, question marks, and periods with a single period, split the tweet on periods, and create a list where each element is a vector of the split strings from each tweet
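
The six tasks above can be sketched in base R with grepl(), gsub(), and strsplit(). This is only one possible approach, not a reference solution. The sample tweets are made up for illustration, and the column that holds the tweet text in tweets.csv is not named in this description, so the sketch works on a plain character vector called tweets (a hypothetical name). It also assumes that URLs start with http:// or https://, and it reads task 6 as "every run of one or more of ! ? . becomes a single period".

```r
# Hypothetical sample data. In the real exercise this vector would come from
# the text column of tweets.csv, whose column name is not given here.
tweets <- c(
  "My flight from JFK was delayed again!!!",
  "Is anyone else stuck at the gate?",
  "Great service today. Thanks!! See http://example.com/deal",
  "Why is the Flight so late?? No updates... at all"
)

# 1. Tweets containing the word 'flight' (whole word, any capitalization)
has_flight <- grepl("\\bflight\\b", tweets, ignore.case = TRUE, perl = TRUE)
flight_tweets <- tweets[has_flight]

# 2. Number of tweets that end in a question mark (trailing spaces allowed)
n_question <- sum(grepl("\\?\\s*$", tweets, perl = TRUE))

# 3. Number of tweets with an "airport code": any three consecutive capitals
n_airport <- sum(grepl("[A-Z]{3}", tweets))

# 4. Tweets with URLs (assumption: a URL starts with http:// or https://)
has_url <- grepl("https?://", tweets)
url_tweets <- tweets[has_url]

# 5. Collapse repeated exclamation points into a single one
single_bang <- gsub("!{2,}", "!", tweets)

# 6. Collapse runs of ! ? . into one period, then split each tweet on periods.
#    The result is a list with one character vector per tweet.
#    Note that periods inside URLs are split as well.
normalized <- gsub("[!?.]+", ".", tweets)
pieces <- lapply(strsplit(normalized, ".", fixed = TRUE), trimws)
```

Passing fixed = TRUE to strsplit() matters in step 6: without it the period is treated as a regular expression that matches every character.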

You now have the fundamental R tools to complete this exercise, but you may still have to explore new techniques and packages. You will work with the full text of the State of the Union speeches from 1790 until 2012. The speeches are all in the file stateoftheunion1790-2012.txt on the GitHub site. Read the text into R and manipulate it in order to create a data frame with the following summaries for each speech:

Hadley-Dixon/StateOfTheUnion
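
For the State of the Union task, one way to start is to read the file line by line, split the lines into speeches, and build a data frame with one row per speech. The sketch below is a rough outline under stated assumptions: it assumes the speeches are separated by lines containing only "***" (check the file before relying on this), it uses a small in-memory vector in place of readLines("stateoftheunion1790-2012.txt"), and the two summary columns (n_lines, n_words) are placeholders rather than the summaries the exercise asks for.

```r
# Stand-in for readLines("stateoftheunion1790-2012.txt").
# ASSUMPTION: speeches are delimited by lines that contain only "***".
lines <- c(
  "***",
  "First speech line one.",
  "First speech line two, a bit longer.",
  "***",
  "Second speech has one line.",
  "***"
)

is_break <- lines == "***"
# The running count of separators labels every line with a speech id
speech_id <- cumsum(is_break)
# Drop the separator lines, then group the remaining lines by speech
body <- split(lines[!is_break], speech_id[!is_break])

# Count words by splitting on anything that is not a letter or apostrophe
count_words <- function(x) {
  words <- unlist(strsplit(x, "[^A-Za-z']+"))
  sum(nzchar(words))
}

# Placeholder summaries; replace them with the ones the exercise requires
speeches <- data.frame(
  speech  = seq_along(body),
  n_lines = unname(lengths(body)),
  n_words = unname(vapply(body, count_words, integer(1))),
  stringsAsFactors = FALSE
)
```

Keeping each speech as its own character vector in body makes it straightforward to add further per-speech columns with vapply().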