Twitter-Influenza-Outbreak-Prediction

This repo contains the programming deliverable for a term project in CS535 (Big Data) at Colorado State University.

At a high level, the project combines historical Centers for Disease Control and Prevention (CDC) data with historical and real-time Twitter posts about influenza (flu) to try to predict the onset of a flu outbreak before it appears in the CDC data, which lags reality by roughly two weeks or more.

The project is built with Apache Storm (Java), with Python used for the prediction model.
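
To make the idea of spotting an outbreak early from tweet volume more concrete, here is a minimal Python sketch. It is not the model in this repo; it only illustrates one simple way such a signal could be detected, assuming weekly counts of flu-related tweets are already available. The function name `first_outbreak_week`, the parameters `baseline_weeks` and `k`, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def first_outbreak_week(weekly_counts, baseline_weeks=4, k=2.0):
    """Return the index of the first week whose tweet count exceeds
    the baseline mean plus k baseline standard deviations, or None.

    Hypothetical helper for illustration; not taken from this repo.
    baseline_weeks must be at least 2 so a standard deviation exists.
    """
    if len(weekly_counts) <= baseline_weeks:
        return None  # not enough data beyond the baseline window

    baseline = weekly_counts[:baseline_weeks]
    threshold = mean(baseline) + k * stdev(baseline)

    # Scan the weeks after the baseline window for the first spike.
    for week, count in enumerate(weekly_counts[baseline_weeks:],
                                 start=baseline_weeks):
        if count > threshold:
            return week
    return None


# Made-up weekly counts of flu-related tweets, for illustration only.
counts = [100, 110, 95, 105, 108, 180, 260]
print(first_outbreak_week(counts))
```

A fixed threshold like this is only a starting point. A real predictor would presumably be validated against the historical CDC data to check that flagged weeks actually precede reported outbreaks.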