Predicting the Distribution of Taxis in New York City

Project Website

Project Screencast Video

IPython Notebooks:

1. Setup Project

2. Data Cleaning and Aggregation on AWS using Spark

3. Spark Output to CSV

4a. Machine Learning: Random Forest Prediction - Average Days of the Week

4b. Machine Learning: Random Forest Predicting the Future - Specific Days

4c. Machine Learning: k-Nearest Neighbors Prediction - Average Days

5a. Destination Data Cleaning and Aggregation

5b. Machine Learning: Predicting Destination (Pandas and sklearn)