NYC Subway Data (Project 1)

This project was created to satisfy a requirement of the Udacity Nanodegree for Data Science. It addresses the first problem set: ridership on the NYC Subway.


The answer essay uses four images, which are included here as separate files.

Also, the zip files contain each iteration I went through while submitting this project. Short Questions.odt and Project 1 - Short Questions.pdf are the most current submission.


I had a lot of trouble understanding exactly how to interpret R2. It seems like something that experience would help with a great deal, but after reading many articles about R2, I think I understand it better now.
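
As a rough illustration of the interpretation (a minimal sketch with made-up numbers, not the project's data or code), R2, the coefficient of determination, can be computed by hand as 1 minus the ratio of the model's squared error to the squared error of always predicting the mean. The function name r_squared and the sample values are assumptions for this example only.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after using the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of the baseline that always predicts the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented ridership-like values, for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]

perfect = r_squared(actual, actual)        # predictions match exactly
baseline = r_squared(actual, [25.0] * 4)   # always predict the mean (25.0)
```

Read this way, an R2 near 1 means the model explains most of the variation around the mean, and an R2 near 0 means it does little better than predicting the mean every time.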

I am still not sure I fully understand how dummy variables are actually used. I feel I have a good understanding of what they are, but I don't understand how they affect the creation of our linear regression model in Python, or even why I have so many of them.
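
One way to look at both questions is a small sketch, assuming a categorical column of station labels (the labels, the numbers, and the helper make_dummies are all hypothetical, not taken from the project). Each distinct label becomes its own 0/1 column, so a column with hundreds of distinct values produces hundreds of dummy variables. In the simplest case, with only dummy columns and no intercept, the least-squares coefficient for each column is just the mean of the target for that category.

```python
# Hypothetical rows: (station label, hourly entries). All values are invented.
rows = [
    ("R001", 100.0),
    ("R001", 120.0),
    ("R002", 40.0),
    ("R002", 60.0),
    ("R003", 10.0),
]


def make_dummies(labels):
    """Return (sorted categories, one list of 0/1 flags per label)."""
    categories = sorted(set(labels))
    encoded = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, encoded


labels = [station for station, _ in rows]
categories, dummies = make_dummies(labels)

# With only dummy columns and no intercept, the least-squares coefficient
# of each column equals the mean target value of its category.
coefficients = []
for j in range(len(categories)):
    values = [entries for (_, entries), flags in zip(rows, dummies) if flags[j] == 1]
    coefficients.append(sum(values) / len(values))


def predict(flags):
    """Linear model: sum of coefficient * dummy value."""
    return sum(c * x for c, x in zip(coefficients, flags))
```

Because exactly one flag is 1 in each row, the prediction for a row is the coefficient of its own category. In a fuller model with other features, the dummy coefficient plays the same role as a per-category offset added on top of the rest of the prediction.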
