Optimize a tech non-profit's street team placements by determining the ideal NYC subway stations at which to deploy. Recommendations are based on an analysis of NYC MTA turnstile traffic data and philanthropic contributions by zip code.
For more information, see my blog post.
datamunge.py: parses MTA turnstile data and returns total traffic counts for each station in the given time period
analysis.py: integrates philanthropic data and determines optimal stations; returns top stations charts
data/: contains list of MTA data files used in datamunge.py
contributions.csv: contains contributions data by zip and station name for top stations
presentation/: contains pdf presentation of findings & recommendations
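To give a feel for the parsing step, here is a minimal sketch of how per-station traffic could be totaled from turnstile readings. It is not the code in datamunge.py. It assumes the files have columns named C/A, UNIT, SCP, STATION, DATE, TIME and ENTRIES, and that ENTRIES is a cumulative counter per turnstile, so traffic is the difference between consecutive readings. The function name, the max_delta cutoff and the sample rows are made up for illustration.

```python
import csv
from collections import defaultdict
from datetime import datetime


def station_entry_totals(lines, max_delta=10000):
    """Sum entries per station from cumulative turnstile counter readings."""
    reader = csv.DictReader(lines)
    # Strip any padding around header names so column lookups are reliable.
    reader.fieldnames = [name.strip() for name in reader.fieldnames]

    # Collect (timestamp, counter) readings for each physical turnstile.
    readings = defaultdict(list)
    for row in reader:
        turnstile = (row["C/A"], row["UNIT"], row["SCP"], row["STATION"])
        stamp = datetime.strptime(row["DATE"] + " " + row["TIME"],
                                  "%m/%d/%Y %H:%M:%S")
        readings[turnstile].append((stamp, int(row["ENTRIES"])))

    totals = defaultdict(int)
    for turnstile, series in readings.items():
        series.sort()
        # Difference consecutive readings; skip counter resets (negative
        # deltas) and implausibly large jumps (assumed cutoff: max_delta).
        for (_, prev), (_, curr) in zip(series, series[1:]):
            delta = curr - prev
            if 0 <= delta <= max_delta:
                totals[turnstile[3]] += delta
    return dict(totals)


# Made-up sample rows, for illustration only (not real MTA data).
SAMPLE = """C/A,UNIT,SCP,STATION,LINENAME,DIVISION,DATE,TIME,DESC,ENTRIES,EXITS
X001,R001,00-00-00,STATION A,1,IRT,04/25/2015,00:00:00,REGULAR,1000,500
X001,R001,00-00-00,STATION A,1,IRT,04/25/2015,04:00:00,REGULAR,1040,510
X001,R001,00-00-00,STATION A,1,IRT,04/25/2015,08:00:00,REGULAR,1100,530
X002,R002,00-00-00,STATION B,2,IRT,04/25/2015,00:00:00,REGULAR,900,100
X002,R002,00-00-00,STATION B,2,IRT,04/25/2015,04:00:00,REGULAR,5,101
X002,R002,00-00-00,STATION B,2,IRT,04/25/2015,08:00:00,REGULAR,25,110
"""

totals = station_entry_totals(SAMPLE.splitlines())
```

In the sample, the second turnstile's counter resets between the first two readings, so that interval is dropped rather than counted as negative traffic.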
$ git clone https://github.com/dianalam/streetteam-mta-analysis.git
Scripts were written in Python 2.7. You'll need the following modules:
matplotlib >= 1.5.1
numpy >= 1.10.1
pandas >= 0.17.1
python-dateutil >= 2.4.2
To install modules, run:
$ pip install <module>
# parse data
$ python datamunge.py
# run analysis
$ python analysis.py
Note that the repo comes with default MTA turnstile data for the period between April and May of 2015. To use data from a different time frame, download the .txt files from the MTA website and save them in the data/ directory. The script will run on all files in that directory.
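As a rough illustration of what "all files in that directory" means, a script could collect its inputs like this. This is a sketch, not the repo's actual code; the function name find_turnstile_files is hypothetical, and it assumes only files ending in .txt directly inside data/ should be picked up.

```python
import glob
import os


def find_turnstile_files(data_dir="data"):
    """Return the sorted paths of all .txt files directly inside data_dir."""
    return sorted(glob.glob(os.path.join(data_dir, "*.txt")))
```

Calling find_turnstile_files() from the repo root would list every downloaded turnstile file, so adding a new time frame only requires dropping more .txt files into data/.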
To obtain additional contributions data, visit The Chronicle of Philanthropy and input zip code information for your station.
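Once you have traffic totals and contribution figures per station, you need some way to combine them. The sketch below shows one possible approach and is not necessarily what analysis.py does: it scales each measure by its maximum and ranks stations by the product. The function name rank_stations, the scoring rule and the numbers are all assumptions for illustration, and both inputs are assumed to hold positive values.

```python
def rank_stations(traffic, contributions, top_n=3):
    """Rank stations by (scaled traffic) x (scaled contributions)."""
    # Only stations present in both data sets can be scored.
    shared = sorted(set(traffic) & set(contributions))
    if not shared:
        return []
    max_traffic = float(max(traffic[s] for s in shared))
    max_contrib = float(max(contributions[s] for s in shared))
    # Hypothetical scoring rule: product of the two max-scaled measures.
    scores = {
        s: (traffic[s] / max_traffic) * (contributions[s] / max_contrib)
        for s in shared
    }
    return sorted(shared, key=lambda s: scores[s], reverse=True)[:top_n]


# Placeholder figures, not real traffic or contributions data.
traffic = {"STATION A": 900, "STATION B": 600, "STATION C": 300}
contributions = {"STATION A": 100, "STATION B": 400, "STATION C": 200}
top_two = rank_stations(traffic, contributions, top_n=2)
```

With these placeholder figures, the busiest station does not rank first because its contributions are low, which is the kind of trade-off the analysis is meant to surface.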
Thanks to: