/FDSI2_subway_data

Project: Analyzing the NYC Subway Dataset

A data science project that analyzes the New York City subway dataset.

What's included?

The project is divided into three parts:

  • Data collection (Python, HTTP requests, pandas);
  • Data analysis (pandasql, matplotlib, numpy);
  • Data processing (MapReduce).
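
To give a feel for the MapReduce part, here is a minimal sketch in plain Python. It is not the notebook's actual code: the column names `UNIT` and `ENTRIESn_hourly`, the sample rows, and the helper names `mapper`, `reducer` and `run` are all assumptions made for illustration. The mapper emits one (key, value) pair per CSV row, and the reducer sums the values for each key after the pairs have been sorted.

```python
import csv
from itertools import groupby


def mapper(lines):
    """Emit one (unit, entries) pair per CSV row.

    The column names are assumed for this sketch, not taken from the project.
    """
    reader = csv.DictReader(lines)
    for row in reader:
        yield row["UNIT"], float(row["ENTRIESn_hourly"])


def reducer(pairs):
    """Sum the values for each key. Input must already be sorted by key."""
    for unit, group in groupby(pairs, key=lambda pair: pair[0]):
        yield unit, sum(value for _, value in group)


def run(lines):
    """Chain map, sort (standing in for the shuffle step) and reduce."""
    return dict(reducer(sorted(mapper(lines))))


# Hypothetical sample data, made up for this example.
sample = [
    "UNIT,ENTRIESn_hourly",
    "R001,10",
    "R002,5",
    "R001,7",
]

totals = run(sample)
print(totals)
```

Sorting between the two steps plays the role of the shuffle phase in a real MapReduce framework, which is why the reducer can rely on `groupby` seeing all pairs for a key together.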

Quick start

You can see the final result in analyzing-subway-data-ndfdsi.html.

Structure

│   analyzing-subway-data-ndfdsi.html   # HTML page that shows the project execution
│   analyzing-subway-data-ndfdsi.ipynb  # Jupyter notebook with all the source code and instructions
├───data # This directory is generated and holds all raw data
└───output # This directory contains all processed data
        mapper_result.txt
        reducer_result.txt

Copyright and license

Code and documentation copyright 2016-2017. Code released under the MIT License.

Authors

Original Author and Development Lead