Primary language: Jupyter Notebook. License: MIT.

SFMTA Data Analysis

Take a look at our product here!

Contributors

Labs 22

Agustin Cody Vargas, Jordan Ireland, Mathias Ragnarson Skreden

Labs 24

Austie Robinson, Isaac Grove, Jonathan Duke, Ramses Gasque

Project Overview

Product Canvas

Project Description:

This project began as a greenfield project proposed by Jarie Bolander in collaboration with Lambda School Labs
students. Its aim is to provide historical analysis of traffic flow within the SFMTA system.
We hope to give citizens, oversight committee members, and SFMTA staff accurate and timely historical data,
along with statistics and analysis, so they can make informed decisions about system-wide improvements.

We serve the reports generated from our data and analysis through datadriventransit.org.
Our raw data and analysis are available through our API.
Further information on accessing and maintaining the API can be found here.

The Front End

Architecture

architecture diagram

The current architecture uses AWS Lambda functions to continually store data in our database, which is also hosted on AWS. Another AWS Lambda function reads data from the database, generates a daily report, and saves the report back to the database. The web back end connects directly to the database to fetch those reports and passes them to the front end for display.
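The store-then-report flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual Lambda code: an in-memory SQLite database stands in for the AWS-hosted database, and the table names (`locations`, `reports`), the columns, the function names, and the "distinct vehicles per route" metric are all hypothetical.

```python
import json
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the hosted database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE locations (day TEXT, route_id TEXT, vehicle_id TEXT)")
    conn.execute("CREATE TABLE reports (day TEXT PRIMARY KEY, body TEXT)")
    return conn


def store_locations(conn, rows):
    """Ingest step: append (day, route_id, vehicle_id) observations."""
    conn.executemany(
        "INSERT INTO locations (day, route_id, vehicle_id) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def generate_daily_report(conn, day):
    """Report step: count distinct vehicles per route for one day,
    then save the result back to the database as a JSON string."""
    counts = conn.execute(
        "SELECT route_id, COUNT(DISTINCT vehicle_id) FROM locations "
        "WHERE day = ? GROUP BY route_id ORDER BY route_id",
        (day,),
    ).fetchall()
    body = json.dumps(dict(counts))
    conn.execute(
        "INSERT OR REPLACE INTO reports (day, body) VALUES (?, ?)",
        (day, body),
    )
    conn.commit()
    return body
```

In a deployment like the one described, each function would presumably be called from its own Lambda handler on a schedule, and the web back end would only ever read from the reports table.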

We also have a working Flask app, but it is currently used only for testing, and the web team does not connect to it. It could support other functionality in the future, such as requesting custom reports.

Tech Stack

Predictions

Considering the complexity, volume, and feature engineering required for this project, a significant amount of time was
invested in thinking about potential approaches to the analysis, pipeline engineering, and data storage.
Much of the exploratory and experimental work done by Labs 24 is available here;
similarly, exploratory work done by Labs 22 is available here.

We are not serving any predictions here, nor was that our goal. However, given the foundation now laid, it may be
within the grasp of a future cohort to begin actual predictive modeling on this data, for example predicting ETAs
or service disruptions.

Data Sources

Our primary source of data is NextBus, via the RestBus API. This data consists of route and
schedule data made available by SFMTA, as well as detailed vehicle-level data for every active vehicle in the SFMTA
system, every minute. A detailed breakdown of this data is available here.
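As a rough illustration of what working with the per-minute vehicle data might look like, the sketch below parses a JSON list of vehicle records into typed observations. The field names used here (`id`, `routeId`, `lat`, `lon`, `secsSinceReport`) are assumptions made for the example and should be checked against an actual API response; the `VehicleObservation` class and `parse_vehicles` function are hypothetical names, not part of this project's code.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class VehicleObservation:
    vehicle_id: str
    route_id: str
    lat: float
    lon: float
    reported_at: datetime  # estimated time the vehicle reported its position


def parse_vehicles(payload, fetched_at):
    """Turn a JSON array of vehicle records into observations.

    The report time is estimated by subtracting the record's age
    (assumed field `secsSinceReport`, defaulting to 0) from the fetch time.
    """
    observations = []
    for item in json.loads(payload):
        age = timedelta(seconds=item.get("secsSinceReport", 0))
        observations.append(
            VehicleObservation(
                vehicle_id=str(item["id"]),
                route_id=str(item["routeId"]),
                lat=float(item["lat"]),
                lon=float(item["lon"]),
                reported_at=fetched_at - age,
            )
        )
    return observations
```

Storing an estimated report time rather than the fetch time keeps observations comparable when the same vehicle appears in consecutive one-minute snapshots.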

Contributing

Please note we have a code of conduct.

Please follow it in all your interactions with the project.

Issue/Bug Request

If you are having an issue with the existing project code, please submit a bug report under the following guidelines:

  • Check first to see if your issue has already been reported.
  • Check to see if the issue has recently been fixed by attempting to reproduce the issue using the latest master branch in the repository.
  • Create a live example of the problem.
  • Submit a detailed bug report including your environment & browser, steps to reproduce the issue, actual and expected outcomes, where you believe the issue is originating from, and any potential solutions you have considered.

Feature Requests

We would love to hear from you about new features which would improve this app and further the aims of our project.

Please provide as much detail and information as possible to show us why you think your new feature should be implemented.

Pull Requests

If you have developed a patch, bug fix, or new feature that would improve this app, please submit a pull request.
It is best to discuss your ideas with the developers before investing a great deal of time in a pull request, to ensure that it will mesh smoothly with the project.

Remember that this project is licensed under the MIT license, and by submitting a pull request, you agree that your work will be, too.

Pull Request Guidelines

  • Ensure any install or build dependencies are removed before the end of the layer when doing a build.
  • Update the README.md with details of changes to the interface, including new plist variables, exposed ports, useful file locations and container parameters.
  • Ensure that your code conforms to our existing code conventions and test coverage.
  • Include the relevant issue number, if applicable.
  • You may merge the pull request once you have the sign-off of two other developers. If you do not have permission to merge, you may ask the second reviewer to merge it for you.

Attribution

These contribution guidelines have been adapted from this template.

Documentation

See Backend Documentation for details on the backend of our project.

See Front End Documentation for details on the front end of our project.