Traffic accidents analysis using Apache Spark and Google Colab

Open In Colab

This project is a statistical analysis of a large traffic accident dataset [1, 2] using Apache Spark.

It was developed in the Google Colab environment. For this purpose, both the Jupyter Notebook and the dataset are hosted on Google Drive.

You can run the code in my hosted notebook, or upload the code and set up your own working environment by following the steps in the next section.

How to set up the working environment

Set up the environment by performing the following steps:

  1. Create the Colab Notebooks and Colab Datasets folders in your Google Drive space.
  2. Import the USAccidents.ipynb Jupyter Notebook into your Colab Notebooks folder.
  3. Download the USAccidents dataset and import it into your Colab Datasets folder.
  4. Open the Jupyter Notebook in Google Colab.
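Once the notebook is open, the folder layout from the steps above can be checked before any Spark code runs. The following is a minimal sketch, not the notebook's actual code. It assumes that Google Drive is mounted at /content/drive and that its contents appear under /content/drive/MyDrive. The dataset file name US_Accidents.csv and the helper names (expected_paths, missing_paths, mount_drive, load_accidents) are hypothetical and used only for illustration.

```python
from pathlib import Path

# Assumption: Colab exposes the mounted Drive contents under this path.
DRIVE_ROOT = Path("/content/drive/MyDrive")
# Hypothetical file name; use the name of the file you actually downloaded.
DATASET_FILE = "US_Accidents.csv"


def expected_paths(drive_root=DRIVE_ROOT, dataset_file=DATASET_FILE):
    """Return the locations implied by setup steps 1 to 3."""
    return {
        "notebook": drive_root / "Colab Notebooks" / "USAccidents.ipynb",
        "dataset": drive_root / "Colab Datasets" / dataset_file,
    }


def missing_paths(paths):
    """Return the names of the expected files that do not exist yet."""
    return [name for name, path in paths.items() if not path.exists()]


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive. This only works inside a Colab runtime."""
    from google.colab import drive  # module available only in Colab
    drive.mount(mount_point)


def load_accidents(spark, csv_path):
    """Read the CSV into a Spark DataFrame, letting Spark infer column types."""
    return spark.read.csv(str(csv_path), header=True, inferSchema=True)


# Typical use in a Colab cell (left commented out because it needs Colab and PySpark):
# mount_drive()
# paths = expected_paths()
# print(missing_paths(paths))  # an empty list means the layout is in place
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.appName("USAccidents").getOrCreate()
# accidents = load_accidents(spark, paths["dataset"])
```

Keeping the path check separate from the Drive mount and the Spark read means a misplaced file is reported by name instead of surfacing later as a Spark read error.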

Note: GitHub's Jupyter Notebook renderer does not display the plots generated by Plotly (plot.ly).

References

[1] Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, and Rajiv Ramnath. “A Countrywide Traffic Accident Dataset.” arXiv preprint arXiv:1906.05409 (2019).
[2] Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. “Accident Risk Prediction based on Heterogeneous Sparse Data: New Dataset and Insights.” In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2019.