
Analysis of reported car accidents across the United States between 2016 and 2019. Python: Pandas, Matplotlib, and Plotly.

Accidents in the United States

Team AAA: Dagney Cooke, Diana Silva, Heain Yee
Project 1 (January 2020); UC Berkeley Data and Analytics Bootcamp

Large Source Data File from Kaggle

The source data file is not included in this GitHub repository because of its size. It can be downloaded directly from Kaggle (see the link under Sources below).

This project seeks to answer the following questions about car accidents in the United States.

  1. Have car accidents increased over time?
  2. Where do the most accidents occur, and where do the most severe accidents occur?
  3. When do the most accidents occur?
  4. Under what weather conditions do accidents occur?
  5. Is there any correlation between different traffic infrastructure and frequency of accidents? Between street direction and accidents?
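
As a rough illustration of how the first three questions can be approached with Pandas, the sketch below groups a tiny hand-made table by year, state, and hour of day. It is not the code from the project notebooks. The sample rows are invented, the helper function names are hypothetical, and the column names (Start_Time, State, Severity) are assumed to match the Kaggle file and should be checked against the downloaded data.

```python
import pandas as pd

# Invented sample rows for illustration only; column names are assumptions
# about the Kaggle file and should be verified against the real dataset.
sample = pd.DataFrame(
    {
        "Start_Time": [
            "2016-03-01 08:15:00",
            "2017-07-04 17:40:00",
            "2017-11-23 07:05:00",
            "2018-01-15 18:30:00",
        ],
        "State": ["CA", "TX", "CA", "NY"],
        "Severity": [2, 3, 4, 2],
    }
)
sample["Start_Time"] = pd.to_datetime(sample["Start_Time"])


def accidents_per_year(df):
    """Question 1: number of accident records in each calendar year."""
    return df.groupby(df["Start_Time"].dt.year).size()


def accidents_by_state(df):
    """Question 2: record count and mean severity per state, busiest first."""
    summary = df.groupby("State").agg(
        count=("Severity", "count"),
        mean_severity=("Severity", "mean"),
    )
    return summary.sort_values("count", ascending=False)


def accidents_by_hour(df):
    """Question 3: number of accident records for each hour of the day."""
    return df.groupby(df["Start_Time"].dt.hour).size()


per_year = accidents_per_year(sample)
by_state = accidents_by_state(sample)
by_hour = accidents_by_hour(sample)
```

With the full dataset, the same grouping calls would run on the DataFrame loaded from the downloaded CSV instead of the sample table.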

Analysis and Findings

  • You can find the executive summary of our findings in the "AAA_ExecutiveSummary" file within this repository.
  • The final presentation is available at this link.
  • Please refer to the "AAA_Data_Cleaning" notebook for details on how we prepared our dataset.
  • Please refer to the "AAA_Data_Analysis" notebook for our analysis.

Sources

Accident Data

Link to Kaggle

  • Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, and Rajiv Ramnath. "A Countrywide Traffic Accident Dataset." 2019.
  • Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. "Accident Risk Prediction based on Heterogeneous Sparse Data: New Dataset and Insights." In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2019.
Other Relevant Data