Traffic Incident Analysis in NYC and LA

University of Colorado Boulder

Machine Learning, Spring 2024 - Dr. Ami Gates

Overview

This project addresses the critical issue of traffic incidents, using extensive data from New York City and Los Angeles. Traffic accidents are a global problem: with an estimated 1.47 billion vehicles on the road, thousands of incidents occur every minute. The United States alone reports approximately 6.1 million traffic accidents annually, including roughly 40,000 fatal crashes. This project examines the patterns and underlying factors of traffic incidents in these two metropolitan areas to support more effective management and prevention strategies.

Objective

The primary goal is to analyze traffic accidents and collisions in New York City and Los Angeles, cities that each report between 500,000 and one million traffic incidents over a decade. By employing statistical methods, machine learning, and deep learning techniques, we aim to identify patterns, conduct hypothesis tests, and draw inferences that can guide efforts to reduce traffic-related incidents.

Data Source

The analysis is grounded in open city databases from New York and Los Angeles, offering a rich dataset for exploring crime and traffic incidents. This data is crucial for understanding the dynamics at play in these urban environments.

Methodology

Statistical Analysis: Perform hypothesis tests and inferences to understand the distribution and common factors in traffic incidents.
Machine Learning and Deep Learning: Utilize clustering, association rule mining, and supervised learning techniques to uncover patterns and draw comparisons between New York City and Los Angeles.
Pattern Recognition: Identify and compare patterns in both cities to support a reasonably accurate analysis of traffic incidents and to inform proposed solutions.
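
As a rough illustration of the first two steps, the sketch below runs a chi-square test of independence on incident counts by time of day and then clusters incident coordinates with k-means. It is not the project's actual pipeline. All data is synthetic, and the function names, time-of-day bins, and hotspot coordinates are assumptions made for the example; the real datasets and column names may differ.

```python
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.cluster import KMeans

# Hypothetical time-of-day bins; the real data may be binned differently.
PERIODS = ["morning", "afternoon", "evening", "night"]


def make_synthetic_counts(rng):
    """Return a 2 x 4 table of made-up incident counts (rows: cities, columns: PERIODS)."""
    city_a = rng.multinomial(5000, [0.25, 0.35, 0.25, 0.15])
    city_b = rng.multinomial(5000, [0.20, 0.30, 0.30, 0.20])
    return np.vstack([city_a, city_b])


def compare_time_of_day(counts):
    """Chi-square test of independence between city and time of day.

    A small p-value suggests the time-of-day distribution differs by city.
    """
    chi2, p_value, dof, _expected = chi2_contingency(counts)
    return chi2, p_value, dof


def make_synthetic_coords(rng, per_hotspot=200):
    """Return made-up (latitude, longitude) points scattered around two centers."""
    centers = np.array([[40.75, -73.99], [40.68, -73.94]])
    return np.vstack(
        [c + rng.normal(scale=0.01, size=(per_hotspot, 2)) for c in centers]
    )


def find_hotspots(coords, n_clusters=2):
    """Group incident locations with k-means.

    Euclidean distance on raw latitude/longitude is only an approximation,
    which is acceptable for a small area but not for long distances.
    """
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(coords)
    return labels, model.cluster_centers_


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    counts = make_synthetic_counts(rng)
    chi2, p_value, dof = compare_time_of_day(counts)
    print(f"chi2={chi2:.2f}, dof={dof}, p={p_value:.3g}")

    labels, centers = find_hotspots(make_synthetic_coords(rng))
    print("cluster sizes:", np.bincount(labels))
    print("cluster centers:\n", centers)
```

With real data, the count table would come from grouping incidents by city and time of day, and the number of clusters would need to be chosen by inspection or a criterion such as the silhouette score rather than fixed at two.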

Technologies

Statistical Analysis Tools
Python for Data Analysis and Machine Learning
Machine Learning Libraries (e.g., scikit-learn, TensorFlow)

Project Structure

data/: Contains the datasets used for analysis.
notebooks/: Jupyter notebooks with exploratory data analysis and modeling.
src/: Source code for the analysis and model training.
results/: Visualizations and results from the analysis.

Getting Started

(Provide instructions on how to set up the project, install dependencies, and run the analysis.)

Contributing

We welcome contributions from the community. Please refer to the contributing guidelines for more information on how to get involved.

License

(Include license information, if applicable.)

Acknowledgments

(Any acknowledgments to data providers, contributors, etc.)

harshith-nikhil/Traffic-Analysis