by
Amairani Garcia, Christian Bourdeau, Shan Huang
Research traffic collisions in the City of Los Angeles from 2010 to 2019 and create data visualizations from the records in the dataset.
- Who?: the demographic distribution of collision victims
- When?: the time distribution of collisions
- Where?: the relationship between location and where collisions occurred
Technology | Description |
---|---|
GitHub | HTML, CSS, AWS |
APIs | data.lacity.org, Google |
Python Libraries | Python, Pandas, Matplotlib, Seaborn, scipy.stats, NumPy, gmap |
Supporting functions | Sodapy (library), datecal, datetime, calendar, Rise (library) |
- Use Pandas to clean and format your dataset(s).
- Create a Jupyter Notebook describing the data exploration and cleanup process.
- Create a Jupyter Notebook illustrating the final data analysis.
- Use Matplotlib to create a total of 6–8 visualizations of your data (ideally, at least 2 per “question” you ask of your data).
- Save PNG images of your visualizations to distribute to the class and instructional team, and for inclusion in your presentation.
- Use at least one API, if you can find an API with data pertinent to your primary research questions.
- Create a write-up summarizing your major findings. This should include a heading for each “question” you asked of your data and a short description of your findings and any relevant plots.
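
As a rough illustration of the Pandas cleanup step listed above, the following is a minimal sketch, not the project's actual notebook code. The column names (`date_occ`, `time_occ`, `vict_age`), the sample rows, and the 0 to 120 age filter are all assumptions made up for this example; the real dataset schema may differ.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are assumptions,
# not taken from the real collision dataset.
raw = pd.DataFrame({
    "date_occ": ["2015-03-02", "2015-03-02", "2015-03-07", "not a date", "2015-03-08"],
    "time_occ": ["0830", "1745", "2310", "1200", None],
    "vict_age": ["34", "27", "-1", "45", "61"],
})


def clean_collisions(df):
    """Return a cleaned copy with typed columns and derived hour/weekday fields."""
    out = df.copy()
    # Unparseable dates and ages become NaT/NaN instead of raising.
    out["date_occ"] = pd.to_datetime(out["date_occ"], format="%Y-%m-%d", errors="coerce")
    out["vict_age"] = pd.to_numeric(out["vict_age"], errors="coerce")
    # Take the hour from the first two characters of an HHMM string.
    out["hour"] = pd.to_numeric(out["time_occ"].str[:2], errors="coerce")
    # Keep rows with a valid date, a valid hour, and a plausible age (assumed range).
    keep = out["date_occ"].notna() & out["hour"].notna() & out["vict_age"].between(0, 120)
    out = out.loc[keep].copy()
    out["hour"] = out["hour"].astype(int)
    out["weekday"] = out["date_occ"].dt.day_name()
    return out.reset_index(drop=True)


clean = clean_collisions(raw)
by_weekday = clean["weekday"].value_counts()  # counts for a "When?" style plot
print(clean)
print(by_weekday)
```

Coercing bad values to missing and then dropping them in one place keeps the cleanup rules easy to describe in the exploration notebook. A Series like `by_weekday` could then be handed to Matplotlib for one of the required visualizations.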
- 10-minute project overview
- Questions you found interesting and what motivated you to answer them
- Where and how you found the data you used to answer these questions
- The data exploration and cleanup process (accompanied by your Jupyter Notebook)
- The analysis process (accompanied by your Jupyter Notebook)
- Your conclusions, which should include a numerical summary and visualizations of that summary
- The implications of your findings: what do your findings mean?