group-project-akatsuki

Crime Vulnerability in Chicago

This project aims to study the crime rates in the city of Chicago based on weather, population, and socioeconomic variables. The goal is to provide a tool that helps policymakers, law enforcement agencies, and community members identify areas with high crime rates and prioritize resources to reduce crime.

Project Goals

The project will focus on the following research questions:

  1. What is the crime vulnerability of each census tract in Chicago, based on the Social Vulnerability Index, climate, behavior, health, and air pollution?
  2. Which features contribute most to the Crime Vulnerability Index in Chicago?
  3. Which racially marginalized communities bear a disproportionate burden of crime in Chicago?

Data Sources

The following data sources will be used:

  • Crime data from the Chicago Police Department
  • Social Vulnerability Index data from the Centers for Disease Control and Prevention (CDC)
  • Climate data from MODIS, WorldClim, SOLARGIS, and Global Wind Atlas
  • Behavior data from the Chicago Department of Public Health (CDPH)
  • Health data from the CDC
  • Air Pollution data from the US Environmental Protection Agency (EPA)

Methodology

The project will use machine learning techniques to analyze the data and answer the research questions. The following tasks will be performed:

  • Data cleaning and preprocessing
  • Exploratory data analysis
  • Feature selection and engineering
  • Modeling and evaluation
  • Interpretation and communication of results
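
The first two tasks above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the column names (`tract_id`, `primary_type`, `svi_score`, `pm25`), the sample values, and the helper `build_tract_table` are all assumptions made for the example, and pandas is assumed to be installed.

```python
import pandas as pd

# Hypothetical incident-level crime records; the column names are assumptions.
incidents = pd.DataFrame({
    "tract_id": ["17031010100", "17031010100", "17031010201", "17031010202", None],
    "primary_type": ["THEFT", "BATTERY", "THEFT", "ASSAULT", "THEFT"],
})

# Hypothetical tract-level features (an SVI score and a pollution measure).
tracts = pd.DataFrame({
    "tract_id": ["17031010100", "17031010201", "17031010202", "17031010300"],
    "svi_score": [0.82, 0.45, None, 0.30],
    "pm25": [11.2, 9.8, 10.5, 8.9],
})


def build_tract_table(incidents, tracts):
    """Clean both inputs and return one row per tract with a crime count."""
    # Cleaning: drop incidents that cannot be assigned to a tract.
    clean = incidents.dropna(subset=["tract_id"])
    # Aggregate incident rows into a per-tract crime count.
    counts = clean.groupby("tract_id").size().rename("crime_count").reset_index()
    # A left join keeps tracts with no recorded incidents; give them a count of 0.
    table = tracts.merge(counts, on="tract_id", how="left")
    table["crime_count"] = table["crime_count"].fillna(0).astype(int)
    # Preprocessing: fill missing feature values with the column median.
    feature_cols = ["svi_score", "pm25"]
    table[feature_cols] = table[feature_cols].fillna(table[feature_cols].median())
    return table


table = build_tract_table(incidents, tracts)
print(table)

# Exploratory step: pairwise correlations between the features and the target.
print(table[["svi_score", "pm25", "crime_count"]].corr())
```

Median imputation is only one possible choice here; dropping incomplete tracts or using a model-based imputer would be equally reasonable depending on how much data is missing.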

ML Models

The following machine learning models will be used to answer the research questions:

Question 1: Crime Vulnerability

  • Regression models: Linear Regression, Random Forest Regression, Gradient Boosting Regression
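
As a rough sketch of how these three regressors could be compared, the example below fits each one with scikit-learn and reports 5-fold cross-validated R². The data is synthetic and stands in for the real tract-level features; the sample size, feature count, and hyperparameters are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for tract-level data: 200 "tracts", 5 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Synthetic target driven mostly by the first two features, plus noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting Regression": GradientBoostingRegressor(random_state=0),
}

# Mean 5-fold cross-validated R^2 for each candidate model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

Because the synthetic target is linear by construction, the linear model will look strong here; that says nothing about which model will fit the real crime data best.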

Question 2: Most Influential Features

  • Feature Importance models: Random Forest, Gradient Boosting, XGBoost
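
One way to obtain such a ranking is through the impurity-based `feature_importances_` attribute that scikit-learn's tree ensembles expose after fitting. The sketch below uses synthetic data, hypothetical feature names, and a hypothetical helper `rank_features`. XGBoost is omitted only to avoid an extra dependency; its scikit-learn wrapper exposes an attribute of the same name.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Hypothetical feature names; the real project features may differ.
feature_names = ["svi_score", "mean_temp", "pm25", "smoking_rate", "noise"]

# Synthetic data in which svi_score matters most and pm25 matters a little.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(feature_names)))
y = 4.0 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(scale=0.5, size=300)


def rank_features(model, X, y, names):
    """Fit a tree ensemble and return (name, importance) pairs, highest first."""
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(names[i], float(importances[i])) for i in order]


rf_ranking = rank_features(
    RandomForestRegressor(n_estimators=200, random_state=0), X, y, feature_names
)
gb_ranking = rank_features(
    GradientBoostingRegressor(random_state=0), X, y, feature_names
)

for label, ranking in [("Random Forest", rf_ranking), ("Gradient Boosting", gb_ranking)]:
    print(label)
    for name, importance in ranking:
        print(f"  {name}: {importance:.3f}")
```

Comparing the rankings from more than one model is a simple check that a feature's importance is not an artifact of a single estimator.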

Question 3: Racially Marginalized Communities

  • Classification models: Logistic Regression, Random Forest Classifier, Support Vector Machine Classifier
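
The sketch below shows how these three classifiers might be compared with scikit-learn. How the target label is defined is not specified above, so the example assumes a hypothetical binary label (for instance, whether a tract falls in a high-burden group) and generates synthetic data with `make_classification`. Logistic regression and the SVM are wrapped in a scaling pipeline because both are sensitive to feature scale.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 300 "tracts", 6 features, hypothetical binary label.
X, y = make_classification(
    n_samples=300,
    n_features=6,
    n_informative=3,
    n_clusters_per_class=1,
    random_state=0,
)

classifiers = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest Classifier": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine Classifier": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean 5-fold cross-validated accuracy for each classifier.
accuracy = {
    name: cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    for name, clf in classifiers.items()
}

for name, acc in accuracy.items():
    print(f"{name}: mean accuracy = {acc:.3f}")
```

If the real classes turn out to be imbalanced, a metric such as F1 or balanced accuracy may be more informative than plain accuracy.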

Results

The results of the analysis will be presented in a report and a visualization dashboard for exploring crime patterns in Chicago and identifying areas of high vulnerability. The code and data will be made available in a GitHub repository.

Team Members

  • Nikita Thakur (GitHub: @nikitasthakur)
  • Karan Jogi (GitHub: @karanjogi)