This project work shows I can deal with complex real-life datasets and come up with useful insights. These findings can help shape public policy and strategies to improve communities.
I used my SQL and Python skills to look into and get insights from three different datasets on the City of Chicago's Data Portal. Here are the datasets I worked with:
Socioeconomic Indicators in Chicago: This dataset shows six main socioeconomic indicators that are important to public health, along with a "hardship index" for various Chicago community areas from 2008 to 2012.
Socioeconomic Indicators in Chicago
Chicago Public Schools: This dataset has school-level performance data that was used to create Chicago Public Schools (CPS) Report Cards for the 2011-2012 school year.
Chicago Crime Data: This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days.
-- Lets find the column names and types from Crime Table
%sql SELECT name, type, length(type) FROM PRAGMA_TABLE_INFO('CHICAGO_CRIME_DATA');
-- Query
%%sql
SELECT COUNT(*) AS TOTAL_CRIME_COUNT FROM `CHICAGO_CRIME_DATA`;
-- Lets find the column names and types from Census Table
%sql SELECT name, type, length(type) FROM PRAGMA_TABLE_INFO('CENSUS_DATA');
-- Query
%%sql
SELECT COMMUNITY_AREA_NAME, COMMUNITY_AREA_NUMBER, PER_CAPITA_INCOME
FROM `CENSUS_DATA`
WHERE PER_CAPITA_INCOME < 11000;
Problem 3 - List all case numbers for crimes involving minors?(children are not considered minors for the purposes of crime analysis)
%%sql
SELECT CASE_NUMBER, PRIMARY_TYPE, DESCRIPTION
FROM `CHICAGO_CRIME_DATA`
WHERE LOWER(PRIMARY_TYPE) LIKE '%minor%' OR
LOWER(DESCRIPTION) LIKE '%minor%'
%%sql
SELECT PRIMARY_TYPE, DESCRIPTION
FROM `CHICAGO_CRIME_DATA`
WHERE LOWER(PRIMARY_TYPE) LIKE '%kidnap%' AND
LOWER(DESCRIPTION) LIKE '%child%'
%%sql
SELECT DISTINCT PRIMARY_TYPE, LOCATION_DESCRIPTION
FROM `CHICAGO_CRIME_DATA`
WHERE LOWER(LOCATION_DESCRIPTION) LIKE '%school%'
-- Lets find the column names and types from Public School Table
%sql SELECT name, type, length(type) FROM PRAGMA_TABLE_INFO('CHICAGO_PUBLIC_SCHOOLS');
-- Query
%%sql
SELECT `Elementary, Middle, or High School` AS TYPE_OF_SCHOOL, AVG(SAFETY_SCORE) AS AVG_SAFETY_SCORE
FROM `CHICAGO_PUBLIC_SCHOOLS`
GROUP BY `Elementary, Middle, or High School`
%%sql
SELECT COMMUNITY_AREA_NAME, PERCENT_HOUSEHOLDS_BELOW_POVERTY
FROM `CENSUS_DATA`
ORDER BY PERCENT_HOUSEHOLDS_BELOW_POVERTY DESC
LIMIT 5
%%sql
SELECT COMMUNITY_AREA_NUMBER
FROM `CHICAGO_CRIME_DATA`
WHERE COMMUNITY_AREA_NUMBER IS NOT NULL
GROUP BY COMMUNITY_AREA_NUMBER
ORDER BY COUNT(*) DESC
LIMIT 1
%%sql
SELECT COMMUNITY_AREA_NAME, HARDSHIP_INDEX
FROM `CENSUS_DATA`
WHERE HARDSHIP_INDEX = (
SELECT MAX(HARDSHIP_INDEX) FROM `CENSUS_DATA`
)
%%sql
SELECT COMMUNITY_AREA_NAME
FROM `CENSUS_DATA` As Census
WHERE Census.COMMUNITY_AREA_NUMBER IN
( SELECT COMMUNITY_AREA_NUMBER
FROM `CHICAGO_CRIME_DATA`
GROUP BY COMMUNITY_AREA_NUMBER
ORDER BY Count (*) Desc
LIMIT 1
)
%%sql
SELECT DISTINCT PRIMARY_TYPE, LOCATION_DESCRIPTION
FROM `CHICAGO_CRIME_DATA`
WHERE LOWER(LOCATION_DESCRIPTION) LIKE '%school%'
Those were 10 problem questions, along with SQL queries.
This project was conducted as a part of IBM Certification course at Coursera.