Here's a summary of the key actions taken in the script:
-
Data is filtered to include records where the "continent" field is not null and sorted by the third and fourth columns.
-
A query is used to select specific columns for analysis, including location, date, total cases, new cases, total deaths, and population, ordered by location and date.
-
The script calculates the death percentage by location, dividing total deaths by total cases and converting the result into a percentage. It filters records where the location contains "states" and orders by location and date.
-
The percentage of the population affected by COVID-19 is computed by dividing total deaths by population and converting the result into a percentage.
-
The countries with the highest infection rates relative to their population are identified by calculating the percentage of the population affected.
-
The countries with the highest death counts relative to their population are identified.
-
The continents with the highest death counts are identified.
-
Global numbers for total cases, total deaths, and death percentages are calculated, grouped by date.
-
Data related to population vs. vaccinations is selected and aggregated, showing rolling people vaccinated.
-
A Common Table Expression (CTE) is used to perform the same operation with the benefit of a cleaner, reusable query.
-
A temporary table,
#PercentPopulationVaccinated
, is created to store data temporarily and calculate the percentage of the population vaccinated. -
A view named
PercentPopulationVaccinated
is created to store data for later visualization.
This SQL script is designed for in-depth data analysis and visualization in the context of COVID-19 data, combining information from the CovidDeath
and CovidVaccination
tables.