Data-Engineering-Platforms

dep

Data

Group3.final_data_sql_count.csv - Cleaned data for import, with the variables that were previously in rate form converted to counts. This was the dataset used for analysis.

Group3.raw_data.xlsx - The raw data file as it was downloaded from the USDA online data repository.

Group3.clean_final_data_with_rates.csv - Cleaned data for import, as it stood before several key variables were converted to counts from rates, per capita units, etc.

Group3.clean_state_sql.csv - cleaned data for all state-related variables.

Group3.clean_county_sql.csv - cleaned data for all county-related variables.

EER_Diagram

Group3.dimensional_model.pdf - The EER diagram of the star dimensional model

Group3.ER-like_model.pdf - A diagram of an ER-like model used for additional data analysis. This is not a true ER model and is not an optimized database. However, this data model did prove useful for some analysis tasks and helped the group learn how modeling databases in different ways can serve different use/business cases.

Other (empty)

Presentation

Group3.final_presentation.pptx - The final presentation slides in PPTX form. The presentation describes the work related to the group project and the resultant analysis.

R_Scripts

data_prep.R - R script used for data cleaning

percentage.R - R script used to convert rates to counts
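The rate-to-count conversion described for percentage.R can be illustrated with a minimal sketch. This is not the group's R code: it is written in Python for illustration, and the function names, field names (`population`, `pct_low_access`, `stores_per_1000`), and sample values are all hypothetical. It assumes rates are stored either as percentages or as per-1,000-resident figures.

```python
# Illustrative sketch only: names and sample values are hypothetical,
# not taken from percentage.R or the Group3 datasets.

def pct_to_count(rate_pct, population):
    """Convert a percentage rate (0-100) to an approximate count of people."""
    return round(rate_pct / 100 * population)


def per_1000_to_count(rate_per_1000, population):
    """Convert a per-1,000-resident rate to an approximate count."""
    return round(rate_per_1000 / 1000 * population)


# One made-up county row with a population and two rate-form variables.
row = {"population": 50000, "pct_low_access": 12.5, "stores_per_1000": 0.2}

low_access_count = pct_to_count(row["pct_low_access"], row["population"])
store_count = per_1000_to_count(row["stores_per_1000"], row["population"])
```

Rounding to whole numbers is a design choice made here for readability; an analysis that aggregates counts across counties might prefer to keep the unrounded values.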

SQL_Scripts

Group3.DDLscriptfinal.sql - script used for creation of nutrition database tables

Group3.DMLscriptfinal.sql - script used for import of data into nutrition database tables

Group3.SQLqueriesforanalysis.sql - script of SQL queries used to develop insights

Tableau

Group3.TableauDashboard1 - visualization of race and access to stores, as well as store counts

Group3.TableauDashboard2 - a repository used for all other visualizations and data exploration not included in Group3.TableauDashboard1. Most visualizations from the presentation were sourced from this file.