INFO 550: Data Science Toolkit

Ian Fowler

Final Project Repository

In 2009, a variant of the seasonal flu spread globally, beginning a pandemic. H1N1, also known as "Swine Flu" due to the source of transmission, spread to every region and caused approximately 94,481 cases and 429 deaths from April 2009 to July 2010 alone.

This repository contains an analysis of the relationships between geographic region, total number of cases, and total number of deaths associated with the H1N1 pandemic. Although the entire pandemic lasted from January 2009 to August 2010, the data set, sourced from Kaggle and provided by the WHO, only contains data from April 2009 to July 2010. After July, countries were no longer required to submit individual-level data, so the data after that point are incomplete.

The repository contains 5 sections:

The Main Project Folder

  • the final_project.rmd
  • the makefile
  • .gitignore file
  • renv directory
  • renv.lock file
  • the destination for the rendered final_project.html

The raw_data Folder

  • the archive folder containing data.csv

The code Folder

  • 00_clean-data.r
  • 01_regions.r
  • 02_region-data.r
  • 03_table1.r
  • 04_fig1.r
  • 05_map1.r
  • render_report.r

The output Folder

  • the destination for clean and segmented code.rds, table1.rds, fig1.png, & map1.png

The final_report Folder

  • the destination for the result of the containerized INFO550_FinalProject_IanFowler.html report render

The report can be built in two ways: automatically through a containerized system, or locally with an installed R system. To generate the report, follow one of the sets of instructions below:

Generating the Report Automatically

Operating the Container

For macOS using an Intel chip

The repository includes a Dockerfile and instructions to fetch an image from DockerHub that will allow for the automatic generation of the report through a containerized Ubuntu system.

Entering the "make final_report/INFO550_FinalProject_IanFowler.html" command into the terminal will automatically generate the INFO550_FinalProject_IanFowler.html report in the final_report folder.

Additionally, the "make project_image" command will build a replica of the image from DockerHub on your local computer. NOTE: this step is not necessary to automatically build the report.
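As a rough sketch, the containerized build could be wrapped as shown here. It assumes Docker is installed and running and that the commands are run from the repository root; the function name build_container_report and the directory check are hypothetical additions, not part of the repository.

```shell
# Hypothetical wrapper around the containerized build described in this README.
# Assumes Docker is installed and running, and that make is available.
build_container_report() {
  # Guard (an added assumption): the make targets only exist in the repo root.
  if [ ! -f makefile ] && [ ! -f Makefile ]; then
    echo "No makefile found; run this from the repository root." >&2
    return 1
  fi

  # Optional and not required for the report: rebuild the image locally.
  #   make project_image

  # Generate the report in the final_report folder.
  make final_report/INFO550_FinalProject_IanFowler.html
}
```

Calling build_container_report from the repository root is equivalent to entering the report command directly.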

Generating the Report Locally

Activating RENV: accessing necessary packages

To download or update the packages necessary to render the report, enter the "make install" command into the terminal.

Rendering the Report

The scripts that produce table1.rds, fig1.png, and map1.png are located in the code folder and are named 03_table1.r, 04_fig1.r, and 05_map1.r, respectively.

Entering the "make" command into the terminal will render final_project.html.

The "make clean" command will remove all contents of the output folder and the rendered report.
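Put together, the local workflow might look like the sketch below. It assumes R and make are installed and that the commands are run from the repository root; the function name render_report_locally and the directory check are hypothetical additions for illustration.

```shell
# Hypothetical wrapper around the local build described in this README.
# Assumes R and make are installed on the machine.
render_report_locally() {
  # Guard (an added assumption): the make targets only exist in the repo root.
  if [ ! -f makefile ] && [ ! -f Makefile ]; then
    echo "No makefile found; run this from the repository root." >&2
    return 1
  fi

  # Download or update the packages, then render final_project.html.
  make install && make
}

# To start over, remove the output folder contents and the rendered report:
#   make clean
```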