/Statistical-Analysis-of-terminated-recalls

In this project, I wrote R code to conduct a hypothesis test of two population proportions on the terminated food recalls dataset to see whether 2021 had fewer cases of terminated food recalls than the previous year.

[Note: you can preview files that are in PDF, R, and CSV format by clicking on the file]

Software I used:

RStudio

Description:

  • In this project, I was in charge of writing the code to analyze a food recalls dataset extracted from the FDA website. I conducted exploratory data analysis and a hypothesis test of two population proportions to determine whether 2021 had a smaller proportion of terminated food recalls than the previous year. If it did, that would suggest that the FDA has done a good job of implementing food recall regulations and that manufacturers have made efforts to correct their violating products, resulting in fewer food recalls and hence fewer terminated food recalls. I was also in charge of formatting the written report, delegating tasks to finalize the report, and helping my teammates analyze the dataset.

  • The CSV file named recallsEDA was made to convert yes/no variables into dummy variables (0 and 1) in order to conduct exploratory data analysis (EDA). The CSV file recallsEDApie was made to create a pie chart for the EDA.
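
The two steps described above (dummy coding and the two-proportion test) can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the data frame, its column names (`year`, `terminated`), and all counts are made-up placeholders, and base R's `prop.test` is assumed as the testing function.

```r
# Hypothetical toy data, NOT the FDA dataset. Column names and counts
# are placeholders chosen only to make the example runnable.
recalls <- data.frame(
  year = c(rep(2020, 200), rep(2021, 180)),
  terminated = c(rep(c("yes", "no"), times = c(120, 80)),
                 rep(c("yes", "no"), times = c(90, 90)))
)

# Convert the yes/no variable into a 0/1 dummy variable.
recalls$terminated_dummy <- ifelse(recalls$terminated == "yes", 1, 0)

# Count terminated recalls (x) and total recalls (n) for each year.
x <- tapply(recalls$terminated_dummy, recalls$year, sum)
n <- tapply(recalls$terminated_dummy, recalls$year, length)

# H0: p_2021 = p_2020   versus   H1: p_2021 < p_2020
# Putting 2021 first makes alternative = "less" test p_2021 < p_2020.
# correct = FALSE turns off the continuity correction, which matches
# the textbook two-proportion z-test.
result <- prop.test(
  x = as.vector(x[c("2021", "2020")]),
  n = as.vector(n[c("2021", "2020")]),
  alternative = "less",
  correct = FALSE
)
print(result)
```

A small p-value from this one-sided test would support the claim that the proportion of terminated recalls was lower in 2021. Whether the original analysis used a continuity correction is not stated here, so `correct = FALSE` is only one reasonable choice.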