Team members:
- Marwa Eltgani (eltgani@wisc.edu)
- Sheng Huang (shaung452@wisc.edu)
- Samach Sathitvudh (sathitvudh@wisc.edu)
- Xinyan Wang (xwang2587@wisc.edu)

The purpose of this project is to perform a modern statistical analysis of the BODYFAT dataset. Our methodology includes:
- Exploratory Data Analysis: to investigate the distributions of the variables and identify outliers
- Data cleaning: to clean the data by imputing implausible observations and to create a new cleaned dataset
- Statistical Model: to construct a model that effectively describes the variables based on the existing data, and to compare candidate models in order to select the best one
- Model diagnostics: to check whether the proposed model is appropriate for predicting body fat, using several plots and tests of the model assumptions
- Model strengths and weaknesses: to analyze the final model and its prediction performance, and to discuss possible improvements and future work
- Shiny App: to develop a web application that predicts body fat in real time based on our model
- This contains both the raw and the cleaned data sets used in the analysis.
- This contains the R scripts for the analysis, covering data reading, exploratory data analysis, data cleaning, data modeling, and model diagnostics, as well as the script for the Shiny app.
- This contains images produced by the code and analysis, such as figures, tables, and plots.
- A two-page document that summarizes the analysis, including the introduction, data cleaning, model selection and diagnostics, and the strengths and weaknesses of the model.