Advanced-Data-Wrangling

This assignment preprocesses two main data sets and then merges them. The first data set is imported, an unused variable is removed, and another variable is renamed. The data set is then scanned for missing values, which are replaced or removed using a variety of techniques: mean imputation, ratio replacement, removal, logical assumption replacement, and constant value substitution. The second main data set is built by binding two smaller data sets. Both smaller data sets are imported from a large Excel document using specialised import specifications, subsetted to produce the desired tables, and cleaned by removing blank columns. Once clean, they are bound by row, and one variable in the resulting data set is renamed. The variable data types of both main data sets are then checked and corrected, and the two are merged to form the final data set. The data types of the final data set are double-checked, which leads to one variable being converted to a factor and labelled.
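The workflow described above can be sketched in base R as follows. This is a minimal illustration on toy data, not the assignment's actual code: every data frame, column name, and factor label here is hypothetical, only mean imputation and constant value substitution are shown for the missing values, and the Excel import step is left out because it depends on the workbook's layout.

```r
# Hypothetical first data set (all names and values are assumptions).
first <- data.frame(
  id     = c(1, 2, 3, 4),
  unused = c("a", "b", "c", "d"),
  score  = c(10, NA, 30, 20),
  grp    = c("1", "2", NA, "1"),
  stringsAsFactors = FALSE
)

# Remove the unused variable and rename another.
first$unused <- NULL
names(first)[names(first) == "score"] <- "result"

# Count missing values per column.
na_counts <- colSums(is.na(first))

# Mean imputation for the numeric column.
first$result[is.na(first$result)] <- mean(first$result, na.rm = TRUE)

# Constant value substitution for the categorical column.
first$grp[is.na(first$grp)] <- "0"

# Hypothetical smaller data sets, each carrying an entirely blank column.
part_a <- data.frame(key = c("1", "2"), region = c("north", "south"),
                     blank = c(NA, NA), stringsAsFactors = FALSE)
part_b <- data.frame(key = c("3", "4"), region = c("east", "west"),
                     blank = c(NA, NA), stringsAsFactors = FALSE)

# Keep only columns that contain at least one non-missing value.
drop_blank_cols <- function(df) df[, colSums(!is.na(df)) > 0, drop = FALSE]

# Bind the cleaned tables by row, then rename the key variable.
second <- rbind(drop_blank_cols(part_a), drop_blank_cols(part_b))
names(second)[names(second) == "key"] <- "id"

# Correct the data type so the key matches the first data set.
second$id <- as.numeric(second$id)

# Merge the two main data sets on the shared key.
final <- merge(first, second, by = "id")

# Convert the grouping variable to a labelled factor.
final$grp <- factor(final$grp,
                    levels = c("0", "1", "2"),
                    labels = c("unknown", "control", "treatment"))

str(final)
```

Calling `str(final)` at the end is one way to double-check the data types after the merge, which is the step that would reveal a variable still needing conversion to a factor.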

Primary Language: R
