This script combines the data in the Human Activity Recognition Using Smartphones Dataset and writes a new file, output.csv, which contains the average of every mean and standard deviation variable for each subject and activity.
Read the CodeBook.md file for more information.
-
Please clone the repository and unzip the UCI HAR Dataset into a folder named UCI HAR Dataset within your R working directory.
-
Install the required reshape2 package with:
install.packages("reshape2")
-
Run the run_analysis.R script.
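One way to run the script from an R session is sketched below. It assumes that both run_analysis.R and the UCI HAR Dataset folder sit in the current working directory; the existence check is an illustrative addition and not part of the script itself.

```r
# Paths are relative to the R working directory (see getwd()).
data_dir <- "UCI HAR Dataset"
script <- "run_analysis.R"

# Illustrative guard: only run the script when both inputs are present.
if (dir.exists(data_dir) && file.exists(script)) {
  source(script)
} else {
  message("Expected '", data_dir, "' and '", script, "' in ", getwd())
}
```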
The script performs the following steps on the raw data described above:
- All training and test data are loaded, along with the column and activity labels
- The train and test data are merged with their respective subjects before being combined with each other
- The columns are labeled using the data found in features.txt. These labels are first processed to remove special characters known to cause issues with subsetting in R
- The activity references are replaced with the labels loaded from activity_labels.txt
- The entire dataset is subset to keep only the features that are a function of mean or standard deviation
- The subset is then reshaped so that each variable is averaged for every combination of subject and activity
- The final tidy dataset is then exported as output.csv in the working directory
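
The steps above can be sketched on a small toy dataset. This is a minimal illustration and not the contents of run_analysis.R: the feature names, values, and variable names are made up, the regular expressions are assumptions about how labels might be cleaned and selected, and the result is written to a temporary directory, whereas the script writes to the working directory.

```r
library(reshape2)

# Toy stand-ins for the raw files (made-up values, not taken from the dataset).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- data.frame(id = 1:2, label = c("WALKING", "SITTING"))

# Each data frame already has its subject and activity columns attached.
train <- data.frame(subject = c(1, 1), activity = c(1, 2),
                    matrix(c(0.2, 0.4, 0.1, 0.3, 0.9, 0.8), nrow = 2))
test <- data.frame(subject = c(1, 2), activity = c(1, 2),
                   matrix(c(0.4, 0.6, 0.3, 0.5, 0.7, 0.6), nrow = 2))

# Combine the train and test rows.
combined <- rbind(train, test)

# Assumed cleaning rule: drop everything except letters and digits.
clean_names <- gsub("[^A-Za-z0-9]", "", features)
names(combined) <- c("subject", "activity", clean_names)

# Replace the activity ids with their descriptive labels.
combined$activity <- activity_labels$label[match(combined$activity,
                                                 activity_labels$id)]

# Assumed selection rule: keep columns whose name contains "mean" or "std".
keep <- grep("mean|std", clean_names, value = TRUE)
subsetted <- combined[, c("subject", "activity", keep)]

# Melt to long form, then cast back, averaging per subject and activity.
molten <- melt(subsetted, id.vars = c("subject", "activity"))
tidy <- dcast(molten, subject + activity ~ variable, fun.aggregate = mean)

# Written to a temporary directory here to avoid overwriting a real output.csv.
out_path <- file.path(tempdir(), "output.csv")
write.csv(tidy, out_path, row.names = FALSE)
```

In this toy example, subject 1 has two WALKING rows, so they collapse into a single averaged row and `tidy` ends up with one row per subject and activity pair.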