
Getting and cleaning data assignment

The run_analysis.R script reads datasets which contain accelerometer measurements from a Samsung Galaxy S smartphone, and outputs an independent tidy data set with the average of each variable for each activity and each subject from the original data.

It requires the input datafiles:

  • X_train.txt
  • subject_train.txt
  • y_train.txt
  • features.txt
  • X_test.txt
  • subject_test.txt
  • y_test.txt

to be in the same working directory as the script. The files can be downloaded as a zip archive from the following link: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
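
A minimal sketch of fetching the archive and copying only the seven required files into the working directory. The names `required_files` and `fetch_har_files` are hypothetical and are not part of run_analysis.R. The sketch assumes the archive stores the files in subfolders, so it matches them by file name rather than by a hard-coded path.

```r
# The seven input files listed above.
required_files <- c("X_train.txt", "subject_train.txt", "y_train.txt",
                    "features.txt",
                    "X_test.txt", "subject_test.txt", "y_test.txt")

# Hypothetical helper: download the zip archive and extract only the
# required files, flattened into dest_dir.
fetch_har_files <- function(url, dest_dir = ".") {
  zip_path <- tempfile(fileext = ".zip")
  download.file(url, zip_path, mode = "wb")  # "wb" keeps the zip intact on Windows

  # List the archive contents and keep entries whose file name is required.
  listing <- unzip(zip_path, list = TRUE)$Name
  wanted <- listing[basename(listing) %in% required_files]

  # junkpaths = TRUE drops the folder structure stored in the archive.
  unzip(zip_path, files = wanted, junkpaths = TRUE, exdir = dest_dir)

  missing <- setdiff(required_files, list.files(dest_dir))
  if (length(missing) > 0) {
    stop("Missing after extraction: ", paste(missing, collapse = ", "))
  }
  invisible(file.path(dest_dir, required_files))
}
```

Calling `fetch_har_files()` with the link above as `url` should leave the files next to the script, ready for run_analysis.R.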

The script

  • Reads in the relevant training and testing data.
  • Adds the subjects, activities, and variable names to the training and test measurements.
  • Binds together the training and test datasets.
  • Selects the variables which contain the mean and standard deviation of the measurements.
  • Converts the activities to factor type and decodes the activity ids into meaningful descriptions.
  • Creates a dataset with the average of each variable for each activity and each subject.
  • Formats the final dataset so that it complies with the rules for tidy data.
  • Writes the resulting dataset to a file.
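
The steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code in run_analysis.R: the function name `summarise_har`, the column names `subject` and `activity`, the name-cleaning rule, and the `mean()`/`std()` selection pattern are all hypothetical choices. The sketch also selects the mean and standard deviation columns while reading, before binding, to keep the data frames small. The activity labels are hard-coded in the order used by the dataset's activity_labels.txt, since that file is not among the required inputs.

```r
# Activity ids 1..6, in the order given by the dataset's activity_labels.txt.
activity_labels <- c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
                     "SITTING", "STANDING", "LAYING")

# Hypothetical helper: build the tidy averages from the files in `dir`.
summarise_har <- function(dir = ".", out_file = NULL) {
  # Second column of features.txt holds the variable names.
  features <- read.table(file.path(dir, "features.txt"),
                         stringsAsFactors = FALSE)[[2]]
  # Assumption: keep only names containing "mean()" or "std()".
  keep <- grep("mean\\(\\)|std\\(\\)", features)

  # Read one part ("train" or "test") and attach subjects and activities.
  read_part <- function(part) {
    x <- read.table(file.path(dir, paste0("X_", part, ".txt")))
    x <- x[, keep, drop = FALSE]
    names(x) <- features[keep]
    subject <- read.table(file.path(dir, paste0("subject_", part, ".txt")))[[1]]
    activity <- read.table(file.path(dir, paste0("y_", part, ".txt")))[[1]]
    data.frame(subject = subject, activity = activity, x, check.names = FALSE)
  }

  # Bind training and test data.
  all_data <- rbind(read_part("train"), read_part("test"))

  # Decode activity ids into descriptive factor levels.
  all_data$activity <- factor(all_data$activity, levels = 1:6,
                              labels = activity_labels)

  # Make names syntactic, e.g. "tBodyAcc-mean()-X" becomes "tBodyAcc_mean_X".
  names(all_data) <- gsub("-", "_",
                          gsub("()", "", names(all_data), fixed = TRUE),
                          fixed = TRUE)

  # Average every measurement for each subject and activity.
  measures <- setdiff(names(all_data), c("subject", "activity"))
  tidy <- aggregate(all_data[measures],
                    by = list(subject = all_data$subject,
                              activity = all_data$activity),
                    FUN = mean)
  tidy <- tidy[order(tidy$subject, tidy$activity), ]
  rownames(tidy) <- NULL

  # Optionally write the result to a file.
  if (!is.null(out_file)) write.table(tidy, out_file, row.names = FALSE)
  tidy
}
```

For example, `summarise_har(".", "tidy_averages.txt")` would read the seven input files from the working directory and write one row per subject and activity to the (hypothetical) file tidy_averages.txt.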