Getting and Cleaning Data Project

The project had the following requirements:

Merge the training and test sets to create one data set
Extract only the mean and standard deviation of each measurement
Use descriptive activity names to name the activities
Appropriately label the data set with descriptive variable names
Create a second independent data set with the average of each variable for each activity and each subject

The script run_analysis.R accomplishes these tasks by performing the following:

Reads in features.txt to get the feature names
Reads in activity_labels.txt to get the activity names
Reads the train data
Reads subject_train.txt to get the volunteers who are part of the training set
Reads y_train.txt to get the activities performed
Translates the activity ids to names
Reads X_train.txt to get feature data and applies names to the columns
Selects only the mean and std columns
Reads the test data
Reads subject_test.txt to get the volunteers who are part of the test set
Reads y_test.txt to get the activities performed
Translates the activity ids to names
Reads X_test.txt to get the feature data and applies names to the columns
Selects only the mean and std columns
Combines the two data sets and writes the result to combined_dataset.txt
Groups and aggregates data based on the volunteer and activity, producing a table of means of all the features for each group
Writes out the new dataset to tidy_dataset.txt
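
The steps above can be sketched in a few lines of base R. This is an illustration on toy data, not the contents of run_analysis.R: the helper build_set, the sample feature names and values, and the use of aggregate() are all assumptions made for the example, and the real script may rely on reshape, reshape2 or plyr instead.

```r
# Toy stand-ins for features.txt and activity_labels.txt (hypothetical values).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X")
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"),
                              stringsAsFactors = FALSE)

# Hypothetical helper: builds one labelled set from subject ids, activity ids
# and a feature matrix, mirroring the per-set steps listed above.
build_set <- function(subjects, activity_ids, x) {
  x <- as.data.frame(x)
  names(x) <- features                              # apply the feature names
  keep <- grepl("mean\\(\\)|std\\(\\)", features)   # mean and std columns only
  # Translate activity ids to names.
  activity <- activity_labels$name[match(activity_ids, activity_labels$id)]
  data.frame(subject = subjects, activity = activity,
             x[, keep, drop = FALSE],
             check.names = FALSE, stringsAsFactors = FALSE)
}

# matrix() fills column by column: one column per feature, one row per observation.
train <- build_set(c(1, 1, 2), c(1, 1, 2),
                   matrix(c(1, 3, 5,  10, 30, 50,  0, 0, 0), nrow = 3))
test  <- build_set(c(3, 3), c(2, 2),
                   matrix(c(2, 4,  20, 40,  0, 0), nrow = 2))

# Combine the two sets, then average every feature per subject and activity.
combined <- rbind(train, test)
tidy <- aggregate(combined[, -(1:2)],
                  by = list(subject = combined$subject,
                            activity = combined$activity),
                  FUN = mean)

# Write the result; a temporary directory keeps the sketch free of side effects,
# and row.names = FALSE is an assumption about the output format.
out_file <- file.path(tempdir(), "tidy_dataset.txt")
write.table(tidy, out_file, row.names = FALSE)
```

Selecting columns with a pattern on the feature names keeps the selection independent of column positions, so the same helper works for both the training and the test set.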

A codebook is available as tidy_dataset_info.md

Running run_analysis.R

The working directory should contain the R script and the unzipped original dataset.
The script requires the following libraries: reshape, reshape2 and plyr
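
Before sourcing the script, it may help to confirm that these packages are installed. The check below is a minimal sketch and not part of run_analysis.R; the variable names required, installed and missing_pkgs are illustrative.

```r
# Packages named in this README.
required <- c("reshape", "reshape2", "plyr")

# requireNamespace() returns FALSE, rather than raising an error,
# when a package is not installed.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

# Print a ready-to-paste install command for anything that is missing.
if (length(missing_pkgs) > 0) {
  message("Install before running: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
}
```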

source("run_analysis.R")