This script summarizes a larger data set on human activity recognition with smartphones. The data set comes from an experiment with 30 people, each of whom performed 6 activities while wearing a smartphone on their waist. The test subjects' movement was measured using the smartphone's accelerometer and gyroscope. The script produces a table of averages (by activity and by test subject) of the means and standard deviations of the variables measured in the experiment.
The data from the test subjects was randomly split into training and test datasets. This script joins the training and test datasets together, so the averages are computed over the full data.
The script is part of the course project for Coursera Data Science 3 Getting and Cleaning Data.
The script performs the following tasks:
0.2 Downloading and unzipping the data, if the unzipped folder does not exist in the working directory
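The download step can be sketched as a guard around `download.file()` and `unzip()`. This is only a minimal sketch: the function name `fetch_if_missing`, its arguments, and the placeholder URL are hypothetical and not taken from the script.

```r
# Download and unzip only when the target folder is missing.
# `fetch_if_missing`, `data_dir`, `zip_url` and `zip_file` are hypothetical names.
fetch_if_missing <- function(data_dir, zip_url, zip_file = "dataset.zip") {
  if (dir.exists(data_dir)) {
    return(FALSE)  # folder already present, nothing to download
  }
  download.file(zip_url, destfile = zip_file, mode = "wb")
  unzip(zip_file)
  TRUE
}

# Demonstration with a folder that already exists, so no download is attempted.
existing <- tempfile("har_")
dir.create(existing)
skipped <- fetch_if_missing(existing, zip_url = "https://example.org/placeholder.zip")
```

The return value only reports whether a download happened, which makes the guard easy to check without network access.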
(The feature names in the original features.txt-file are not unique.)
- the variable names (features),
- the test subject id data and
- the activity id data
The same procedure is then performed for the test data:
- the same variable names as for the training data,
- the test subject id data and
- the activity id data
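The reading steps above might look roughly like the following. It is a sketch under assumptions: the tiny files written to a temporary folder stand in for the real data, the flat folder layout and the helper name `read_set` are hypothetical, and `make.unique()` is one possible way to handle the duplicated feature names, not necessarily the one the script uses.

```r
# Build a tiny stand-in for the dataset files in a temporary folder (illustration only).
tmp <- tempfile("har_demo_")
dir.create(tmp)
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X", "3 tBodyAcc-mean()-X"),
           file.path(tmp, "features.txt"))
writeLines(c("0.1 0.2 0.3", "0.4 0.5 0.6"), file.path(tmp, "X_train.txt"))
writeLines(c("1", "2"), file.path(tmp, "subject_train.txt"))
writeLines(c("5", "3"), file.path(tmp, "y_train.txt"))

# Feature names are not unique, so make them unique before using them as column names.
features <- read.table(file.path(tmp, "features.txt"),
                       col.names = c("index", "name"), stringsAsFactors = FALSE)
feature_names <- make.unique(features$name)

# Hypothetical helper: read one set ("train" or "test") and attach subject and activity ids.
read_set <- function(dir, set, feature_names) {
  x <- read.table(file.path(dir, paste0("X_", set, ".txt")))
  names(x) <- feature_names
  subject <- read.table(file.path(dir, paste0("subject_", set, ".txt")),
                        col.names = "subject")
  activity <- read.table(file.path(dir, paste0("y_", set, ".txt")),
                         col.names = "activity_id")
  cbind(subject, activity, x)
}

train <- read_set(tmp, "train", feature_names)
```

Calling `read_set(dir, "test", feature_names)` with the same `feature_names` would give a test table with identical columns.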
The datasets have been kept identical in width and column names, so this operation is performed by simply appending one data set to the end of the other with rbind().
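A minimal illustration of the join, using two made-up data frames in place of the real training and test data (the column `tBodyAccMeanX` and all values are invented for the example):

```r
# Two small stand-ins with identical columns (values are invented).
train <- data.frame(subject = c(1, 3), activity_id = c(5, 4), tBodyAccMeanX = c(0.28, 0.27))
test  <- data.frame(subject = c(2, 4), activity_id = c(5, 1), tBodyAccMeanX = c(0.25, 0.29))

# rbind() matches data frame columns by name, so check the names agree first.
stopifnot(identical(names(train), names(test)))
total <- rbind(train, test)
```

The rows of `test` follow the rows of `train` in `total`, and no columns are added or dropped.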
All the variables that have "mean" or "std" in their name are selected. This gives 86 measurements for further analysis.
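The selection could be done with a pattern match on the column names. The sketch below assumes the match ignores case, so that names such as `angle(tBodyAccMean,gravity)` are also kept; the data frame and its values are invented for the example.

```r
# Small stand-in table with original-style feature names (values are invented).
total <- data.frame(1:2, c(5L, 3L), c(0.1, 0.2), c(0.3, 0.4), c(0.5, 0.6), c(0.7, 0.8))
names(total) <- c("subject", "activity_id", "tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                  "tBodyAcc-mad()-X", "angle(tBodyAccMean,gravity)")

# Keep columns whose name contains "mean" or "std" (case-insensitive, an assumption),
# plus the two id columns needed for later grouping.
keep <- grepl("mean|std", names(total), ignore.case = TRUE)
keep[names(total) %in% c("subject", "activity_id")] <- TRUE
selected <- total[, keep]
```

Here `tBodyAcc-mad()-X` is dropped, while the mean, std and angle columns are kept.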
A small dataset containing the activity ids and names is merged into the large dataset based on activity id. The activity id number is then removed to keep the data simpler.
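As a sketch, the lookup can be done with base R's `merge()`. The data frame names and the measurement values are invented for the example; only three activities are shown.

```r
# Lookup table of activity ids and names (a subset, for illustration).
activity_labels <- data.frame(
  activity_id = 1:3,
  activity = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS"),
  stringsAsFactors = FALSE
)

# Stand-in for the large dataset (values are invented).
total <- data.frame(subject = c(1, 1, 2), activity_id = c(3, 1, 1),
                    tBodyAccMeanX = c(0.1, 0.2, 0.3))

# Attach the activity name by id, then drop the numeric id.
labelled <- merge(total, activity_labels, by = "activity_id")
labelled$activity_id <- NULL
```

Note that `merge()` sorts the result by the key column, so the row order of `labelled` can differ from that of `total`.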
The label names are cleaned of parentheses and dashes, but retain all the information from the original feature names in the dataset.
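One way to strip parentheses and dashes is a single `gsub()` call with a character class. This is a sketch, not necessarily the exact pattern the script uses.

```r
# Example names in the original style.
raw_names <- c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z", "angle(tBodyAccMean,gravity)")

# Remove "(", ")" and "-"; everything else in the name is kept.
clean_names <- gsub("[()-]", "", raw_names)
# clean_names is now:
# "tBodyAccmeanX" "fBodyGyrostdZ" "angletBodyAccMean,gravity"
```

The signal, statistic and axis remain readable in the cleaned names, so no information from the original feature names is lost.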
The data is grouped first by activity and then by subject. The result is a table of averages of 86 variables. The variables keep names close to the originals, but the values are averages of the original means and standard deviations.
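The grouping and averaging can be illustrated with base R's `aggregate()`. This is a stand-in technique chosen to keep the example dependency-free; the script itself may use a different function or package. The data values are invented.

```r
# Stand-in for the cleaned dataset: two measurement columns (values are invented).
tidy_input <- data.frame(
  activity = c("WALKING", "WALKING", "SITTING", "SITTING"),
  subject = c(1, 1, 1, 2),
  tBodyAccmeanX = c(0.2, 0.4, 0.1, 0.3),
  tBodyAccstdX = c(-0.9, -0.7, -0.5, -0.3)
)

# Average every measurement column within each activity/subject combination.
averages <- aggregate(. ~ activity + subject, data = tidy_input, FUN = mean)
```

Each row of `averages` is one activity/subject pair. For example, the two WALKING rows of subject 1 collapse into a single row whose `tBodyAccmeanX` is the mean of 0.2 and 0.4.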
The source of the original data set and experiment is:
Human Activity Recognition Using Smartphones Dataset Version 1.0
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
Jorge L. Reyes-Ortiz, Davide Anguita, Alessandro Ghio, Luca Oneto. Smartlab - Non Linear Complex Systems Laboratory DITEN - Università degli Studi di Genova. Via Opera Pia 11A, I-16145, Genoa, Italy. activityrecognition@smartlab.ws www.smartlab.ws
Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine. International Workshop of Ambient Assisted Living (IWAAL 2012). Vitoria-Gasteiz, Spain. Dec 2012