============================================================================================= Information about the original dataset
Human Activity Recognition Using Smartphones Dataset Version 1.0
Jorge L. Reyes-Ortiz, Davide Anguita, Alessandro Ghio, Luca Oneto. Smartlab - Non Linear Complex Systems Laboratory DITEN - Università degli Studi di Genova. Via Opera Pia 11A, I-16145, Genoa, Italy. activityrecognition@smartlab.ws www.smartlab.ws
The experiments were carried out with a group of 30 volunteers within an age bracket of 19-48 years. Each person performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) wearing a smartphone (Samsung Galaxy S II) on the waist. Using its embedded accelerometer and gyroscope, we captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50Hz. The experiments were video-recorded so that the data could be labeled manually. The obtained dataset was randomly partitioned into two sets, where 70% of the volunteers were selected for generating the training data and 30% for the test data.
The sensor signals (accelerometer and gyroscope) were pre-processed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 sec and 50% overlap (128 readings/window). The sensor acceleration signal, which has gravitational and body motion components, was separated using a Butterworth low-pass filter into body acceleration and gravity. The gravitational force is assumed to have only low frequency components, therefore a filter with 0.3 Hz cutoff frequency was used. From each window, a vector of features was obtained by calculating variables from the time and frequency domain.
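The windowing arithmetic above (50 Hz x 2.56 sec = 128 readings, advancing by 64 readings for 50% overlap) can be checked with a minimal sketch in base R. The signal is synthetic and the helper name `window_starts` is hypothetical; neither comes from the original dataset tooling, and the noise and Butterworth filtering steps are not reproduced here.

```r
# Sampling rate and window length as stated in the dataset description.
rate_hz    <- 50
window_sec <- 2.56
window_len <- as.integer(round(rate_hz * window_sec))  # 128 readings per window
step       <- window_len %/% 2                         # 50% overlap: advance 64 readings

# Hypothetical helper: start indices of every full window in a signal of length n.
window_starts <- function(n, len, step) {
  if (n < len) return(integer(0))
  seq(1, n - len + 1, by = step)
}

# Synthetic 500-reading signal (10 seconds at 50 Hz), for illustration only.
signal <- sin(2 * pi * 1.5 * seq(0, 10, length.out = 500))
starts <- window_starts(length(signal), window_len, step)

# One row per window, one column per reading within the window.
windows <- t(sapply(starts, function(s) signal[s:(s + window_len - 1)]))
dim(windows)
```

Consecutive rows of `windows` share half of their readings, which is what the 50% overlap means in practice.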
For each record, the dataset provides:
- Triaxial acceleration from the accelerometer (total acceleration) and the estimated body acceleration.
- Triaxial angular velocity from the gyroscope.
- A 561-feature vector with time and frequency domain variables.
- Its activity label.
- An identifier of the subject who carried out the experiment.
============================================================================================= Information about the analysis files submitted
The tidy data set comes with three files:
- run_analysis.R: script for performing the analysis.
- CodeBook.md: code book that describes the variables, the data, and any transformations or work that were performed to clean up the data.
- README.md: this file, which explains what the analysis files do.
============================================================================================= Information about the analysis steps in "run_analysis.R"
(This information is also available, together with the R code, in "run_analysis.R")
1. Merges the training and the test sets to create one data set.
- Read in subject/X/y from the training set; combine them into 1 data frame called "data.train".
- Read in subject/X/y from the test set; combine them into 1 data frame called "data.test".
- Combine the training and test data frames into 1 data frame called "data"; delete other temporary data frames.
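Step 1 can be sketched as follows. This is a minimal illustration that uses small in-memory data frames in place of the files the script reads; the column order (subject, activity, measurements) and the names of the temporary objects other than "data.train", "data.test" and "data" are assumptions.

```r
# Toy stand-ins for the subject/y/X tables of the training set.
subject.train <- data.frame(Subject = c(1, 1, 3))
y.train       <- data.frame(Activity = c(5, 4, 1))
X.train       <- data.frame(V1 = c(0.28, 0.27, 0.29), V2 = c(-0.02, -0.01, -0.03))

# Toy stand-ins for the subject/y/X tables of the test set.
subject.test <- data.frame(Subject = c(2, 2))
y.test       <- data.frame(Activity = c(6, 2))
X.test       <- data.frame(V1 = c(0.25, 0.26), V2 = c(-0.04, -0.02))

# Bind the three tables side by side within each set, then stack the two sets.
data.train <- cbind(subject.train, y.train, X.train)
data.test  <- cbind(subject.test, y.test, X.test)
data       <- rbind(data.train, data.test)

# Delete the temporary data frames once the merged one exists.
rm(subject.train, y.train, X.train, subject.test, y.test, X.test,
   data.train, data.test)

dim(data)
```

`rbind` requires both data frames to have the same column names, which holds here because the training and test sets share one layout.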
2. Extracts only the measurements on the mean and standard deviation for each measurement.
- Read in the names of the selected features; use these names to update the colnames of "data".
- Select only the measurements on the mean and standard deviation; subset these columns to generate a new data frame called "data2" ('Subject' and 'Activity' are also kept in "data2").
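A minimal sketch of step 2 on toy data. The feature names are hard-coded for illustration rather than read from the dataset, and the pattern assumes that only names containing the literal text "mean()" or "std()" are kept; whether run_analysis.R also keeps variants such as "meanFreq()" is not stated here.

```r
# Toy stand-in for the merged data frame from step 1 (columns still unnamed).
data <- data.frame(c(1, 2), c(5, 6), c(0.28, 0.25),
                   c(-0.99, -0.98), c(0.10, 0.12), c(0.3, 0.4))

# Illustrative feature names in the dataset's naming style.
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
              "tBodyAcc-mad()-X", "fBodyAcc-meanFreq()-X")
colnames(data) <- c("Subject", "Activity", features)

# Keep the names that contain "mean()" or "std()"; parentheses are escaped
# because they are special characters in regular expressions.
keep  <- grep("mean\\(\\)|std\\(\\)", colnames(data), value = TRUE)
data2 <- data[, c("Subject", "Activity", keep)]

colnames(data2)
```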
3. Uses descriptive activity names to name the activities in the data set.
- Read in the activity number and names.
- Use actual names to replace the numbers for activities in "data2".
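Step 3 amounts to a lookup from activity number to activity name. The sketch below hard-codes the lookup table instead of reading it from a file, and the object name `activity.labels` is an assumption.

```r
# Lookup table from activity number to name (hard-coded for illustration).
activity.labels <- data.frame(
  id   = 1:6,
  name = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
           "SITTING", "STANDING", "LAYING"),
  stringsAsFactors = FALSE
)

# Toy stand-in for "data2" with numeric activity codes.
data2 <- data.frame(Subject = c(1, 1, 2), Activity = c(5, 1, 6))

# Replace each numeric code with its descriptive name.
data2$Activity <- factor(data2$Activity,
                         levels = activity.labels$id,
                         labels = activity.labels$name)

as.character(data2$Activity)
```

Using `factor` keeps the activities in their numbered order for later grouping; a `merge` on the code column would also work but may reorder the rows.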
4. Appropriately labels the data set with descriptive variable names.
- Get the original variable names from "data2".
- Replace the abbreviations with full words; delete the additional symbols; separate words with a dot.
- Use the updated variable names to rename the variables in "data2".
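The renaming in step 4 can be sketched with a chain of `gsub` calls. The helper `tidy_name` and the specific replacements below are hypothetical; the exact substitutions made in run_analysis.R may differ.

```r
# Two original-style variable names, for illustration.
old.names <- c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z")

# Hypothetical helper: expand abbreviations, drop symbols, separate words with dots.
tidy_name <- function(x) {
  x <- gsub("\\(\\)", "", x)                   # delete the parentheses
  x <- sub("^t", "Time.", x)                   # leading "t" means time domain
  x <- sub("^f", "Frequency.", x)              # leading "f" means frequency domain
  x <- gsub("Acc", ".Accelerometer", x)
  x <- gsub("Gyro", ".Gyroscope", x)
  x <- gsub("-mean", ".Mean", x)
  x <- gsub("-std", ".StandardDeviation", x)
  x <- gsub("-", ".", x)                       # remaining hyphens become dots
  x
}

new.names <- tidy_name(old.names)
new.names
```

Applying it to a data frame is then a single assignment, for example `colnames(data2) <- tidy_name(colnames(data2))`.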
5. Creates a second, independent tidy data set with the average of each variable for each activity and each subject.
- Split the measurement data of "data2" by activity and subject.
- Compute the average of each variable for each activity and each subject.
- Create a new data frame called "data3" to store these averaged values.
- The row names of "data3" have the form "subject"."activity"; split them back into "subject" and "activity".
- Add the variables "subject" and "activity" back to "data3"; update their colnames and delete the row names.
- Write "data3" to a table file as the final tidy data set.
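The split-average-recombine sequence of step 5 can be sketched on toy data as follows. The output file name "tidy_data.txt" and the temporary directory are assumptions, as the file name used by the script is not stated here. The row-name split also assumes that activity names contain no dot, which holds for the six activities in this dataset.

```r
# Toy stand-in for "data2" after steps 1-4.
data2 <- data.frame(
  Subject  = c(1, 1, 1, 2, 2),
  Activity = c("WALKING", "WALKING", "SITTING", "WALKING", "WALKING"),
  Time.Body.Accelerometer.Mean.X = c(0.2, 0.4, 0.1, 0.3, 0.5),
  stringsAsFactors = FALSE
)

# Measurement columns only.
measures <- data2[, setdiff(names(data2), c("Subject", "Activity")), drop = FALSE]

# Split by subject and activity; drop = TRUE skips combinations with no rows.
groups <- split(measures, list(data2$Subject, data2$Activity), drop = TRUE)

# Average every variable within each group; rows are named "subject.activity".
data3 <- as.data.frame(do.call(rbind, lapply(groups, colMeans)))

# Split the row names back into the two keys and add them as columns.
keys  <- do.call(rbind, strsplit(rownames(data3), ".", fixed = TRUE))
data3 <- cbind(Subject = as.integer(keys[, 1]), Activity = keys[, 2], data3)
rownames(data3) <- NULL

# Write the tidy data set to a table file (hypothetical file name).
write.table(data3, file = file.path(tempdir(), "tidy_data.txt"), row.names = FALSE)
data3
```

`aggregate(. ~ Subject + Activity, data = data2, FUN = mean)` gives the same averages in one call and avoids the row-name round trip, at the cost of departing from the steps listed above.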