This is my assignment for the "Getting and Cleaning Data" course on Coursera. This repository contains the following files:
- README.md - gives a short overview of the files in this repository
- tidy_data.txt - is the tidy, clean dataset
- run_analysis.R - is the script that creates the dataset
- codebook.md - gives an overview of the dataset, its variables and the transformations performed during the analysis

The following files from the initial dataset are used:
- features.txt - describes the measured features
- train/X_train.txt - includes the measurements of the features in the train set (each row is one measurement of 561 features)
- test/X_test.txt - includes the measurements of the features in the test set
- train/subject_train.txt - subject for each measurement from the train set
- test/subject_test.txt - subject for each measurement from the test set
- train/y_train.txt - activity (from 1 to 6) for each measurement from the train set
- test/y_test.txt - activity (from 1 to 6) for each measurement from the test set

The run_analysis.R script does the following:
- Downloads and unzips the zip file
- Merges the test and train files into a single dataset
- Extracts only the measurements on the mean and standard deviation for each measurement
- Uses descriptive activity names to name the activities in the dataset
- Appropriately labels the dataset with descriptive variable names
- Creates a second, independent tidy dataset with the average of each variable for each activity and each subject
- Writes the dataset to the tidy_data.txt file
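
The steps above can be sketched in base R roughly as follows. This is a minimal illustration on a tiny made-up dataset, not the actual contents of run_analysis.R: the toy values, the two activity labels, the `mean()`/`std()` matching pattern and the renaming rules are all assumptions chosen for the example. The download step is left out; in base R it would typically use `download.file()` and `unzip()`, and the real files would be read with `read.table()`.

```r
# Toy stand-ins for the real files (made-up values; the real data has
# 561 features and six activities).
features        <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- c("WALKING", "SITTING")

x_train <- data.frame(c(0.1, 0.3), c(1.0, 1.2), c(9, 9))  # like train/X_train.txt
x_test  <- data.frame(c(0.5, 0.7), c(1.4, 1.6), c(9, 9))  # like test/X_test.txt
subject_train <- c(1, 1)                                   # like train/subject_train.txt
subject_test  <- c(2, 2)                                   # like test/subject_test.txt
y_train <- c(1, 1)                                         # like train/y_train.txt
y_test  <- c(1, 2)                                         # like test/y_test.txt

# Name the measurement columns after the features.
names(x_train) <- features
names(x_test)  <- features

# Merge: attach subject and activity to each set, then stack train and test.
merged <- rbind(
  cbind(subject = subject_train, activity = y_train, x_train),
  cbind(subject = subject_test,  activity = y_test,  x_test)
)

# Keep only mean() and std() measurements (assumed pattern).
keep     <- grepl("mean\\(\\)|std\\(\\)", names(merged))
selected <- merged[, c("subject", "activity", names(merged)[keep])]

# Replace activity codes with descriptive names.
selected$activity <- factor(selected$activity,
                            levels = seq_along(activity_labels),
                            labels = activity_labels)

# Make variable names more descriptive (illustrative renaming rules).
new_names <- names(selected)
new_names <- gsub("[()]", "", new_names)   # drop parentheses
new_names <- sub("^t", "Time", new_names)  # leading "t" marks time-domain signals
new_names <- gsub("-", "_", new_names)     # dashes to underscores
names(selected) <- new_names

# Average every variable for each subject and activity.
tidy <- aggregate(. ~ subject + activity, data = selected, FUN = mean)

# Write the result (to a temporary directory here, to keep the sketch harmless).
out_file <- file.path(tempdir(), "tidy_data.txt")
write.table(tidy, out_file, row.names = FALSE)
```

The resulting file can be read back with `read.table(out_file, header = TRUE)` to inspect one row per subject and activity combination.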