This repository contains my culminating assignment for the Getting and Cleaning Data course from the Johns Hopkins University Data Science Specialization.
It demonstrates my ability to gather raw data, process it into a tidy dataset, and calculate basic summary statistics on that dataset. The analysis was performed in R.
Included in this repository under the `data` directory are:
- The raw data
- The tidy dataset
- The summary statistics
This project contains one main script, `run_statistics.R`, which performs all the data processing.
This script:
- Downloads the data from the web
- Merges data from various files in the raw dataset
- Selects the relevant variables from the data (the "mean" and "std" variables)
- Renames variables with descriptive names
- Saves the tidy dataset in the `data` directory
- Generates summary (mean) statistics by subject and activity and saves the table in the `data` directory
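
The selection, renaming, and summary steps can be illustrated with a minimal base R sketch. It is not the contents of `run_statistics.R`: the toy data frame, its column names (such as `tBodyAcc-mean()-X`), and the renaming rules are all assumptions made for illustration, and the download, merge, and save steps are left out.

```r
# Toy stand-in for the merged raw data. The column names are hypothetical
# and only mimic the kind of "mean"/"std" variables described above.
merged <- data.frame(
  subject = c(1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "WALKING", "SITTING"),
  "tBodyAcc-mean()-X" = c(0.2, 0.4, 0.1, 0.3),
  "tBodyAcc-std()-X" = c(-0.9, -0.7, -0.8, -0.6),
  "tBodyAcc-max()-X" = c(0.5, 0.6, 0.7, 0.8),
  check.names = FALSE
)

# Keep the identifier columns plus every "mean" or "std" measurement.
keep <- grepl("mean\\(\\)|std\\(\\)", names(merged))
tidy <- merged[, c("subject", "activity", names(merged)[keep])]

# Rename variables with more descriptive names (assumed renaming rules).
names(tidy) <- gsub("^t", "time", names(tidy))
names(tidy) <- gsub("Acc", "Accelerometer", names(tidy))
names(tidy) <- gsub("-mean\\(\\)", "Mean", names(tidy))
names(tidy) <- gsub("-std\\(\\)", "Std", names(tidy))
names(tidy) <- gsub("-", "", names(tidy))

# Mean of every remaining measurement for each subject/activity pair.
summary_stats <- aggregate(. ~ subject + activity, data = tidy, FUN = mean)
print(summary_stats)
```

In this sketch, `aggregate` with the `. ~ subject + activity` formula averages every non-grouping column, so the summary table has one row per subject/activity combination present in the data.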