/gcproj

Getting/cleaning data course project, authored by Philip N. Brown

How to run my code

Execute the run_analysis.R script. The script does not download the data; it assumes the data files are present in the same directory as the script and loads them from there. It does not source any other files, and it should load the libraries it needs on its own. If everything works properly, the script creates a file tidy_data.txt in the working directory containing the desired tidy dataset.
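
As a rough sketch of what running it from an R session might look like (assuming the working directory is the folder that holds run_analysis.R and the data files; the existence checks are my addition for illustration and are not part of the script):

```r
# Assumption: the working directory contains run_analysis.R and the data files.
script <- "run_analysis.R"

if (file.exists(script)) {
  source(script)  # runs the whole analysis
  # The README says the output lands in the working directory.
  if (file.exists("tidy_data.txt")) {
    message("Created tidy_data.txt in ", getwd())
  }
} else {
  message("run_analysis.R not found in ", getwd())
}
```

If the script is not found, setwd() to the project folder first and try again.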

How the code works

  • The first few lines of code check whether the data files have already been read into memory; if they have not, they are read in.
      • The actual data is read into the variable X
      • The variable names are read into the variable feat
      • The activity names are read into the variable act
  • Next come some helper function definitions I found useful:
      • isMean() and isStd() check whether a feature name refers to a mean or a standard deviation measurement
      • make.readable() takes a raw feature name and creates a human-readable descriptive string
  • Next, we extract from X the columns corresponding to the mean and standard deviation measurements
  • Next, we use a series of auxiliary variables X1, X2, and X3 to:
      • Add the descriptive variable names generated by make.readable()
      • Include a column of test subjects (i.e., who performed which activities)
      • Include a column of activity names for each data point
  • Finally, we use dplyr functions to make the final tidy dataset:
      • group_by() organizes the data by subject and activity
      • summarise_each() computes the average of each variable for each subject and activity
  • The last line of code uses write.table() to output the tidy dataset created by summarise_each()
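
To make the steps above concrete, here is a minimal sketch on a toy data frame. It is not the code from run_analysis.R: the bodies of isMean(), isStd(), and make.readable() are hypothetical stand-ins, the toy feature names and values are invented for illustration, and summarise(across(...)) is used in place of summarise_each(), assuming a dplyr version of 1.0 or later.

```r
library(dplyr)

# Hypothetical stand-ins for the helpers; the real definitions may differ.
isMean <- function(name) grepl("mean()", name, fixed = TRUE)
isStd  <- function(name) grepl("std()", name, fixed = TRUE)

make.readable <- function(name) {
  name <- sub("^t", "time.", name)            # leading "t" -> "time."
  name <- sub("^f", "freq.", name)            # leading "f" -> "freq."
  name <- sub("-mean\\(\\)", ".mean", name)
  name <- sub("-std\\(\\)", ".std", name)
  gsub("-", ".", name, fixed = TRUE)          # remaining dashes -> dots
}

# Toy inputs (invented for illustration): three features, four observations.
feat <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-mad()-X")
X <- setNames(
  data.frame(c(1, 3, 5, 7), c(0.1, 0.3, 0.5, 0.7), c(9, 9, 9, 9)),
  feat
)
subject  <- c(1, 1, 2, 2)
activity <- c("WALKING", "WALKING", "SITTING", "SITTING")

# Keep only the mean and standard deviation columns.
keep <- isMean(feat) | isStd(feat)

X1 <- X[, keep, drop = FALSE]
names(X1) <- make.readable(feat[keep])   # descriptive variable names
X2 <- cbind(subject = subject, X1)       # add the subject column
X3 <- cbind(activity = activity, X2)     # add the activity column

# Average every remaining variable per subject and activity.
tidy <- X3 %>%
  group_by(subject, activity) %>%
  summarise(across(everything(), mean), .groups = "drop")

# Write to a temporary file here so the sketch does not overwrite anything;
# row.names = FALSE is an assumption, not taken from the script.
out <- tempfile(fileext = ".txt")
write.table(tidy, out, row.names = FALSE)
```

On this toy input, tidy has one row per subject/activity pair, and the tBodyAcc-mad()-X column is dropped because it is neither a mean nor a standard deviation.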