
Repository for Community TA content related to the Johns Hopkins University Data Science Specialization on Coursera.

Primary language: R. License: GNU General Public License v3.0 (GPL-3.0).

Data Science Specialization Community TA Content Repository

Author: Len Greski

This repository contains content I developed as a student and as a Community TA in the Data Science Specialization that Johns Hopkins University offers on Coursera.

Repository Contents

As a participant and TA in courses in the curriculum, I have seen students run into the same kinds of issues repeatedly. Migrating the content to GitHub makes it easier to repost in new runs of courses within the curriculum. Students can then draw on the experiences of prior students without my having to cut and paste the content into the Discussion Forums, which are the primary mechanism for communication among students and with TAs.

| File | Description |
| --- | --- |
| /markdown | Directory containing markdown files, the primary form of documentation for the content in the repository. |
| /markdown/images | Directory containing portable network graphics (PNG) files, which are used to illustrate the narrative content in other documentation. |
| README.md | File explaining the purpose and contents of the repository, with a listing of links to specific content by course. |

The remainder of this document serves as a directory of the content, aligning individual documents with the course(s) for which the content is relevant.

Course 1: Data Scientist's Toolbox

  1. Configuring RStudio to work with git / github - Mac OSX
  2. Configuring RStudio to work with git / github - Windows 7, 8, and 10
  3. Using Editor Modes in Discussion Forum Posts

Course 2: R Programming

General commentary about the course, R programming, and how R relates to other statistics packages.

  1. Commercial Statistics Packages: An Historical Perspective
  2. Configuring RStudio to work with git / github - Mac OSX
  3. A Data Frame is Also a List
  4. S Objects, R Objects, and Lexical Scoping
  5. Thinking in R versus Thinking in SAS
  6. Strategy for the Programming Assignments
  7. Why is R More Difficult than SAS?

Posts regarding specifics of programming assignments

  1. Assignment 2: makeCacheMatrix as an Object
  2. Assignment 2: Grading the SHA-1 Hash Code
  3. Assignment 3: Functions to Sort Data Frames

Miscellaneous Code Examples

  1. Common R Mistakes: Overwriting R Functions with Output Variables

Course 3: Getting and Cleaning Data

  1. Real World Example: Reading American Community Survey data
  2. Strategy for Reading Files & APIs / Quiz 2

Course 5: Reproducible Research

  1. Assignment 2 Checklist

Course 6: Statistical Inference

  1. Exponential Distribution / Central Limit Theorem - Assignment Checklist
  2. ToothGrowth Analysis - Assignment Checklist
  3. Exploratory Data Analysis in ToothGrowth Assignment: explains the exploratory data analysis requirement for students who have not taken the Exploratory Data Analysis course before taking Statistical Inference.
  4. Using MathJax with Discussion Forums, R Markdown, and Github Pages
  5. Kable Tables with Data Frames: illustrates how to display a custom table in a knitr document by creating a data frame that holds the information to be rendered with kable().
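
The data frame approach described in the last item above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked article: it assumes the knitr package is installed, uses the built-in ToothGrowth data set, and the object names `summary_table` and `tbl`, the column labels, and the caption are hypothetical choices made for the example.

```r
library(knitr)

# Build the data frame that will hold the table contents:
# mean tooth length for each supplement and dose in ToothGrowth.
summary_table <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

# Replace the raw column names with reader-friendly labels
# (hypothetical labels, chosen only for this illustration).
names(summary_table) <- c("Supplement", "Dose (mg/day)", "Mean length")

# Pass the data frame to kable(); inside an R Markdown document
# printing the result renders it as a formatted table.
tbl <- kable(summary_table,
             digits = 2,
             caption = "Mean tooth length by supplement and dose")
tbl
```

Building the data frame first keeps the table's contents separate from its presentation, so the same object can be checked or reused before it is rendered.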

Course 7: Regression Models

  1. Why does sum of errors * X equal 0?
  2. Using MathJax with Discussion Forums, R Markdown, and Github Pages

Course 8: Practical Machine Learning

  1. Course Project - gh-pages Setup with RStudio
  2. Course Project - Improving Runtime Performance of Random Forest Models with caret::train()
  3. Course Project - Predicting Test Scores based on Training Model Accuracy

Course 9: Developing Data Products

  1. Configuring shinyapps.io Application Timeout