BeginningDataScience

A repo for materials for engineering PDPs to learn general data science and computer science concepts

Breakdown of materials

  1. Command line knowledge
  2. Understanding git
  3. Basic R knowledge (swirl package, general questions)
  4. Basic R projects (using dplyr, caret, ggplot2). Kaggle datasets would be a good fit here
  5. Simple R project to demonstrate reactivity concepts. It needs a statistical model that generates predictions based on user input
  6. Basic SQL knowledge, relational database knowledge
  7. Bonus: Parallel computing concepts
  8. Bonus: HTTP, proxies, firewalls, and APIs. Build a project that obtains data from an API and displays it
  9. Bonus: How to create R packages

Prereqs

  • You will need R and RStudio installed on your device. Note: if you are doing this on your work device, you will need local admin rights
  • Knowledge of basic statistics
    • Linear regression, standard deviations, normal distributions, and hypothesis testing should feel fairly intuitive to you
  • Knowledge of calculus concepts
    • You should be familiar with gradient descent at a minimum
  • Experience in coding and thinking algorithmically
    • The particular language is not important
    • You should be able to write pseudocode for something like a basic sorting algorithm
    • You should know when and why to use if/else statements, for loops, and while loops
  • A desire to learn more about data science and expand your data analysis skills!
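
As a rough self-check for the coding prerequisite, the sketch below is an insertion sort in base R that uses an if statement, a for loop, and a while loop together. It is only an illustration and not part of the course materials; the function name `insertion_sort` is made up for this example. If you can follow why each construct is used where it is, you are likely ready.

```r
# Insertion sort with explicit control flow (base R only).
# `insertion_sort` is a hypothetical name chosen for this illustration.
insertion_sort <- function(x) {
  n <- length(x)
  # if: handle the edge case where there is nothing to sort
  if (n < 2) {
    return(x)
  }
  # for: the number of passes is known up front (positions 2..n)
  for (i in 2:n) {
    key <- x[i]
    j <- i - 1
    # while: the number of shifts depends on the data, so loop on a condition
    while (j >= 1 && x[j] > key) {
      x[j + 1] <- x[j]  # shift the larger element one slot to the right
      j <- j - 1
    }
    x[j + 1] <- key     # drop the key into the gap that opened up
  }
  x
}

insertion_sort(c(5, 2, 9, 1, 5))
# [1] 1 2 5 5 9
```

In everyday R you would call the built-in `sort()` instead. The point here is only to exercise the control-flow constructs listed above.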