A repo of materials to help engineering PDPs learn general data science and computer science concepts.
- Command line knowledge
- Understanding git
- Basic R knowledge (swirl package, general questions)
- Basic R projects (using dplyr, caret, ggplot2). Kaggle datasets would work well here.
- Simple R project to demonstrate reactivity concepts. It needs a statistical model that generates predictions based on user input.
- Basic SQL knowledge, relational database knowledge
- Bonus: parallel computing concepts
- Bonus: HTTP protocol, proxies, firewalls, APIs. Build a project that obtains data from an API and displays it.
- Bonus: how to create R packages
- You will need R and RStudio installed on your device. Note: if you are doing this on your work device, you will need local admin rights.
- R: https://cran.r-project.org/
- RStudio: https://www.rstudio.com/ (you want RStudio Desktop, the free version)
- Jupyter Notebooks: http://jupyter.org/
- Python >= 3.5: https://www.python.org/
- Knowledge of basic statistics
  - Linear regression, standard deviations, normal distributions, and hypothesis testing should feel intuitive to you
- Knowledge of calculus concepts
  - You should be familiar with gradient descent at a minimum
- Experience in coding and thinking algorithmically
  - The particular language is not important
  - You should be able to pseudocode something like a basic sorting algorithm
  - You should know when and why to use each of the following: if/else statements, for loops, while loops
- A desire to learn more about data science and expand your data analysis skills!
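
As a rough self-check for the coding and calculus prerequisites above, here is a minimal sketch in Python (one of the listed tools). It is an illustration only, not part of the course materials: the function names `insertion_sort` and `fit_line`, the learning rate, the step count, and the sample points are all assumptions made up for this example. The first function is a basic sorting algorithm that uses a for loop, a while loop, and an if/else; the second fits a one-variable linear regression by gradient descent.

```python
def insertion_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    result = list(values)  # copy so the caller's list is left untouched
    # for loop: the number of passes is known in advance
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # while loop: the number of shifts is not known in advance
        while j >= 0:
            if result[j] > current:
                result[j + 1] = result[j]  # shift the larger item right
                j -= 1
            else:
                break  # found the slot for `current`
        result[j + 1] = current
    return result


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error.

    The default learning_rate and steps are illustrative assumptions that
    happen to suit the small example below; other data may need other values.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and the same length")
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # prediction error at each point for the current parameters
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # partial derivatives of the mean squared error
        grad_slope = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2.0 * sum(errors) / n
        # step against the gradient
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


if __name__ == "__main__":
    print(insertion_sort([5, 2, 9, 1]))  # prints [1, 2, 5, 9]

    # Made-up points that lie exactly on y = 2x + 1
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept)
```

If you can read both functions and explain why each loop type was chosen, and why the gradient step uses a minus sign, you are likely comfortable with the prerequisites. On the sample points the fitted values should land very close to a slope of 2 and an intercept of 1.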