Curriculum, Training, Certification, Hiring Guide for Data Science Machine Learning
Note: DS standardization effort https://www.iadss.org/educational-programs-map
Data Science Workflow is an area that can potentially bring together pure vs. applied, general vs. specialized, and interdisciplinary topics in a pragmatic, project-based context, which helps in gauging how a Data Science curriculum should be balanced:
https://docs.google.com/document/d/1Ib_CNXrukZ29A5fVodpH2xaUfOmjjhzK6eesmxjuDsg/edit?usp=sharing
Possible Minds: Twenty-Five Ways of Looking at AI (Topic: History & Future of AI) edited by John Brockman, et al. https://www.amazon.com/Possible-Minds-audiobook/dp/B07MQX54TW/
Artificial Intelligence: A Guide for Thinking Humans (Topic: History & Future of AI) by Melanie Mitchell, Pelican (October 15, 2019) https://www.amazon.com/Artificial-Intelligence-Guide-Thinking-Humans/dp/0241404827/
Deep Learning (Adaptive Computation and Machine Learning series) (Topic: Deep Learning) by Ian Goodfellow, Yoshua Bengio, et al. https://www.amazon.com/Deep-Learning-Adaptive-Computation-Machine/dp/0262035618/
Rebooting AI (Topic: Comparing AI model performance) by Gary Marcus, Ernest Davis, et al. https://www.amazon.com/Rebooting-AI-Building-Artificial-Intelligence/dp/052556604X
An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics Book 103) (Topic: standard traditional textbook for non-deep-learning 'machine learning') by Gareth James, Daniela Witten, et al. (Jun 24, 2013) https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370
Chris Albon Machine Learning Flash Cards https://machinelearningflashcards.com/
https://docs.google.com/document/d/1dDF40M5JjjrBsYYQbJplz3M738ktQBBYyNa6FXhzNFU/edit?usp=sharing
- General Curriculum Guidelines and Standards
- Curriculum Tools for Educators
- Curriculum Tools for Students
- Curriculum Tools for Employers
- Curriculum Content Maps
- Curriculum Teaching Method Standards
- Certification
- DS ML Etc Specialization Areas
- GitHub
- HTML
- linear algebra
- Markdown (e.g. text display in GitHub)
- Python (python3)
- functional programming
- terminals (Linux/POSIX/Unix/macOS)
- "Notebooks" (Jupyter Notebooks, Colab Notebooks; for Python, Scala)
- text editors
- Code Development Environments / Kits: IDE/IDK
- Environment Management
- Command line Process: Bash etc & Unix/Posix
- Networks
- Deployment
- Dashboarding
- Hypothesis Testing
- Math, Statistics/Econometrics, Probability, Information Theory
- DS Etc. Workflow
- Portfolio
- Linear Models
- Deep Learning
- Practical Programming
- Computer Science Principles
- History of Computation
- History of "Data Science" AI etc.
- Application Frameworks (Six Sigma, Lean, Agile, SCRUM)
- Interdisciplinary Studies: Biological Neurons, Neural Networks & Plasticity
- Presentation and Blogging Skills
https://towardsdatascience.com/why-you-shouldnt-be-a-data-science-generalist-f69ea37cdd2c
(Note: Generalization is still valued, especially in small startups and in Agile teams built around generalists, which is how the original Agile system worked.)
- R (academic)
- Python (general)
- Spark (distributed)
- C (robotics)
- Data Engineering / Big Data Pipeline Engineering
- Data Analysis / Data Analytics
- Machine Learning Engineering
- SQL & Databases
- "Data Mining" (seems to be an older, pre-"DS" term)
- Software Engineering
- Linear Specific Machine Learning
- Statistical Analysis & Hypothesis Testing
- Neural Networks
- Biology / Medical (Genetics)
- Banking & Finance
- NLP https://docs.google.com/document/d/19v8jMx60QTWfyRkp6VThJeiSfXNWqadB50FQCzaAOVM/edit?usp=sharing
- Computer Vision
- Time Series (forecasting)
- GIS (raster data)
- Distributed Data Science (Federated Learning)
- Big-Data Data-Science (Spark vs. Pandas)
- SQL
- various quasi-SQL (like HiveQL)
- various NoSQL
- data engineering vs. analytics vs. AI models
- Spark
- Project Management
- Meetings
- Presentations
- Reports
- Emails
- Office Suites
- Databases
- Cellular Automata
- Genetic Algorithms
- Expert Systems
- Decision Trees
- Ensemble Models
- "Subsymbolic" AI
AI, Deep Learning, Machine Learning, Statistical Learning, Business-Intelligence, Data Mining, Statistics, Data Analysis, Hypothesis or A/B Testing, Perceptrons, Neurons, Neural Networks, Hidden Layers
- History
- Types
- Ensembles
- Hyperparameters
- Activation Functions
- Analogy
- Geofencing
- Turing Tests
- SQuAD
- Parameters & Coefficients in Parametric Models
- Logistic Regression
- Linear Regression
- Sum of Squared Residuals
- R^2
- P (p-value)
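
As a rough illustration of how the linear regression items above fit together (coefficients, sum of squared residuals, R^2), here is a minimal sketch in plain Python. The function names and the sample data are made up for illustration and do not come from any particular library or course; in practice a library would normally be used.

```python
# Minimal sketch: simple (one-variable) linear regression by least squares.
# All names and the sample data here are hypothetical, for illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared differences between observed and predicted y."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum_squared_residuals(xs, ys, slope, intercept)
    return 1 - ss_res / ss_total


# Made-up sample data, roughly following y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
ssr = sum_squared_residuals(xs, ys, slope, intercept)
r2 = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} SSR={ssr:.4f} R^2={r2:.4f}")
```

The slope and intercept are the "parameters & coefficients" of this parametric model; SSR is what the fit minimizes, and R^2 compares SSR against the variation of y around its mean. The p-value is left out here because it needs a statistical distribution function that plain Python does not provide.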
https://docs.google.com/document/d/19v8jMx60QTWfyRkp6VThJeiSfXNWqadB50FQCzaAOVM/edit?usp=sharing
- Bag of Words
- Baseline
- Dimensionality (e.g. the curse of dimensionality)
- The Confusion Matrix
- Weka http://old-www.cms.waikato.ac.nz/~ml/weka/
- RStudio
- Anaconda
- Spyder
- JupyterLab
- https://teachablemachine.withgoogle.com/
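
For the confusion matrix item listed above, a small sketch in plain Python may help show what the four cells count for a binary classifier. The function name, the labels, and the predictions are invented for illustration; most tools in the list above provide their own version of this.

```python
# Minimal sketch: binary confusion matrix from true labels and predictions.
# The labels and predictions below are made up for illustration.

def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
accuracy = (cm["tp"] + cm["tn"]) / len(y_true)
precision = cm["tp"] / (cm["tp"] + cm["fp"])  # of predicted positives, how many were right
recall = cm["tp"] / (cm["tp"] + cm["fn"])     # of actual positives, how many were found
print(cm, accuracy, precision, recall)
```

Comparing these numbers against a simple baseline (for example, always predicting the most common class) is one way to connect the "Baseline" and "Confusion Matrix" items.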