A self-study plan to go from zero to mastery in data science.
- Score top 20% in Kaggle competitions
- Expert with different data types (text, image, audio, video)
- Expert with different techniques (regression, SVM, deep learning, genetic algorithms, etc.)
- Familiar with modern tooling (Python, pandas, scikit-learn, R, TensorFlow, Apache Spark, etc.)
- Expert with various problems (classification, search, clustering, prediction, recommendation, etc.)
- Strong fundamentals (able to read and implement technical papers)
- Building pipelines / architectures at scale
- Module 0 - High School Math
- Module 1 - College Math I (Calculus)
- Module 2 - College Math II (Linear Algebra)
- Module 3 - College Math III (Discrete Math)
- Module 4 - College Math IV (Probability and Statistics)
- Module 5 - Computation and Algorithms
- Module 6 - Artificial Intelligence and Machine Learning
- Module 7 - Deep Learning
- Module 8 - Data Mining and Recommenders
- Module 9 - NLP and Computer Vision
- Module 10 - Cloud Computing Architectures / Data Center Engineering
Looking ahead is fine, as long as you generally finish earlier modules before later ones.
Not everyone was lucky enough to have a good start with math growing up. The goal is to level the playing field: by the end of Module 0 you should feel like you went to a high school with world-class teachers and finished top of your math class.
- Khan - Pre-Algebra - 100%
- Khan - Algebra I - 100%
- Khan - Algebra II - 100%
- Khan - Geometry - 100%
- Khan - Trigonometry - 100%
- Khan - Pre-Calculus - 100%
- Khan - High School Statistics - 100%
Required Reading
- Book - The Joy of X
- Khan - Differential Calculus - 8%
- Khan - Integral Calculus - 7%
- Coursera - Mathematics for Machine Learning: Multivariate Calculus
- Khan - AP Calculus AB - 2%
- Khan - AP Calculus BC - 2%
- MIT - Single Variable Calculus
- Coursera - Introduction to Complex Analysis
Supplementary Material
- Khan - Linear Algebra
- Coursera - Mathematics for Machine Learning: Linear Algebra
- Coursera - Mathematics for Machine Learning: Principal Component Analysis
- Fast AI - Computational Linear Algebra
- MIT - Linear Algebra
Required Reading
Supplementary Material
- Matrix Calculus for Deep Learning
- Graphical linear algebra
- Essence of Linear Algebra
- Brown University - Coding the Matrix
- Udacity - Linear Algebra Refresher Course
Proofs, Set theory, propositional logic, induction, invariants, state-machines
- Coursera - What is a Proof?
- MIT - Mathematics for Computer Science (2015): Unit 1
- MIT - Mathematics for Computer Science (2010): Weeks 1,2,3
- Book - How to Prove It
- Book - Book of Proof
- Coursera - Introduction to Graph Theory
- Coursera - Solving the Delivery Problem
- Book - Introduction to Graph Theory
- Sarada Herke - Graph Theory Course
- Coursera - Combinatorics and Probability
- Coursera - Introduction to Enumerative Combinatorics
- Coursera/Princeton - Analytic Combinatorics
Supplementary Material
- MIT - Mathematics for Computer Science (2017)
- MIT - Mathematics for Computer Science (2015)
- MIT - Mathematics for Computer Science (2010)
- Arsdigita University - Discrete Mathematics
- Coursera - Discrete Mathematics
- Discrete Stochastic Processes
- Khan - AP Statistics
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041sc-probabilistic-systems-analysis-and-applied-probability-fall-2013/unit-i/
- Coursera/Duke Uni - Introduction to Probability and Data
- Coursera/Duke Uni - Inferential Statistics
- Coursera/Duke Uni - Linear Regression and Modeling
- Coursera/Duke Uni - Bayesian Statistics
- Coursera/Duke Uni - Statistics with R Capstone
- EdX/Uni Texas - Foundations of Data Analysis - Part 1: Statistics Using R
- EdX/Uni Texas - Foundations of Data Analysis - Part 2: Inferential Statistics
- EdX/MIT - Introduction to Probability - The Science of Uncertainty
- EdX/MIT - Introduction to Probability Part II - Inferences and Processes
- https://www.edx.org/course/introduction-computer-science-mitx-6-00-1x-10
- https://www.edx.org/course/introduction-computational-thinking-data-mitx-6-00-2x-5
- Introduction
- Algorithmic thinking, peak finding
- Models of computation, Python cost model, document distance
- Lecture 1: Algorithmic Thinking, Peak Finding
- Recitation 1: Asymptotic Complexity, Peak Finding
- Lecture 2: Models of Computation, Document Distance
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/recitation-videos/recitation-2-python-cost-model-document-distance
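As a warm-up for the peak-finding topic listed above, here is a minimal sketch of one-dimensional peak finding by binary search. The function names `find_peak` and `is_peak` are made up for illustration and are not taken from the course materials. A peak is taken to mean an element that is at least as large as each of its neighbours.

```python
def is_peak(values, i):
    """Check that values[i] is >= each neighbour that exists."""
    left_ok = i == 0 or values[i] >= values[i - 1]
    right_ok = i == len(values) - 1 or values[i] >= values[i + 1]
    return left_ok and right_ok


def find_peak(values):
    """Return the index of some peak in O(log n) comparisons."""
    if not values:
        raise ValueError("values must be non-empty")
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] < values[mid + 1]:
            # The slope rises to the right, so a peak lies in (mid, hi].
            lo = mid + 1
        else:
            # values[mid] >= values[mid + 1], so a peak lies in [lo, mid].
            hi = mid
    return lo


if __name__ == "__main__":
    data = [1, 5, 2, 6, 3]
    i = find_peak(data)
    print(i, data[i], is_peak(data, i))
```

The search halves the range on every step because moving toward the larger neighbour always leaves at least one peak in the remaining half.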
Resources: Book - Introduction to Algorithms (CLRS)
- Algorithms I - Divide and Conquer
- Algorithms II - Graph Search, Shortest Path, and Data Structures
- Algorithms III - Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming
- Algorithms IV - Shortest Paths Revisited, NP-Complete Problems
- Intro to Algorithms
- Algorithmic Thinking I
- Algorithmic Thinking II
- https://www.youtube.com/watch?v=T_WffoMAaMA
- https://www.coursera.org/specializations/data-structures-algorithms
- https://www.youtube.com/user/mycodeschool
- https://www.youtube.com/watch?v=ufj5_bppBsA&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=7
- https://www.youtube.com/user/mikeysambol/playlists
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
https://www.coursera.org/specializations/aml
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-868j-the-society-of-mind-fall-2011/video-lectures/
- https://www.youtube.com/watch?feature=player_embedded&v=J6PBD-wNEDs
- http://ai.berkeley.edu/lecture_videos.html
- https://www.udacity.com/course/artificial-intelligence-for-robotics--cs373
- http://aiplaybook.a16z.com/
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/lecture-videos/
- http://rll.berkeley.edu/deeprlcourse/
Machine Learning Specialization by University of Washington on Coursera
- Machine Learning Foundations: A Case Study Approach
- Machine Learning: Regression
- Machine Learning: Classification
- Machine Learning: Clustering & Retrieval
- https://www.analyticsvidhya.com/blog/2015/07/top-youtube-videos-machine-learning-neural-network-deep-learning/
- Statistical Machine Learning 10-702/36-702
- https://www.udacity.com/ai
- https://www.udacity.com/drive
- https://www.udacity.com/course/machine-learning-engineer-nanodegree--nd009
- https://www.edx.org/xseries/data-science-engineering-apacher-sparktm
- https://www.coursera.org/specializations/data-mining
- https://www.coursera.org/specializations/machine-learning
- http://web.stanford.edu/class/cs20si/syllabus.html
- https://work.caltech.edu/telecourse.html
- https://www.youtube.com/watch?v=bxe2T-V8XRs
- https://www.youtube.com/watch?v=UVwwYZMFocg&list=PLiaHhY2iBX9ihLasvE8BKnS2Xg8AhY6iV&index=8
- https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-868j-the-society-of-mind-fall-2011/video-lectures/
- https://www.coursera.org/specializations/gcp-data-machine-learning
- Neural Networks and Deep Learning
- Improving Deep Neural Networks: Hyperparameter Tuning, Regularization, and Optimization
- Structuring Machine Learning Projects
- Convolutional Neural Networks
- Sequence Models
Goals:
- different activation functions (sigmoid/tanh/relu)
- different cost functions
- with and without bias units
- classification and regression problems
- text / binary / image / recommenders
- batch vs stochastic
- JS, Python, PHP, MATLAB, TensorFlow, scikit-learn
- create visualizations and blog explanations
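To make a few of the goals above concrete, here is a minimal pure-Python sketch of the three activation functions named in the list and a single neuron that can run with or without a bias unit. The `neuron` helper and its parameter names are hypothetical, chosen only for illustration. A real implementation would normally use vectorized array code.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def neuron(inputs, weights, bias=0.0, activation=sigmoid):
    """Weighted sum of inputs plus an optional bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(total)


if __name__ == "__main__":
    x = [1.0, 2.0]
    w = [0.5, -0.25]
    for fn in (sigmoid, tanh, relu):
        # Compare the same neuron without and with a bias unit.
        print(fn.__name__, neuron(x, w, activation=fn), neuron(x, w, bias=1.0, activation=fn))
```

Leaving `bias` at its default of `0.0` gives the "without bias units" variant, so the same function covers both cases.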
- Audit best courses / books
- Practical Deep Learning For Coders
- https://classroom.udacity.com/courses/ud730
- http://neuralnetworksanddeeplearning.com/
- http://course.fast.ai/
- http://www.deeplearningbook.org/
- http://cs231n.github.io/ + https://www.youtube.com/playlist?list=PLlJy-eBtNFt6EuMxFYRiNRS07MCWN5UIA
- https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
- http://rll.berkeley.edu/deeprlcourse/
- http://rll.berkeley.edu/deeprlcourse/#lecture-videos
- http://introtodeeplearning.com/index.html
- https://www.youtube.com/watch?v=21EiKfQYZXc&app=desktop
- https://courses.csail.mit.edu/6.042/spring17/mcs.pdf
- http://yerevann.com/a-guide-to-deep-learning/
- https://www.coursera.org/learn/neural-networks
- https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu
- https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd
- https://www.udacity.com/course/deep-learning--ud730
- http://nbviewer.jupyter.org/github/domluna/labs/blob/master/Build%20Your%20Own%20TensorFlow.ipynb
- https://goc.vivint.com/problems/mlc
- http://blog.floydhub.com/coding-the-history-of-deep-learning/
- https://stats385.github.io/
- https://www.coursera.org/specializations/recommender-systems
- https://nlp.stanford.edu/IR-book/information-retrieval-book.html
- https://www.coursera.org/specializations/gcp-data-machine-learning
- https://github.com/oxford-cs-deepnlp-2017/lectures
- https://www.youtube.com/watch?v=OQQ-W_63UgQ&list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6
- https://www.coursera.org/learn/digital/home/welcome
- http://cs231n.stanford.edu/syllabus.html
- https://www.udacity.com/course/interactive-3d-graphics--cs291
- https://www.youtube.com/watch?v=01YSK5gIEYQ&list=PL_w_qWAQZtAZhtzPI5pkAtcUVgmzdAP8g
- https://www.coursera.org/specializations/data-warehousing
- https://www.coursera.org/specializations/big-data-engineering
- https://www.coursera.org/specializations/gcp-architecture
- https://www.coursera.org/specializations/gcp-data-machine-learning
- https://www.coursera.org/specializations/cloud-computing
- https://www.coursera.org/specializations/data-science
- https://www.coursera.org/specializations/big-data
- https://www.coursera.org/specializations/scala
- https://www.coursera.org/learn/hadoop
- http://cagd.cs.byu.edu/~557/text/ch1.pdf
- https://www.coursera.org/learn/data-driven-astronomy
- https://www.coursera.org/specializations/genomic-data-science
- https://www.coursera.org/learn/data-genes-medicine
- https://www.coursera.org/specializations/systems-biology
- https://www.coursera.org/specializations/networking-basics
- https://www.coursera.org/learn/neurohacking
- https://www.youtube.com/playlist?list=PLUl4u3cNGP62K2DjQLRxDNRi0z2IRWnNh
Recommender, chatbot, graphics simulation with AI (e.g. ball and paddle), ...
- https://www.youtube.com/playlist?list=PLoROMvodv4rMWw6rRoeSpkiseTHzWj6vu&disable_polymer=true
- Computational geometry: https://www.youtube.com/watch?v=rho8QqiHOe4
- Kaggle school: https://www.kaggle.com/learn/overview
- MIT self-driving: https://selfdrivingcars.mit.edu/
- MIT AGI: https://agi.mit.edu/
- The Art of Unix Programming
- The C Programming Language
- Gödel, Escher, Bach: An Eternal Golden Braid
- Compilers: Principles, Techniques, and Tools (Dragon Book)
- Code
- The Elements of Statistical Learning
- Structure and Interpretation of Computer Programs
- Hacker's Delight
- Concrete Mathematics
- The Art of Computer Programming
- Artificial Intelligence: A Modern Approach