A practical, top-down approach, starting with high-level frameworks with a focus on Deep Learning.
- Spend a week on codecademy.com learning Python syntax, the command line, and Git.
- Spend 1-2 weeks using Pandas and Scikit-learn on Kaggle problems using Jupyter Notebook. This gives you an overview of the machine learning mindset and workflow.
- Spend 1 month implementing Keras models on cloud GPUs. This gives you a sense of the deep learning mindset and workflow. Start with the Keras examples.
- Spend 1 month recoding the core concepts in Python with NumPy, including the method of least squares, gradient descent, linear regression, the perceptron, and a vanilla neural network.
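As a rough idea of what recoding the core concepts can look like, here is a minimal sketch that fits the same linear regression twice: once with the closed-form method of least squares and once with batch gradient descent. The synthetic data, the function names, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import numpy as np


def fit_least_squares(X, y):
    """Closed-form least squares: find w minimising ||Xw - y||^2."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def fit_gradient_descent(X, y, lr=0.1, steps=500):
    """Minimise the mean squared error with plain batch gradient descent."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        # Gradient of (1/n) * ||Xw - y||^2 with respect to w.
        grad = (2.0 / n) * X.T @ (X @ w - y)
        w -= lr * grad
    return w


# Synthetic data (an assumption for this sketch): y = 3x + 2 plus small noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=200)

# A column of ones lets the second weight act as the intercept.
X = np.column_stack([x, np.ones_like(x)])

w_ls = fit_least_squares(X, y)
w_gd = fit_gradient_descent(X, y)
print("least squares:   ", w_ls)
print("gradient descent:", w_gd)
```

Both approaches should land on nearly the same slope and intercept. Comparing them is a quick sanity check when you later write gradient code for models that have no closed-form solution.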
By reproducing papers you get a feel for doing actual work in deep learning. I'd recommend reproducing a paper or building a project in the following four areas: CNN, LSTM, GAN, and reinforcement learning (or neuroevolution or neural programming). For the first two areas, reimplement student papers. I'd recommend using Keras and reimplementing each paper from scratch.
The best way to get a feel for the most interesting ideas in Machine Learning is Twitter and Arxiv-sanity. Here is my full list of people I follow on Twitter. These are my favorites: ilyasut, josephreisinger, math_rachel, mustafasuleymn, catherineols, dennybritz, ylecun, jtoy, brohrer, tkasasagi, jeremyjkun, jeffclune, danielgross, karoly_zsolnai, mortendahlcs, Reza_Zadeh, goodfellow_ian, fchollet, michael_nielsen, iamtrask, jeremyphoward, jackclarkSF, ch402, distillpub
Tips for reproducing papers:
- It takes 1-2 months to reproduce a student paper if you work full-time. It takes about 3 weeks to get clarity on the core concepts in each paper.
- Spend the first week reimplementing the core algorithm in Python with NumPy, say an RNN, a neural network, or a CNN.
- Don't follow a step-by-step tutorial or MOOC. Instead, spend a few days scanning every MOOC and tutorial on the topic. This gives you an index of resources you can later dig deeper into. If you follow a step-by-step guide, you end up copy-pasting instead of learning anything.
- GPU access is key. My favorite GPU cloud provider is FloydHub. It's hands down the best option. If you have a GPU budget of less than $100/month, I'd recommend colab.research.google.com. If you have a low budget yet need a lot of compute, I'd recommend paying $100 for producthunt.com's subscription, which gives you $7.5K in AWS cloud credit. Another good bet is to join Google's startup program through one of their partners, or to apply via AIgrant.
- Learning deep learning is very cognitively demanding. To feel a sense of progress, I'd recommend scheduling everything you do. Also, use a Pomodoro timer and block all news, notifications, and social media.
- You don't need mentors. Having gone through a teaching-heavy education system, we often underestimate our capacity to learn by ourselves. Most Q&A sites and forums will offer little help in solving your bugs. The best option is to document the problem you are facing in detail, then research all the unknowns. I'd also suggest reaching out to the author of the paper you are reproducing. Again, if you reproduce student papers, the authors are often happy to answer clarifying questions.
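The first-week exercise from the tips above, reimplementing a core algorithm in NumPy, can start very small. Below is a minimal sketch of the forward pass of a vanilla RNN. The dimensions, the weight names (`W_xh`, `W_hh`, `b_h`), and the random initialisation are assumptions for illustration, and training (backpropagation through time) is deliberately left out.

```python
import numpy as np


def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over one sequence.

    inputs has shape (seq_len, input_dim); the result has shape
    (seq_len, hidden_dim) and holds the hidden state after each step.
    """
    h = np.zeros(W_hh.shape[0])  # initial hidden state
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)


# Hypothetical sizes and randomly initialised weights for the sketch.
rng = np.random.default_rng(0)
seq_len, input_dim, hidden_dim = 5, 3, 4
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

inputs = rng.normal(size=(seq_len, input_dim))
states = rnn_forward(inputs, W_xh, W_hh, b_h)
print(states.shape)  # (5, 4)
```

Writing the loop by hand like this makes it clear that the recurrence is just a matrix multiply and a nonlinearity applied once per time step, which is the part a high-level framework hides.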
- Implementing Keras models (I started with TFlearn, but I'd highly recommend using Keras instead)
- Recoding the core concepts in python
- My first paper using CNN
- My first paper using LSTM
- My first paper using GAN (in progress)
- My first paper using RL/evolution (in progress)
- Ongoing via my Twitter feed
If you have suggestions or questions, create an issue or ping me on Twitter.