Deep Learning Resources

Books, courses, videos and blogs, mostly about Deep Learning.

  1. Books
  2. Courses
  3. Videos
  4. External resources
  5. Blogs
  6. Appendix: Python

Books

Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high barrier to entry. While many components are familiar, successfully training an SSL method involves a dizzying set of choices, from the pretext tasks to the training hyper-parameters. Our goal is to lower the barrier to entry into SSL research by laying the foundations and latest SSL recipes in the style of a cookbook. We hope to empower the curious researcher to navigate the terrain of methods, understand the role of the various knobs, and gain the know-how required to explore how delicious SSL can be.
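
As a taste of what these recipes involve, here is a minimal, hypothetical sketch of one common pretext task, contrastive learning with an InfoNCE-style loss; the function and shapes are illustrative and not taken from the cookbook.

```python
# Illustrative contrastive SSL objective (InfoNCE / NT-Xent style), not cookbook code.
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """z1, z2: (N, D) embeddings of two augmented views of the same N images."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature    # (N, N) pairwise similarities
    labels = torch.arange(z1.size(0))     # matching views sit on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))
```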

Deep Learning provides an authoritative, accessible, and up-to-date treatment of the subject, covering all the key topics along with recent advances and cutting-edge concepts. From machine learning basics to advanced models, each concept is presented in lay terms and then detailed precisely in mathematical form and illustrated visually.

"The Principles of Deep Learning Theory: An Effective Theory Approach to Understanding Neural Networks" is a collaboration between Sho Yaida of Facebook AI Research, Dan Roberts of MIT and Salesforce, and Boris Hanin at Princeton. At a fundamental level, the book provides a theoretical framework for understanding DNNs from first principles. This book will be published by Cambridge University Press in early 2022 and the manuscript is now publicly available.

This draft textbook is extracted from lecture notes from a class taught online during Fall 2020, with an extra pass during Spring 2021. The goal is to present old and recent results in learning theory, for the most widely-used learning architectures. This class is geared towards theory-oriented students or students who want to acquire a basic mathematical understanding of machine learning algorithms.

Hundreds of fully-solved problems designed to provide an overview of the field of AI and to rehearse interview topics. Available at https://www.interviews.ai/

A comprehensive introduction to machine learning that uses probabilistic models and inference as a unifying approach. All the books are listed at http://mlbayes.ai

The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free.

Deep Learning with Python introduces the field of deep learning using the Python language and the powerful Keras library. Written by Keras creator and Google AI researcher François Chollet, this book builds your understanding through intuitive explanations and practical examples.
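
A minimal Keras example in the style the book teaches, illustrative rather than taken from the text:

```python
# A small dense classifier on MNIST, sketched with the Keras Sequential API.
from tensorflow import keras

(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0

model = keras.Sequential([
    keras.layers.Input(shape=(28 * 28,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=128)
```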

AI is transforming numerous industries. Machine Learning Yearning, a free ebook from Andrew Ng, teaches you how to structure Machine Learning projects. This book is focused not on teaching you ML algorithms, but on how to make ML algorithms work.

Deep Learning with PyTorch teaches you how to implement deep learning algorithms with Python and PyTorch. This book takes you into a fascinating case study: building an algorithm capable of detecting malignant lung tumors using CT scans. As the authors guide you through this real example, you'll discover just how effective and fun PyTorch can be. After a quick introduction to the deep learning landscape, you'll explore the use of pre-trained networks and start sharpening your skills on working with tensors. You'll find out how to represent the most common types of data with tensors and how to build and train neural networks from scratch on practical examples, focusing on images and sequences.
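
To give a flavor of the tensor-and-autograd workflow the book builds on, here is a tiny illustrative PyTorch training loop on random data (not the book's tumor-detection code):

```python
# Minimal PyTorch training loop: tensors, a small network, autograd, SGD.
import torch
from torch import nn

x = torch.randn(64, 3)    # toy inputs
y = torch.randn(64, 1)    # toy regression targets
model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

for step in range(100):
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()       # gradients computed by autograd
    opt.step()
```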

These notebooks cover an introduction to deep learning, fastai, and PyTorch.

Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics.
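
For a concrete feel of GP regression, here is a minimal NumPy sketch of the posterior mean under an RBF kernel on noiseless toy data; it is illustrative only, not code from the book.

```python
# GP regression posterior mean with an RBF kernel on toy 1-D data.
import numpy as np

def rbf(a, b, lengthscale=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

X = np.linspace(0, 5, 20)                  # training inputs
y = np.sin(X)                              # noiseless observations
Xs = np.linspace(0, 5, 100)                # test inputs

K = rbf(X, X) + 1e-6 * np.eye(len(X))      # small jitter for numerical stability
mean = rbf(Xs, X) @ np.linalg.solve(K, y)  # posterior mean at the test inputs
```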

Courses

This site collects resources to learn Deep Learning in the form of Modules available through the sidebar on the left. As a student, you can walk through the modules at your own pace and interact with others thanks to the associated digital platforms. Then we hope you'll become a contributor by adding modules to this site!

The course covers the basics of Deep Learning, with a focus on applications.

Build predictive models with scikit-learn and gain a practical understanding of the strengths and limitations of machine learning! Hosted by France Université Numérique (FUN).
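
A minimal example in the spirit of the course, using a standard scikit-learn pipeline (illustrative, not course material):

```python
# Scaling + logistic regression in a pipeline, scored by 5-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression())
print(cross_val_score(model, X, y, cv=5).mean())
```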

This course is a thorough introduction to deep learning, with examples in the PyTorch framework.

There are many great courses to learn how to train deep neural networks. However, training the model is just one part of shipping a deep learning project. This course teaches full-stack production deep learning.

This course concerns the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition.

Google Developers Codelabs provide a guided, tutorial, hands-on coding experience. Most codelabs will step you through the process of building a small application, or adding a new feature to an existing application. They cover a wide range of topics such as Android Wear, Google Compute Engine, Project Tango, and Google APIs on iOS.

A tutorial about PyTorch, by Soumith Chintala (Facebook AI Research).

A self-study guide for aspiring machine learning practitioners. Machine Learning Crash Course features a series of lessons with video lectures, real-world case studies, and hands-on practice exercises.

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.
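
For a flavor of the end-to-end models such a course builds, here is a minimal, hypothetical convolutional classifier in PyTorch; CIFAR-like 32x32 RGB inputs are assumed, and this is not the course's assignment code.

```python
# A small convolutional image classifier: two conv/pool stages, then a linear head.
import torch
from torch import nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),               # 10-way classification for 32x32 inputs
)
logits = cnn(torch.randn(4, 3, 32, 32))      # (batch, classes)
```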

Berkeley's course will cover two areas of deep learning in which labeled data is not required: Deep Generative Models and Self-supervised Learning. Recent advances in generative models have made it possible to realistically model high-dimensional raw data such as natural images, audio waveforms and text corpora. Strides in self-supervised learning have started to close the gap between supervised representation learning and unsupervised representation learning in terms of fine-tuning to unseen tasks. This course will cover the theoretical foundations of these topics as well as their newly enabled applications.

Lectures by Gilles Louppe, researcher in AI and contributor to scikit-learn, at ULiège.

MIT's introductory course on deep learning methods with applications to computer vision, natural language processing, biology, and more! Students will gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow.

Natural language processing (NLP) is one of the most important technologies of the information age, and a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertising, emails, customer service, language translation, medical reports, etc. In recent years, Deep Learning approaches have obtained very high performance across many different NLP tasks, using single end-to-end neural models that do not require traditional, task-specific feature engineering. In this course, students will gain a thorough introduction to cutting-edge research in Deep Learning for NLP. Through lectures, assignments and a final project, students will learn the necessary skills to design, implement, and understand their own neural network models. This year, CS224n will be taught for the first time using PyTorch rather than TensorFlow (as in previous years).
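
As a small illustration of the single end-to-end neural models mentioned above, here is a hypothetical bag-of-words text classifier in PyTorch (toy vocabulary size and batch, not CS224n code):

```python
# Averaged word embeddings feeding a linear classifier: a minimal neural NLP model.
import torch
from torch import nn

vocab_size, embed_dim, num_classes = 10_000, 64, 2
model = nn.Sequential(
    nn.EmbeddingBag(vocab_size, embed_dim),        # averages word vectors per text
    nn.Linear(embed_dim, num_classes),
)
token_ids = torch.randint(0, vocab_size, (4, 12))  # 4 texts of 12 token ids each
logits = model(token_ids)                          # (4, num_classes)
```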

This course starts by introducing probabilistic graphical models from the very basics and concludes by explaining from first principles the variational auto-encoder, an important probabilistic model that is also one of the most influential recent results in deep learning.
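
As a pointer to where the course lands, here is a minimal sketch of the VAE training objective (the negative ELBO) with a Gaussian encoder and Bernoulli decoder, written in PyTorch for illustration:

```python
# Negative ELBO: reconstruction term plus KL divergence from the prior.
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar):
    """x, x_recon: (N, D) values in [0, 1]; mu, logvar: (N, Z) encoder outputs."""
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")   # -E[log p(x|z)]
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL(q(z|x) || p(z))
    return recon + kl
```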

The Numerical Tours of Data Sciences, by Gabriel Peyré, gather Python experiments to explore modern mathematical data sciences. They cover data science in a broad sense, including imaging, machine learning, computer vision and computer graphics, and showcase applications of numerical and mathematical methods such as convex optimization, PDEs, optimal transport, inverse problems and sparsity. The tours are complemented by course slides detailing the theory and the algorithms.
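
As a tiny example of the kind of numerical experiment the tours contain, here is gradient descent on a convex quadratic (illustrative, not taken from the tours):

```python
# Gradient descent on f(x) = 0.5 x'Ax - b'x, a strictly convex quadratic.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])      # symmetric positive definite
b = np.array([1.0, 1.0])
x = np.zeros(2)
for _ in range(100):
    x -= 0.1 * (A @ x - b)                  # step along the negative gradient
print(x, np.linalg.solve(A, b))             # x approaches the minimizer A^{-1} b
```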

Videos

Videos about deep learning. Please visit https://www.dataflowr.com/ for more material.

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This lecture collection is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. From this lecture collection, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Instructors: Fei-Fei Li (http://vision.stanford.edu/feifeili/), Justin Johnson (http://cs.stanford.edu/people/jcjohns/), and Serena Yeung (http://ai.stanford.edu/~syyeung/).

MIT's introductory course on deep learning methods with applications to computer vision, natural language processing, biology, and more! Students will gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow.

Watch the lectures from DeepMind research lead David Silver's course on reinforcement learning, taught at University College London. Slides, assignments, exams, and more information about the course are available at http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html. To learn more about DeepMind's ongoing research with reinforcement learning, visit the DeepMind blog (https://deepmind.com/blog/deep-reinforcement-learning/), and for a powerful illustration of reinforcement learning in action, watch the AlphaGo documentary (https://www.alphagomovie.com/).

This course, taught originally at UCL and recorded for online access, has two interleaved parts that converge towards the end of the course. One part is on machine learning with deep neural networks, the other part is about prediction and control using reinforcement learning. The two strands come together when we discuss deep reinforcement learning, where deep neural networks are trained as function approximators in a reinforcement learning setting. The deep learning stream of the course will cover a short introduction to neural networks and supervised learning with TensorFlow, followed by lectures on convolutional neural networks, recurrent neural networks, end-to-end and energy-based learning, optimization methods, unsupervised learning as well as attention and memory. Possible application areas to be discussed include object recognition and natural language processing. The reinforcement learning stream will cover Markov decision processes, planning by dynamic programming, model-free prediction and control, value function approximation, policy gradient methods, integration of learning and planning, and the exploration/exploitation dilemma. Possible applications to be discussed include learning to play classic board games as well as video games.
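
As a concrete anchor for the model-free control material, here is a minimal tabular Q-learning sketch; the toy environment interface is hypothetical and purely illustrative.

```python
# Tabular Q-learning with an epsilon-greedy policy on a stand-in environment.
import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.99, 0.1           # learning rate, discount, exploration

def step(s, a):
    """Stand-in transition function: returns (next_state, reward)."""
    return (s + 1) % n_states, float(a == 0)

s = 0
for _ in range(1000):
    a = np.random.randint(n_actions) if np.random.rand() < eps else int(Q[s].argmax())
    s_next, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])   # TD(0) update
    s = s_next
```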

Most machine learning algorithms involve optimizing a single set of parameters to decrease a single cost function. In adversarial machine learning, two or more "players" each adapt their own parameters to decrease their own cost, in competition with the other players. In some adversarial machine learning algorithms, the algorithm designer contrives this competition between two machine learning models in order to produce a beneficial side effect. For example, the generative adversarial networks framework involves a contrived conflict between a generator network and a discriminator network that results in the generator learning to produce realistic data samples. In other contexts, adversarial machine learning models a real conflict, for example, between spam detectors and spammers. In general, moving machine learning from optimization and a single cost to game theory and multiple costs has led to new insights in many application areas.

Generative adversarial networks (GANs) are a recently introduced class of generative models, designed to produce realistic samples. This tutorial is intended to be accessible to an audience who has no experience with GANs, and should prepare the audience to make original research contributions applying GANs or improving the core GAN algorithms. GANs are universal approximators of probability distributions. Such models generally have an intractable log-likelihood gradient, and require approximations such as Markov chain Monte Carlo or variational lower bounds to make learning feasible. GANs avoid using either of these classes of approximations. The learning process consists of a game between two adversaries: a generator network that attempts to produce realistic samples, and a discriminator network that attempts to identify whether samples originated from the training data or from the generative model. At the Nash equilibrium of this game, the generator network reproduces the data distribution exactly, and the discriminator network cannot distinguish samples from the model from training data. Both networks can be trained using stochastic gradient descent, with exact gradients computed by backpropagation.
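
As a hypothetical illustration of this two-player game, here is a minimal sketch of a single GAN training step on toy data, using the common non-saturating generator loss; it is not code from the tutorial.

```python
# One GAN step: update the discriminator, then the generator.
import torch
from torch import nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 2) + 3.0              # stand-in "data" distribution
fake = G(torch.randn(64, 8))

# Discriminator: push real samples toward 1, generated samples toward 0.
d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator: fool the discriminator (non-saturating loss).
g_loss = bce(D(fake), torch.ones(64, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```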

External resources

A list of educational resources curated by DeepMind Scientists and Engineers for students interested in learning more about artificial intelligence, machine learning and other related topics.

This site organizes data science challenges built on data provided by public services, companies and laboratories. It is managed by the Data team of the École Normale Supérieure of Paris in partnership with the Collège de France, and is supported by the CFM chair and the PRAIRIE Institute.

Blogs

Distill is an academic journal in the area of Machine Learning. The distinguishing trait of a Distill article is outstanding communication and a dedication to human understanding. Distill articles often, but not always, use interactive media.

Francis Bach is a researcher at INRIA in the Computer Science department of Ecole Normale Supérieure, in Paris, France. He has been working on machine learning since 2000, with a focus on algorithmic and theoretical contributions, in particular in optimization.

Facebook is an American multinational technology conglomerate based in Menlo Park, California; its AI research lab publishes a blog covering machine learning research.

OpenAI is an AI research and deployment company based in San Francisco, California.

The website of Gwern Branwen, who writes about psychology, statistics, and technology, and is best known for work on the darknet markets & Bitcoin, blinded self-experiments & Quantified Self analyses, dual n-back & spaced repetition, and modafinil.

Appendix: Python

A tool-based introduction with Python.

A course in intermediate Python for a beginner ready to move up.