Topics course Mathematics of Deep Learning, NYU, Spring 19. CSCI-GA 3033.
-
Mondays from 7:10pm-9:00pm. CIWW 102
-
Tutoring Session with Parallel Curricula (optional): Fridays 11am-12:15pm CIWW 101.
-
Piazza: sign-up here
-
Office Hours: Tuesdays 9:30am-11:00am
Lecture Instructor: Joan Bruna (bruna@cims.nyu.edu)
Tutor (Parallel Curricula): Luca Venturi (lv800@nyu.edu)
This graduate-level topics course aims to offer a glimpse into the emerging mathematical questions around Deep Learning. In particular, we will focus on the different geometrical aspects surrounding these models, from input geometric stability priors to the geometry of optimization, generalization and learning. We will cover both the background and the current open problems.
Besides the lectures, we will also run an optional parallel curriculum, following the Depth First Learning methodology. We will start with an inverse curriculum on the Neural ODE paper by Chen et al.
-
Introduction: the Curse of Dimensionality
-
Part I: Geometry of Data
- Euclidean Geometry: transportation metrics, CNNs, scattering.
- Non-Euclidean Geometry: Graph Neural Networks.
- Unsupervised Learning under Geometric Priors (implicit vs explicit models, microcanonical models, transportation metrics).
- Applications and Open Problems: adversarial examples, graph inference, inverse problems.
-
Part II: Geometry of Optimization and Generalization
- Stochastic Optimization (Robbins & Monro, Convergence of SGD)
- Stochastic Differential Equations (Fokker-Planck, Gradient Flow, Langevin Dynamics, links with SGD; open problems)
- Dynamics of Neural Network Optimization (Mean Field Models using Optimal Transport, Kernel Methods)
- Landscape of Deep Learning Optimization (Tensor/Matrix factorization, Deep Nets; open problems).
- Generalization in Deep Learning.
-
Part III (time permitting): Open questions on Reinforcement Learning
Multivariate Calculus, Linear Algebra, Probability and Statistics at a solid undergraduate level.
Notions of Harmonic Analysis, Differential Geometry and Stochastic Calculus are nice-to-have, but not essential.
The course will be graded on a final project, consisting of an in-depth survey of a topic related to the syllabus, plus a participation grade. The detailed abstract of the project will be graded at the mid-term.
The Final Project is due May 1st by email to the instructors.
Week | Lecture Date | Topic | References |
---|---|---|---|
1 | 1/23 | Guest Lecture: Arthur Szlam (Facebook) | References |
2 | 1/30 | Lec2 Euclidean Geometric Stability. Slides | References |
3 | 2/6 | Guest Lecture: Leon Bottou (Facebook/NYU) Slides | References |
4 | 2/13 | Lec3 Scattering Transforms and CNNs Slides | References |
5 | 2/20 | Lec4 Non-Euclidean Geometric Stability. Gromov-Hausdorff distances. Graph Neural Nets Slides | References |
6 | 2/27 | Lec5 Graph Neural Network Applications Slides | References |
7 | 3/6 | Lec6 Unsupervised Learning under Geometric Priors. Implicit vs Explicit models. Optimal Transport models. Microcanonical Models. Open Problems Slides | References |
8 | 3/13 | Spring Break | References |
9 | 3/20 | Lec7 Discrete vs Continuous Time Optimization. The Convex Case. Slides | References |
10 | 3/27 | Lec8 Discrete vs Continuous Time Optimization. Stochastic and Non-convex case Slides | References |
11 | 4/3 | Lec9 Gradient Descent on Non-convex Optimization. Slides | References |
12 | 4/10 | Lec10 Gradient Descent on Non-convex Optimization. Escaping Saddle Points efficiently. Slides | References |
13 | 4/17 | Lec11 Landscape of Deep Learning Optimization. Spin Glasses, Kac-Rice, RKHS, Topology. Slides | References |
14 | 4/24 | Lec12 Guest Lecture: Behnam Neyshabur (IAS/NYU): Generalization in Deep Learning Slides | References |
15 | 5/1 | Lec13 Stability. Open Problems. | References |
NeuralODE: Living document
- Class 1: Numerical solution of ODEs I
  - Motivation: ODEs are used to mathematically model many natural processes and phenomena. Their numerical simulation is one of the main topics in numerical analysis and is of fundamental importance in the applied sciences.
  - Required Reading:
    - Sections 12.1-4 from An Introduction to Numerical Analysis (‘INA’)
    - Sections 11.1-3 from Numerical Mathematics (‘NM’)
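
As a rough illustration of what "numerical solution of ODEs" means in practice, here is a minimal sketch of the forward Euler method in Python. Choosing forward Euler is an assumption made for illustration; this page does not name a specific scheme. The names `euler_solve` and `decay` are hypothetical, and the test problem dy/dt = -y with y(0) = 1 was picked only because its exact solution exp(-t) is known.

```python
import math


def euler_solve(f, t0, y0, t_end, n_steps):
    """Integrate dy/dt = f(t, y) from t0 to t_end with n_steps forward Euler steps."""
    h = (t_end - t0) / n_steps  # fixed step size
    t, y = t0, y0
    for _ in range(n_steps):
        y = y + h * f(t, y)  # follow the tangent line for one step
        t = t + h
    return y


def decay(t, y):
    """Right-hand side of the illustrative test problem dy/dt = -y."""
    return -y


# Compare against the exact solution y(1) = exp(-1) at two resolutions.
exact = math.exp(-1.0)
err_coarse = abs(euler_solve(decay, 0.0, 1.0, 1.0, 100) - exact)
err_fine = abs(euler_solve(decay, 0.0, 1.0, 1.0, 200) - exact)
print(err_coarse, err_fine)
```

Halving the step size roughly halves the error here, which is the first-order convergence one expects from forward Euler on a smooth problem.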