# Topics Course on Deep Learning for Spring 2016
by Joan Bruna, UC Berkeley, Statistics Department
## Syllabus
- Invariance, stability.
- Variability models (deformation model, stochastic model).
- Scattering.
- Extensions.
- Group formalism.
- Supervised Learning: classification.
- Properties of CNN representations: invertibility, stability, invariance.
- Covariance/invariance: capsules and related models.
- Connections with other models: dictionary learning, LISTA, Random Forests.
- Other tasks: localization, regression.
- Embeddings (DrLIM), inverse problems.
- Extensions to non-Euclidean domains.
- Dynamical systems: RNNs and optimal control.
- Guest Lecture: Wojciech Zaremba (OpenAI)
- Autoencoders (standard, denoising, contractive, etc.)
- Variational Autoencoders
- Generative Adversarial Networks
- Maximum Entropy Distributions
- Open Problems
- Guest Lecture: Soumith Chintala (Facebook AI Research)
- Non-convex optimization theory for deep networks
- Stochastic Optimization
- Attention and Memory Models
- Guest Lecture: Yann Dauphin (Facebook AI Research)

## Schedule

### Lec1 Jan 19: Intro and Logistics

### Lec2 Jan 21: Representations for Recognition: Stability, Variability. Kernel Approaches / Feature Extraction. Properties.
- The Elements of Statistical Learning, Chapter 12, Hastie, Tibshirani, Friedman.
- Understanding Deep Convolutional Networks, S. Mallat.

### Lec3 Jan 26: Groups, Invariants and Filters.

### Lec4 Jan 28: Scattering Convolutional Networks.

### Lec5 Feb 2: Further Scattering: Properties and Extensions.

### Lec6 Feb 4: Convolutional Neural Networks: Geometry and First Properties.
- Deep Learning, Y. LeCun, Y. Bengio & G. Hinton.
- Understanding Deep Convolutional Networks, S. Mallat.
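
To accompany the geometric view of CNNs in this lecture, here is a minimal NumPy sketch of the core operation of a convolutional layer: a single-channel 2D cross-correlation (the operation frameworks usually call "convolution") with 'valid' boundaries. The image size, filter, and values are illustrative only, not taken from the readings.

```python
import numpy as np

def conv2d_valid(x, w):
    """Single-channel 2D cross-correlation with 'valid' boundaries.

    x : (H, W) input image, w : (kh, kw) filter.
    Returns an (H - kh + 1, W - kw + 1) feature map.
    """
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

# Illustrative usage: an 8x8 image filtered by a 3x3 horizontal-gradient filter.
x = np.random.randn(8, 8)
w = np.array([[1., 0., -1.]] * 3)
y = conv2d_valid(x, w)   # shape (6, 6)
```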

### Lec7 Feb 9: Properties of Learnt CNN Representations: Covariance and Invariance, Redundancy, Invertibility.
- Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?, R. Giryes, G. Sapiro, A. Bronstein.
- Intriguing Properties of Neural Networks, C. Szegedy et al.
- Geodesics of Learned Representations, O. Hénaff & E. Simoncelli.
- Inverting Visual Representations with Convolutional Networks, A. Dosovitskiy, T. Brox.
- Visualizing and Understanding Convolutional Networks, M. Zeiler, R. Fergus.

### Lec8 Feb 11: Connections with Other Models (Dictionary Learning, LISTA, Random Forests, CART)
- Proximal Splitting Methods in Signal Processing, Combettes & Pesquet.
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, Beck & Teboulle.
- Learning Fast Approximations of Sparse Coding, K. Gregor & Y. LeCun.
- Task-Driven Dictionary Learning, J. Mairal, F. Bach, J. Ponce.
- Exploiting Generative Models in Discriminative Classifiers, T. Jaakkola & D. Haussler.
- Improving the Fisher Kernel for Large-Scale Image Classification, F. Perronnin et al.
- NetVLAD, R. Arandjelovic et al.
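
To make the LISTA connection concrete, below is a minimal NumPy sketch of the truncated learned-ISTA recursion of Gregor & LeCun, z_{k+1} = h_theta(W_e x + S z_k), with a soft-thresholding nonlinearity. In LISTA the matrices W_e, S and the threshold theta are learned; the ISTA-style initialization and the dimensions used here are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, theta):
    """Elementwise shrinkage h_theta(v) = sign(v) * max(|v| - theta, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def lista_forward(x, W_e, S, theta, n_steps=3):
    """Truncated LISTA encoder: z_{k+1} = h_theta(W_e x + S z_k)."""
    b = W_e @ x
    z = soft_threshold(b, theta)
    for _ in range(n_steps - 1):
        z = soft_threshold(b + S @ z, theta)
    return z

# Illustrative shapes: 64-dim input, 128-dim sparse code, ISTA-inspired init.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128)) / 8.0          # stand-in dictionary
L = 1.1 * np.linalg.norm(D, 2) ** 2               # Lipschitz-style constant
W_e = D.T / L
S = np.eye(128) - (D.T @ D) / L
z = lista_forward(rng.standard_normal(64), W_e, S, theta=0.1)
```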

### Lec9 Feb 16: Other High-Level Tasks: Localization, Regression, Embedding, Inverse Problems.
- Object Detection with Discriminatively Trained Deformable Part Models, Felzenszwalb, Girshick, McAllester and Ramanan, PAMI'10.
- Deformable Part Models are Convolutional Neural Networks, Girshick, Iandola, Darrell and Malik, CVPR'15.
- Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Girshick, Donahue, Darrell and Malik, PAMI'14.
- Graphical Models, Message-Passing Algorithms and Convex Optimization, M. Wainwright.
- Conditional Random Fields as Recurrent Neural Networks, Zheng et al., ICCV'15.
- Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation, Tompson, Jain, LeCun and Bregler, NIPS'14.

### Lec10 Feb 18: Extensions to Non-Euclidean Domains. Representations of Stationary Processes. Properties.
- Dimensionality Reduction by Learning an Invariant Mapping, Hadsell, Chopra, LeCun, '06.
- Deep Metric Learning via Lifted Structured Feature Embedding, Oh Song, Xiang, Jegelka, Savarese, '15.
- Spectral Networks and Locally Connected Networks on Graphs, Bruna, Szlam, Zaremba, LeCun, '14.
- Spatial Transformer Networks, Jaderberg, Simonyan, Zisserman, Kavukcuoglu, '15.
- Intermittent Process Analysis with Scattering Moments, Bruna, Mallat, Bacry, Muzy, '14.
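
For the non-Euclidean extension, the sketch below illustrates the spectral definition of convolution used in the Spectral Networks paper: a signal on the graph is filtered by pointwise multiplication in the eigenbasis of the graph Laplacian. The random graph, the heat-kernel-style filter, and all sizes are illustrative, not taken from the paper.

```python
import numpy as np

# Build a small random undirected graph (illustrative only).
rng = np.random.default_rng(0)
A = (rng.random((10, 10)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T                 # symmetric, no self-loops
L = np.diag(A.sum(1)) - A                      # combinatorial graph Laplacian

# Spectral "convolution": filter a graph signal in the Laplacian eigenbasis.
eigvals, U = np.linalg.eigh(L)                 # U: Laplacian eigenvectors
x = rng.standard_normal(10)                    # signal on the graph nodes
g_hat = np.exp(-0.5 * eigvals)                 # smooth, heat-kernel-like spectral filter
y = U @ (g_hat * (U.T @ x))                    # x filtered on the graph
```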

### Lec11 Feb 23: Guest Lecture (W. Zaremba, OpenAI): Discrete Neural Turing Machines.

### Lec12 Feb 25: Representations of Stationary Processes (contd). Sequential Data: Recurrent Neural Networks.
- Intermittent Process Analysis with Scattering Moments, J. Bruna, Mallat, Bacry and Muzy, Annals of Statistics, '13.
- A Mathematical Motivation for Complex-Valued Convolutional Networks, Tygert et al., Neural Computation '16.
- Texture Synthesis Using Convolutional Neural Networks, Gatys, Ecker, Bethge, NIPS'15.
- A Neural Algorithm of Artistic Style, Gatys, Ecker, Bethge, '15.
- Time Series Analysis and Its Applications, Shumway & Stoffer, Chapter 6.
- Deep Learning, Goodfellow, Bengio, Courville, '16, Chapter 10.

### Lec13 Mar 1: Recurrent Neural Networks (contd). Long Short-Term Memory. Applications.
- Deep Learning, Goodfellow, Bengio, Courville, '16, Chapter 10.
- Generating Sequences with Recurrent Neural Networks, A. Graves.
- The Unreasonable Effectiveness of Recurrent Neural Networks, A. Karpathy.
- The Unreasonable Effectiveness of Character-level Language Models, Y. Goldberg.
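
As a companion to the LSTM material, here is a minimal NumPy sketch of a single LSTM step (input, forget, and output gates plus a candidate update), unrolled over a short sequence. The stacked weight layout, the initialization, and the dimensions are illustrative assumptions rather than any particular paper's parameterization.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step; W maps [x; h_prev] to stacked [i, f, o, g] pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)   # gated cell update
    h = sigmoid(o) * np.tanh(c)                         # gated hidden state
    return h, c

# Illustrative sizes: 8-dim input, 16-dim hidden state, 5-step sequence.
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * 16, 8 + 16)) * 0.1
b = np.zeros(4 * 16)
h = c = np.zeros(16)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(8), h, c, W, b)
```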

### Lec14 Mar 3: Unsupervised Learning: Curse of Dimensionality, Density Estimation. Graphical Models, Latent Variable Models.
- Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks, K. Cho, A. Courville, Y. Bengio.
- Graphical Models, Exponential Families and Variational Inference, M. Wainwright, M. Jordan.

### Lec15 Mar 8: Autoencoders. Variational Inference. Variational Autoencoders.
- Graphical Models, Exponential Families and Variational Inference, Chapter 3, M. Wainwright, M. Jordan.
- Variational Inference with Stochastic Search, J. Paisley, D. Blei, M. Jordan.
- Stochastic Variational Inference, M. Hoffman, D. Blei, C. Wang, J. Paisley.
- Auto-Encoding Variational Bayes, Kingma & Welling.
- Stochastic Backpropagation and Variational Inference in Deep Latent Gaussian Models, D. Rezende, S. Mohamed, D. Wierstra.
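
To fix ideas for the variational autoencoder, below is a minimal NumPy sketch of two ingredients from Kingma & Welling: the reparameterization z = mu + sigma * eps and the closed-form KL term of the ELBO for a diagonal Gaussian posterior against a standard normal prior. The encoder outputs shown are made-up numbers, not from any trained model.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), the analytic KL term of the ELBO."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))

def reparameterize(mu, log_var, rng):
    """Sample z ~ q(z|x) via z = mu + sigma * eps, keeping z differentiable in (mu, sigma)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Illustrative encoder outputs for a 2-dimensional latent variable.
rng = np.random.default_rng(0)
mu, log_var = np.array([0.3, -0.1]), np.array([-1.0, -0.5])
z = reparameterize(mu, log_var, rng)
kl = gaussian_kl(mu, log_var)
```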

### Lec16 Mar 10: Variational Autoencoders (contd). Normalizing Flows. Generative Adversarial Networks.
- Semi-Supervised Learning with Deep Generative Models, Kingma, Rezende, Mohamed, Welling.
- Importance Weighted Autoencoders, Burda, Grosse, Salakhutdinov.
- Variational Inference with Normalizing Flows, Rezende, Mohamed.
- Deep Unsupervised Learning using Nonequilibrium Thermodynamics, Sohl-Dickstein et al.
- Generative Adversarial Networks, Goodfellow et al.

### Lec17 Mar 29: Generative Adversarial Networks (contd).
- Generative Adversarial Networks, Goodfellow et al.
- Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks, Denton, Chintala, Szlam, Fergus.
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford, Metz, Chintala.
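
As a companion to the GAN readings, here is a minimal NumPy sketch of the discriminator loss and the non-saturating generator loss ("maximize log D(G(z))") described in Goodfellow et al.; the logits below are random placeholders standing in for a real discriminator's outputs on real and generated mini-batches.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gan_losses(d_logits_real, d_logits_fake):
    """GAN losses from discriminator logits on real data x and samples G(z)."""
    eps = 1e-8
    d_real, d_fake = sigmoid(d_logits_real), sigmoid(d_logits_fake)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))   # non-saturating "maximize log D(G(z))" variant
    return d_loss, g_loss

# Illustrative mini-batch of 4 real and 4 generated samples.
rng = np.random.default_rng(0)
d_loss, g_loss = gan_losses(rng.standard_normal(4), rng.standard_normal(4))
```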

### Lec18 Mar 31: Maximum Entropy Distributions. Self-Supervised Models (analogies, video prediction, text, word2vec).
- Graphical Models, Exponential Families and Variational Inference, Chapter 3, M. Wainwright, M. Jordan.
- An Introduction to MCMC for Machine Learning, Andrieu, de Freitas, Doucet, Jordan.
- Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, Geman & Geman.
- Distributed Representations of Words and Phrases and Their Compositionality, Mikolov et al.
- word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Embedding Method, Goldberg & Levy.
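
To make the word2vec reading concrete, below is a minimal NumPy sketch of the skip-gram negative-sampling objective for a single (center, context) pair, in the form analyzed by Goldberg & Levy: log sigma(u_o . v_c) + sum_k log sigma(-u_k . v_c). The embedding dimension and the sampled vectors are illustrative.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def sgns_loss(v_center, u_context, U_negative):
    """Negative of the skip-gram-with-negative-sampling objective for one pair."""
    pos = np.log(sigmoid(u_context @ v_center))            # observed (center, context) pair
    neg = np.sum(np.log(sigmoid(-U_negative @ v_center)))  # k sampled "negative" words
    return -(pos + neg)

# Illustrative 50-dimensional embeddings and 5 sampled negative words.
rng = np.random.default_rng(0)
loss = sgns_loss(rng.standard_normal(50) * 0.1,
                 rng.standard_normal(50) * 0.1,
                 rng.standard_normal((5, 50)) * 0.1)
```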

### Lec19 Apr 5: Self-Supervised Models (contd). Non-Convex Optimization. Stochastic Optimization.
- Pixel Recurrent Neural Networks, A. van den Oord, N. Kalchbrenner, K. Kavukcuoglu.
- The Tradeoffs of Large Scale Learning, Bottou & Bousquet.
- Introduction to Statistical Learning Theory, Bousquet, Boucheron, Lugosi.

### Lec20 Apr 7: Guest Lecture (S. Chintala, Facebook AI Research)

### Lec21 Apr 12: Accelerated Gradient Descent, Regularization, Dropout.
- Convex Optimization: Algorithms and Complexity, S. Bubeck.
- Optimization (Simons Big Data Boot Camp), B. Recht.
- The Zen of Gradient Descent, M. Hardt.
- Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, M. Hardt, B. Recht, Y. Singer.
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Hinton et al.
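
For the dropout reading, here is a minimal NumPy sketch of "inverted" dropout on a layer's activations, which rescales at training time so that the test-time pass is the identity; Srivastava et al. present the equivalent formulation that instead scales the weights by (1 - p_drop) at test time. The drop probability and activations below are illustrative.

```python
import numpy as np

def dropout(h, p_drop, rng, train=True):
    """Inverted dropout: zero each unit with probability p_drop, rescale by 1/(1 - p_drop)."""
    if not train:
        return h                                       # test time: identity
    mask = (rng.random(h.shape) >= p_drop).astype(h.dtype)
    return h * mask / (1.0 - p_drop)

# Illustrative usage on a batch of hidden activations.
rng = np.random.default_rng(0)
h = rng.standard_normal((4, 10))
h_train = dropout(h, p_drop=0.5, rng=rng)              # training pass
h_test = dropout(h, p_drop=0.5, rng=rng, train=False)  # test pass
```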

### Lec22 Apr 14: Dropout (contd). Batch Normalization. Tensor Decompositions.
- Dropout Training as Adaptive Regularization, Wager, Wang, Liang.
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe & Szegedy.
- Global Optimality in Tensor Factorization, Deep Learning and Beyond, Haeffele & Vidal.
- On the Expressive Power of Deep Learning: A Tensor Analysis, Cohen, Sharir, Shashua.
- Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks Using Tensor Methods, Janzamin, Sedghi, Anandkumar.
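
To accompany the batch normalization reading, below is a minimal NumPy sketch of the training-time transform from Ioffe & Szegedy: normalize each feature over the mini-batch, then apply a learned scale gamma and shift beta. The running averages used at test time are omitted, and the batch below is made up.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Training-time batch normalization on (batch, features) activations."""
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta            # learned scale and shift

# Illustrative usage on a mini-batch of 32 examples with 8 features.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 8)) * 3.0 + 1.0
y = batch_norm_train(x, gamma=np.ones(8), beta=np.zeros(8))
```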

### Lec23 Apr 19: Guest Lecture (Y. Dauphin, Facebook AI Research)

### Lec24-25: Oral Presentations