# Topics Course on Deep Learning for Spring 2016
by Joan Bruna, UC Berkeley, Statistics Department
## Syllabus
- Invariance, stability.
- Variability models (deformation model, stochastic model).
- Scattering.
- Extensions.
- Group formalism.
- Supervised Learning: classification.
- Properties of CNN representations: invertibility, stability, invariance.
- Covariance/invariance: capsules and related models.
- Connections with other models: dictionary learning, LISTA, Random Forests.
- Other tasks: localization, regression.
- Embeddings (DrLIM), inverse problems.
- Extensions to non-Euclidean domains.
- Dynamical systems: RNNs and optimal control.
- Guest Lecture: Wojciech Zaremba (OpenAI)
- Autoencoders (standard, denoising, contractive, etc.)
- Variational Autoencoders
- Generative Adversarial Networks
- Maximum Entropy Distributions
- Open Problems
- Guest Lecture: Soumith Chintala (Facebook AI Research)
- Non-convex optimization theory for deep networks
- Stochastic Optimization
- Attention and Memory Models
- Guest Lecture: Yann Dauphin (Facebook AI Research)

## Schedule

### Lec1 Jan 19: Intro and Logistics

### Lec2 Jan 21: Representations for Recognition: Stability, Variability. Kernel Approaches / Feature Extraction. Properties.
- The Elements of Statistical Learning, Chapter 12, Hastie, Tibshirani, Friedman.
- Understanding Deep Convolutional Networks, S. Mallat.

### Lec3 Jan 26: Groups, Invariants and Filters.

### Lec4 Jan 28: Scattering Convolutional Networks.

### Lec5 Feb 2: Further Scattering: Properties and Extensions.

### Lec6 Feb 4: Convolutional Neural Networks: Geometry and First Properties.
- Deep Learning, Y. LeCun, Y. Bengio & G. Hinton.
- Understanding Deep Convolutional Networks, S. Mallat.
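
To accompany the geometric view of CNNs in this lecture, here is a minimal NumPy sketch of the core operation of a convolutional layer: a single-channel 2D cross-correlation (the operation frameworks usually call "convolution") with 'valid' boundaries. The image size, filter, and values are illustrative only, not taken from the readings.

```python
import numpy as np

def conv2d_valid(x, w):
    """Single-channel 2D cross-correlation with 'valid' boundaries.

    x : (H, W) input image, w : (kh, kw) filter.
    Returns an (H - kh + 1, W - kw + 1) feature map.
    """
    H, W = x.shape
    kh, kw = w.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

# Illustrative usage: an 8x8 image filtered by a 3x3 horizontal-gradient filter.
x = np.random.randn(8, 8)
w = np.array([[1., 0., -1.]] * 3)
y = conv2d_valid(x, w)   # shape (6, 6)
```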

### Lec7 Feb 9: Properties of Learnt CNN Representations: Covariance and Invariance, Redundancy, Invertibility.
- Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?, R. Giryes, G. Sapiro, A. Bronstein.
- Intriguing Properties of Neural Networks, C. Szegedy et al.
- Geodesics of Learned Representations, O. Hénaff & E. Simoncelli.
- Inverting Visual Representations with Convolutional Networks, A. Dosovitskiy, T. Brox.
- Visualizing and Understanding Convolutional Networks, M. Zeiler, R. Fergus.

### Lec8 Feb 11: Connections with Other Models (Dictionary Learning, LISTA, Random Forests, CART)
- Proximal Splitting Methods in Signal Processing, Combettes & Pesquet.
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, Beck & Teboulle.
- Learning Fast Approximations of Sparse Coding, K. Gregor & Y. LeCun.
- Task-Driven Dictionary Learning, J. Mairal, F. Bach, J. Ponce.
- Exploiting Generative Models in Discriminative Classifiers, T. Jaakkola & D. Haussler.
- Improving the Fisher Kernel for Large-Scale Image Classification, F. Perronnin et al.
- NetVLAD, R. Arandjelovic et al.
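
To make the LISTA connection concrete, below is a minimal NumPy sketch of the truncated learned-ISTA recursion of Gregor & LeCun, z_{k+1} = h_theta(W_e x + S z_k), with a soft-thresholding nonlinearity. In LISTA the matrices W_e, S and the threshold theta are learned; the ISTA-style initialization and the dimensions used here are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, theta):
    """Elementwise shrinkage h_theta(v) = sign(v) * max(|v| - theta, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def lista_forward(x, W_e, S, theta, n_steps=3):
    """Truncated LISTA encoder: z_{k+1} = h_theta(W_e x + S z_k)."""
    b = W_e @ x
    z = soft_threshold(b, theta)
    for _ in range(n_steps - 1):
        z = soft_threshold(b + S @ z, theta)
    return z

# Illustrative shapes: 64-dim input, 128-dim sparse code, ISTA-inspired init.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128)) / 8.0          # stand-in dictionary
L = 1.1 * np.linalg.norm(D, 2) ** 2               # Lipschitz-style constant
W_e = D.T / L
S = np.eye(128) - (D.T @ D) / L
z = lista_forward(rng.standard_normal(64), W_e, S, theta=0.1)
```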

### Lec9 Feb 16: Other High-Level Tasks: Localization, Regression, Embedding, Inverse Problems.
- Object Detection with Discriminatively Trained Deformable Part Models, Felzenszwalb, Girshick, McAllester and Ramanan, PAMI'10.
- Deformable Part Models are Convolutional Neural Networks, Girshick, Iandola, Darrell and Malik, CVPR'15.
- Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Girshick, Donahue, Darrell and Malik, PAMI'14.
- Graphical Models, Message-Passing Algorithms and Convex Optimization, M. Wainwright.
- Conditional Random Fields as Recurrent Neural Networks, Zheng et al., ICCV'15.
- Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation, Tompson, Jain, LeCun and Bregler, NIPS'14.

### Lec10 Feb 18: Extensions to Non-Euclidean Domains. Representations of Stationary Processes. Properties.
- Dimensionality Reduction by Learning an Invariant Mapping, Hadsell, Chopra, LeCun, '06.
- Deep Metric Learning via Lifted Structured Feature Embedding, Oh Song, Xiang, Jegelka, Savarese, '15.
- Spectral Networks and Locally Connected Networks on Graphs, Bruna, Szlam, Zaremba, LeCun, '14.
- Spatial Transformer Networks, Jaderberg, Simonyan, Zisserman, Kavukcuoglu, '15.
- Intermittent Process Analysis with Scattering Moments, Bruna, Mallat, Bacry, Muzy, '14.
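
For the non-Euclidean extension, the sketch below illustrates the spectral definition of convolution used in the Spectral Networks paper: a signal on the graph is filtered by pointwise multiplication in the eigenbasis of the graph Laplacian. The random graph, the heat-kernel-style filter, and all sizes are illustrative, not taken from the paper.

```python
import numpy as np

# Build a small random undirected graph (illustrative only).
rng = np.random.default_rng(0)
A = (rng.random((10, 10)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T                 # symmetric, no self-loops
L = np.diag(A.sum(1)) - A                      # combinatorial graph Laplacian

# Spectral "convolution": filter a graph signal in the Laplacian eigenbasis.
eigvals, U = np.linalg.eigh(L)                 # U: Laplacian eigenvectors
x = rng.standard_normal(10)                    # signal on the graph nodes
g_hat = np.exp(-0.5 * eigvals)                 # smooth, heat-kernel-like spectral filter
y = U @ (g_hat * (U.T @ x))                    # x filtered on the graph
```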

### Lec11 Feb 23: Guest Lecture (W. Zaremba, OpenAI): Discrete Neural Turing Machines.

### Lec12 Feb 25: Representations of Stationary Processes (contd). Sequential Data: Recurrent Neural Networks.
- Intermittent Process Analysis with Scattering Moments, J. Bruna, Mallat, Bacry and Muzy, Annals of Statistics, '13.
- A Mathematical Motivation for Complex-Valued Convolutional Networks, Tygert et al., Neural Computation '16.
- Texture Synthesis Using Convolutional Neural Networks, Gatys, Ecker, Bethge, NIPS'15.
- A Neural Algorithm of Artistic Style, Gatys, Ecker, Bethge, '15.
- Time Series Analysis and Its Applications, Shumway & Stoffer, Chapter 6.
- Deep Learning, Goodfellow, Bengio, Courville, '16, Chapter 10.

### Lec13 Mar 1: Recurrent Neural Networks (contd). Long Short-Term Memory. Applications.
- Deep Learning, Goodfellow, Bengio, Courville, '16, Chapter 10.
- Generating Sequences with Recurrent Neural Networks, A. Graves.
- The Unreasonable Effectiveness of Recurrent Neural Networks, A. Karpathy.
- The Unreasonable Effectiveness of Character-level Language Models, Y. Goldberg.
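
As a companion to the LSTM material, here is a minimal NumPy sketch of a single LSTM step (input, forget, and output gates plus a candidate update), unrolled over a short sequence. The stacked weight layout, the initialization, and the dimensions are illustrative assumptions rather than any particular paper's parameterization.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step; W maps [x; h_prev] to stacked [i, f, o, g] pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)   # gated cell update
    h = sigmoid(o) * np.tanh(c)                         # gated hidden state
    return h, c

# Illustrative sizes: 8-dim input, 16-dim hidden state, 5-step sequence.
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * 16, 8 + 16)) * 0.1
b = np.zeros(4 * 16)
h = c = np.zeros(16)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(8), h, c, W, b)
```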

### Lec14 Mar 3: Unsupervised Learning: Curse of Dimensionality, Density Estimation. Graphical Models, Latent Variable Models.
- Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks, K. Cho, A. Courville, Y. Bengio.
- Graphical Models, Exponential Families and Variational Inference, M. Wainwright, M. Jordan.

### Lec15 Mar 8: Autoencoders. Variational Inference. Variational Autoencoders.
- Graphical Models, Exponential Families and Variational Inference, Chapter 3, M. Wainwright, M. Jordan.
- Variational Inference with Stochastic Search, J. Paisley, D. Blei, M. Jordan.
- Stochastic Variational Inference, M. Hoffman, D. Blei, C. Wang, J. Paisley.
- Auto-Encoding Variational Bayes, Kingma & Welling.
- Stochastic Backpropagation and Variational Inference in Deep Latent Gaussian Models, D. Rezende, S. Mohamed, D. Wierstra.
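
To fix ideas for the variational autoencoder, below is a minimal NumPy sketch of two ingredients from Kingma & Welling: the reparameterization z = mu + sigma * eps and the closed-form KL term of the ELBO for a diagonal Gaussian posterior against a standard normal prior. The encoder outputs shown are made-up numbers, not from any trained model.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), the analytic KL term of the ELBO."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))

def reparameterize(mu, log_var, rng):
    """Sample z ~ q(z|x) via z = mu + sigma * eps, keeping z differentiable in (mu, sigma)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Illustrative encoder outputs for a 2-dimensional latent variable.
rng = np.random.default_rng(0)
mu, log_var = np.array([0.3, -0.1]), np.array([-1.0, -0.5])
z = reparameterize(mu, log_var, rng)
kl = gaussian_kl(mu, log_var)
```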

### Lec16 Mar 10: Variational Autoencoders (contd). Normalizing Flows. Generative Adversarial Networks.
- Semi-Supervised Learning with Deep Generative Models, Kingma, Rezende, Mohamed, Welling.
- Importance Weighted Autoencoders, Burda, Grosse, Salakhutdinov.
- Variational Inference with Normalizing Flows, Rezende, Mohamed.
- Deep Unsupervised Learning using Nonequilibrium Thermodynamics, Sohl-Dickstein et al.
- Generative Adversarial Networks, Goodfellow et al.

### Lec17 Mar 29: Generative Adversarial Networks (contd).
- Generative Adversarial Networks, Goodfellow et al.
- Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks, Denton, Chintala, Szlam, Fergus.
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford, Metz, Chintala.
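
As a companion to the GAN readings, here is a minimal NumPy sketch of the discriminator loss and the non-saturating generator loss ("maximize log D(G(z))") described in Goodfellow et al.; the logits below are random placeholders standing in for a real discriminator's outputs on real and generated mini-batches.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gan_losses(d_logits_real, d_logits_fake):
    """GAN losses from discriminator logits on real data x and samples G(z)."""
    eps = 1e-8
    d_real, d_fake = sigmoid(d_logits_real), sigmoid(d_logits_fake)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))   # non-saturating "maximize log D(G(z))" variant
    return d_loss, g_loss

# Illustrative mini-batch of 4 real and 4 generated samples.
rng = np.random.default_rng(0)
d_loss, g_loss = gan_losses(rng.standard_normal(4), rng.standard_normal(4))
```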

### Lec18 Mar 31: Maximum Entropy Distributions. Self-Supervised Models (analogies, video prediction, text, word2vec).
- Graphical Models, Exponential Families and Variational Inference, Chapter 3, M. Wainwright, M. Jordan.
- An Introduction to MCMC for Machine Learning, Andrieu, de Freitas, Doucet, Jordan.
- Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, Geman & Geman.
- Distributed Representations of Words and Phrases and Their Compositionality, Mikolov et al.
- word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Embedding Method, Goldberg & Levy.
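
To make the word2vec reading concrete, below is a minimal NumPy sketch of the skip-gram negative-sampling objective for a single (center, context) pair, in the form analyzed by Goldberg & Levy: log sigma(u_o . v_c) + sum_k log sigma(-u_k . v_c). The embedding dimension and the sampled vectors are illustrative.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def sgns_loss(v_center, u_context, U_negative):
    """Negative of the skip-gram-with-negative-sampling objective for one pair."""
    pos = np.log(sigmoid(u_context @ v_center))            # observed (center, context) pair
    neg = np.sum(np.log(sigmoid(-U_negative @ v_center)))  # k sampled "negative" words
    return -(pos + neg)

# Illustrative 50-dimensional embeddings and 5 sampled negative words.
rng = np.random.default_rng(0)
loss = sgns_loss(rng.standard_normal(50) * 0.1,
                 rng.standard_normal(50) * 0.1,
                 rng.standard_normal((5, 50)) * 0.1)
```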

### Lec19 Apr 5: Self-Supervised Models (contd). Non-Convex Optimization. Stochastic Optimization.
- Pixel Recurrent Neural Networks, A. van den Oord, N. Kalchbrenner, K. Kavukcuoglu.
- The Tradeoffs of Large Scale Learning, Bottou & Bousquet.
- Introduction to Statistical Learning Theory, Bousquet, Boucheron, Lugosi.

### Lec20 Apr 7: Guest Lecture (S. Chintala, Facebook AI Research)

### Lec21 Apr 12: Accelerated Gradient Descent, Regularization, Dropout.
- Convex Optimization: Algorithms and Complexity, S. Bubeck.
- Optimization (Simons Big Data Boot Camp), B. Recht.
- The Zen of Gradient Descent, M. Hardt.
- Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, M. Hardt, B. Recht, Y. Singer.
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Hinton et al.
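
For the dropout reading, here is a minimal NumPy sketch of "inverted" dropout on a layer's activations, which rescales at training time so that the test-time pass is the identity; Srivastava et al. present the equivalent formulation that instead scales the weights by (1 - p_drop) at test time. The drop probability and activations below are illustrative.

```python
import numpy as np

def dropout(h, p_drop, rng, train=True):
    """Inverted dropout: zero each unit with probability p_drop, rescale by 1/(1 - p_drop)."""
    if not train:
        return h                                       # test time: identity
    mask = (rng.random(h.shape) >= p_drop).astype(h.dtype)
    return h * mask / (1.0 - p_drop)

# Illustrative usage on a batch of hidden activations.
rng = np.random.default_rng(0)
h = rng.standard_normal((4, 10))
h_train = dropout(h, p_drop=0.5, rng=rng)              # training pass
h_test = dropout(h, p_drop=0.5, rng=rng, train=False)  # test pass
```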

### Lec22 Apr 14: Dropout (contd). Batch Normalization. Tensor Decompositions.
- Dropout Training as Adaptive Regularization, Wager, Wang, Liang.
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe & Szegedy.
- Global Optimality in Tensor Factorization, Deep Learning and Beyond, Haeffele & Vidal.
- On the Expressive Power of Deep Learning: A Tensor Analysis, Cohen, Sharir, Shashua.
- Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks Using Tensor Methods, Janzamin, Sedghi, Anandkumar.
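
To accompany the batch normalization reading, below is a minimal NumPy sketch of the training-time transform from Ioffe & Szegedy: normalize each feature over the mini-batch, then apply a learned scale gamma and shift beta. The running averages used at test time are omitted, and the batch below is made up.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Training-time batch normalization on (batch, features) activations."""
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta            # learned scale and shift

# Illustrative usage on a mini-batch of 32 examples with 8 features.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 8)) * 3.0 + 1.0
y = batch_norm_train(x, gamma=np.ones(8), beta=np.zeros(8))
```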

### Lec23 Apr 19: Guest Lecture (Y. Dauphin, Facebook AI Research)

### Lec24-25: Oral Presentations