This course is a deep dive into the details of deep learning architectures, with a focus on learning end-to-end models for visual recognition tasks, particularly image classification. During the course, students will learn to implement, train, and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision.
- Intro to Computer Vision, historical context.
- Image classification and the data-driven approach. K-nearest neighbors. Linear classification I.
- Linear classification II. Higher-level representations, image features. Optimization, stochastic gradient descent.
- Backpropagation. Introduction to neural networks.
- Training Neural Networks Part 1: activation functions, weight initialization, gradient flow, batch normalization, babysitting the learning process, hyperparameter optimization.
- Training Neural Networks Part 2: parameter updates, ensembles, dropout. Convolutional Neural Networks: intro.
- Convolutional Neural Networks: architectures, convolution / pooling layers. Case study of ConvNets that won the ImageNet challenge.
- ConvNets for spatial localization. Object detection.
- Understanding and visualizing Convolutional Neural Networks. Backprop into the image: visualizations, DeepDream, artistic style transfer. Adversarial fooling examples.
- Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM). RNN language models. Image captioning.
- Training ConvNets in practice. Data augmentation, transfer learning. Distributed training, CPU/GPU bottlenecks. Efficient convolutions.
- Overview of Caffe/Torch/Theano/TensorFlow.
- Segmentation. Soft attention models. Spatial transformer networks.
- ConvNets for videos. Unsupervised learning.
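
To give a flavor of the early material on the data-driven approach, a k-nearest-neighbor classifier can be sketched in a few lines of plain Python. This is only an illustrative sketch, not course code: the function names (`l1_distance`, `knn_predict`), the toy four-pixel "images", the "dark"/"bright" labels, and the choice of L1 distance are all assumptions made for the example.

```python
from collections import Counter


def l1_distance(a, b):
    """Sum of absolute per-pixel differences between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training examples."""
    # Rank training indices from closest to farthest under L1 distance.
    order = sorted(range(len(train_x)),
                   key=lambda i: l1_distance(train_x[i], query))
    # Count the labels of the k closest examples and return the most common.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each "image" is four pixel intensities.
train_x = [[0, 0, 0, 0], [1, 0, 1, 0], [9, 9, 8, 9], [8, 9, 9, 9], [0, 1, 0, 0]]
train_y = ["dark", "dark", "bright", "bright", "dark"]

print(knn_predict(train_x, train_y, [8, 8, 9, 9], k=3))
```

There is no training step here beyond storing the data; all of the work happens at prediction time, which is one reason the outline moves on to linear classifiers and neural networks.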