/cs231n-study-schedule

cs231n learning notes


CS231n: Convolutional Neural Networks for Visual Recognition (Spring 2017)


Website: Convolutional Neural Networks for Visual Recognition (Spring 2017)

Video: CS231n Spring 2017

Course Syllabus

Lecture 1: Course Introduction [done!!!]

slides [done!!!]

  • Computer vision overview
  • Historical context
  • Course logistics

video [done!!!]

Lecture 2: Image Classification [done!!!]

slides [done!!!]

  • The data-driven approach
  • K-nearest neighbor
  • Linear classification I
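A minimal sketch of the k-nearest neighbor idea listed above, written in plain Python so it has no dependencies. The function names and the toy 2-D data are made up for illustration; the course itself works with image pixels and NumPy.

```python
from collections import Counter

def l2_distance_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(train_x, train_y, query, k=1):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training indices by distance to the query and keep the k closest.
    nearest = sorted(range(len(train_x)),
                     key=lambda i: l2_distance_sq(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two small clusters in 2-D.
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = ["cat", "cat", "dog", "dog", "dog"]

pred_near_origin = knn_predict(train_x, train_y, (0.2, 0.1), k=1)
pred_near_ones = knn_predict(train_x, train_y, (1.0, 0.9), k=3)
```

Note that "training" here only stores the data; all the work happens at prediction time, which is the main practical drawback of nearest neighbor methods.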

video [done!!!]

  • Intro to Image Classification, data-driven approach, pipeline

  • Nearest Neighbor Classifier

    • k-Nearest Neighbor
  • Validation sets, Cross-validation, hyperparameter tuning

  • Pros/Cons of Nearest Neighbor

    • accelerate the nearest neighbor lookup in a dataset (e.g. FLANN)
    • a visualization technique called t-SNE
  • Summary

  • Summary: Applying kNN in practice

  • Further Reading

    Here are some (optional) links you may find interesting for further reading:

  • Intro to Linear classification

  • Linear score function

  • Interpreting a linear classifier

  • Loss function

    • Multiclass SVM
      • For example, it turns out that including the L2 penalty leads to the appealing max margin property in SVMs (See CS229 lecture notes for full details if you are interested).
    • Softmax classifier
    • SVM vs Softmax
  • Interactive Web Demo of Linear Classification

  • Summary

  • Further Reading

    These readings are optional and contain pointers of interest.
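As a rough sketch of the two loss functions in the outline above, the snippet below computes the multiclass SVM (hinge) loss and the softmax cross-entropy loss for a single example, plus an L2 penalty. The scores are illustrative numbers, and `delta` and `lam` are the usual margin and regularization-strength hyperparameters, named here by assumption.

```python
import math

def svm_loss(scores, correct, delta=1.0):
    """Multiclass SVM (hinge) loss for one example."""
    return sum(max(0.0, s - scores[correct] + delta)
               for j, s in enumerate(scores) if j != correct)

def softmax_loss(scores, correct):
    """Cross-entropy loss of the softmax over class scores (natural log)."""
    shift = max(scores)  # subtract the max for numerical stability
    log_sum = math.log(sum(math.exp(s - shift) for s in scores))
    return log_sum - (scores[correct] - shift)

def l2_penalty(W, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for row in W for w in row)

scores = [3.2, 5.1, -1.7]          # illustrative scores for 3 classes
hinge = svm_loss(scores, correct=0)
xent = softmax_loss(scores, correct=0)
```

The SVM loss is satisfied once the correct score beats the others by the margin, while the softmax loss keeps pushing the correct-class probability toward 1.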

Lecture 3: Loss Functions and Optimization [done!!!]

slides [done!!!]

  • Linear classification II
  • Higher-level representations, image features
  • Optimization, stochastic gradient descent

video [done!!!]

Same as Lecture 2: linear classification notes

  • Introduction

  • Visualizing the loss function

  • Optimization

    • Strategy #1: Random Search
    • Strategy #2: Random Local Search
    • Strategy #3: Following the gradient
  • Computing the gradient

    • Numerically with finite differences
    • Analytically with calculus
  • Gradient descent

  • Summary
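The optimization outline above (finite differences, analytic gradients, gradient descent) can be sketched on a toy problem. The quadratic loss below is a stand-in chosen for illustration, not the course's classifier loss, and the step size and step count are arbitrary assumptions.

```python
def loss(w):
    """Toy quadratic loss with its minimum at w = (3, -1)."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2

def analytic_grad(w):
    """Gradient of the toy loss derived with calculus."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]

def numerical_grad(f, w, h=1e-5):
    """Centered finite differences: (f(w + h) - f(w - h)) / (2h) per coordinate."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_plus[i] += h
        w_minus = list(w)
        w_minus[i] -= h
        grad.append((f(w_plus) - f(w_minus)) / (2.0 * h))
    return grad

def gradient_descent(grad_fn, w, step_size=0.1, num_steps=100):
    """Vanilla gradient descent: repeatedly step against the gradient."""
    for _ in range(num_steps):
        g = grad_fn(w)
        w = [wi - step_size * gi for wi, gi in zip(w, g)]
    return w

w0 = [0.0, 0.0]
# Gradient check: the slow numerical gradient validates the fast analytic one.
grad_check_error = max(abs(a - n) for a, n in
                       zip(analytic_grad(w0), numerical_grad(loss, w0)))
w_final = gradient_descent(analytic_grad, w0)
```

In practice the analytic gradient is used for training and the numerical one only for checking it.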

Lecture 4: Introduction to Neural Networks [done!!!]

slides [done!!!]

  • Backpropagation
  • Multi-layer Perceptrons
  • The neural viewpoint
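A small sketch of backpropagation on a single sigmoid neuron with two inputs. The forward pass is staged into intermediate values so that the backward pass can apply the chain rule one step at a time. The weights and inputs are example values picked for illustration.

```python
import math

def sigmoid_neuron_forward_backward(w, x):
    """Return the output f and gradients df/dw, df/dx for a 2-input neuron."""
    # Forward pass, one intermediate value at a time.
    dot = w[0] * x[0] + w[1] * x[1] + w[2]   # w[2] is the bias
    f = 1.0 / (1.0 + math.exp(-dot))
    # Backward pass: the local gradient of the sigmoid is (1 - f) * f.
    ddot = (1.0 - f) * f
    dx = [w[0] * ddot, w[1] * ddot]
    dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]
    return f, dw, dx

f, dw, dx = sigmoid_neuron_forward_backward(w=[2.0, -3.0, -3.0],
                                            x=[-1.0, -2.0])
```

Each gradient is the upstream gradient `ddot` times a local gradient, which is the pattern that generalizes to whole networks.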

video [done!!!]

backprop notes [done!!!]

derivatives notes (optional) [done!!!]

Efficient BackProp (optional) [done!!!]

Related (optional) [done!!!]

Lecture 5: Convolutional Neural Networks [done!!!]

slides [done!!!]

  • History

  • Convolution and pooling

  • ConvNets outside vision
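To make the convolution and pooling topic concrete, here is a naive single-channel sketch in plain Python. It assumes square filters and no padding inside the convolution itself; real ConvNet layers also handle multiple channels, multiple filters, and biases. All names are hypothetical.

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Spatial output size (n - f + 2*pad) / stride + 1; must divide evenly."""
    span = n - f + 2 * pad
    if span % stride != 0:
        raise ValueError("filter does not tile the input with this stride")
    return span // stride + 1

def conv2d_single(image, kernel, stride=1):
    """Valid cross-correlation of one 2-D channel with one square 2-D filter."""
    f = len(kernel)
    out_h = conv_output_size(len(image), f, stride)
    out_w = conv_output_size(len(image[0]), f, stride)
    return [[sum(image[r * stride + i][c * stride + j] * kernel[i][j]
                 for i in range(f) for j in range(f))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool2d(image, size=2, stride=2):
    """Max pooling over size x size windows."""
    out_h = conv_output_size(len(image), size, stride)
    out_w = conv_output_size(len(image[0]), size, stride)
    return [[max(image[r * stride + i][c * stride + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]

image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [1, 0, 1, 3]]
diag_filter = [[1, 0], [0, -1]]
feature_map = conv2d_single(image, diag_filter)   # 3x3 output
pooled = max_pool2d(image)                        # 2x2 output
```

For example, `conv_output_size(32, 5)` gives 28, and adding `pad=2` keeps the size at 32.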

video [done!!!]

ConvNet notes [done!!!]

Lecture 6: Training Neural Networks, part I [done!!!]

slides [done!!!]

  • Activation functions, initialization, dropout, batch normalization
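Of the topics above, dropout is the easiest to sketch in a few lines. The version below is inverted dropout, where activations are rescaled at training time so that the test-time forward pass needs no change. The parameter name `p_keep` (the probability of keeping a unit) and the `rng` argument are assumptions made for this example.

```python
import random

def dropout_forward(activations, p_keep=0.5, train=True, rng=random):
    """Inverted dropout: scale at train time so test time is a no-op."""
    if not train:
        return list(activations), None
    # Each unit is kept with probability p_keep and scaled by 1 / p_keep.
    mask = [(1.0 / p_keep) if rng.random() < p_keep else 0.0
            for _ in activations]
    return [a * m for a, m in zip(activations, mask)], mask

def dropout_backward(dout, mask):
    """Gradients flow only through the units that were kept."""
    return [d * m for d, m in zip(dout, mask)]

rng = random.Random(0)              # fixed seed for repeatability
acts = [1.0] * 10000
dropped, mask = dropout_forward(acts, p_keep=0.5, rng=rng)
mean_train = sum(dropped) / len(dropped)   # stays close to the input mean
test_out, _ = dropout_forward(acts, train=False)
```

Because of the rescaling, the expected activation is the same in both modes, which is why nothing needs to change at test time.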

video [done!!!]

Neural Nets notes 3 [done!!!]

tips/tricks(optional) [done!!!]

Lecture 7: Training Neural Networks, part II [done!!!]

slides [done!!!]

video [done!!!]

Neural Nets notes 3 (same as Lecture 6) [done!!!]

Lecture 8: Deep Learning Software [done!!!]

slides [done!!!]


video [done!!!]

Lecture 9: CNN Architectures [done!!! papers still to read]

slides [done!!!]

Architectures Cases

Comparison

Other architectures


video [done!!!]

Lecture 10: Recurrent Neural Networks [done!!! papers still to read]

slides [done!!!]


video [done!!!]

Related materials

Assignment #2 [done!!!]

Q3: Dropout [done!!!]

Lecture 11: Detection and Segmentation [done!!! papers still to read]

slides [done!!!]

Semantic Segmentation Idea: Sliding Window

  • Farabet et al, “Learning Hierarchical Features for Scene Labeling,” TPAMI 2013
  • Pinheiro and Collobert, “Recurrent Convolutional Neural Networks for Scene Labeling”, ICML 2014

Problem: Very inefficient! Shared features between overlapping patches are not reused.

Semantic Segmentation Idea: Fully Convolutional

Design the network as a stack of convolutional layers, with downsampling and upsampling inside the network!


  • Long, Shelhamer, and Darrell, “Fully Convolutional Networks for Semantic Segmentation”, CVPR 2015
  • Noh et al, “Learning Deconvolution Network for Semantic Segmentation”, ICCV 2015

Classification + Localization : Multitask Loss


  • Toshev and Szegedy, “DeepPose: Human Pose Estimation via Deep Neural Networks”, CVPR 2014

Treat localization as a regression problem!
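A rough sketch of what a multitask loss for classification plus localization could look like: a softmax loss on the class scores added to an L2 regression loss on the box coordinates. The `(x, y, w, h)` box layout and the `box_weight` balancing hyperparameter are assumptions for illustration.

```python
import math

def softmax_loss(scores, label):
    """Cross-entropy of the softmax over class scores."""
    shift = max(scores)  # subtract the max for numerical stability
    log_sum = math.log(sum(math.exp(s - shift) for s in scores))
    return log_sum - (scores[label] - shift)

def l2_box_loss(pred_box, true_box):
    """Squared L2 distance between predicted and true (x, y, w, h)."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, true_box))

def multitask_loss(scores, label, pred_box, true_box, box_weight=1.0):
    # box_weight is a hypothetical hyperparameter balancing the two terms.
    return (softmax_loss(scores, label)
            + box_weight * l2_box_loss(pred_box, true_box))

class_scores = [2.0, 0.5, -1.0]
pred_box = [0.5, 0.5, 0.30, 0.40]
true_box = [0.5, 0.5, 0.25, 0.40]
total = multitask_loss(class_scores, 0, pred_box, true_box, box_weight=2.0)
```

Both terms are differentiable, so one backward pass trains the classification head and the box regression head together.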

Object Detection as Classification: Sliding Window


Problem: Need to apply CNN to huge number of locations and scales, very computationally expensive!

R-CNN: Region Proposals


  • Girshick et al, “Rich feature hierarchies for accurate object detection and semantic segmentation”, CVPR 2014.


Fast R-CNN


  • Girshick, “Fast R-CNN”, ICCV 2015.

Faster R-CNN


  • Ren et al, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, NIPS 2015

Detection without Proposals: YOLO / SSD


  • Redmon et al, “You Only Look Once: Unified, Real-Time Object Detection”, CVPR 2016
  • Liu et al, “SSD: Single-Shot MultiBox Detector”, ECCV 2016

Object Detection: Lots of variables ...


  • Huang et al, “Speed/accuracy trade-offs for modern convolutional object detectors”, CVPR 2017

Aside: Object Detection + Captioning = Dense Captioning

Mask R-CNN


  • He et al, “Mask R-CNN”, arXiv 2017

Video [done!!!]

Lecture 12: Visualizing and Understanding [done!!! papers still to read]

slides [done!!!]

DeepDream

neural-style

fast-neural-style


  • First Layer: Visualize Filters

Krizhevsky, “One weird trick for parallelizing convolutional neural networks”, arXiv 2014

He et al, “Deep Residual Learning for Image Recognition”, CVPR 2016

Huang et al, “Densely Connected Convolutional Networks”, CVPR 2017

  • Last Layer: Nearest Neighbors, Dimensionality Reduction

Krizhevsky et al, “ImageNet Classification with Deep Convolutional Neural Networks”, NIPS 2012.

Van der Maaten and Hinton, “Visualizing Data using t-SNE”, JMLR 2008

  • Visualizing Activations

Yosinski et al, “Understanding Neural Networks Through Deep Visualization”, ICML DL Workshop 2014.

  • Occlusion Experiments

Zeiler and Fergus, “Visualizing and Understanding Convolutional Networks”, ECCV 2014

  • Saliency Maps

Simonyan, Vedaldi, and Zisserman, “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR Workshop 2014.

  • Visualizing CNN features: Gradient Ascent

Simonyan, Vedaldi, and Zisserman, “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR Workshop 2014.

Yosinski et al, “Understanding Neural Networks Through Deep Visualization”, ICML DL Workshop 2014.

Nguyen et al, “Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks”, ICML Visualization for Deep Learning Workshop 2016.

  • Fooling Images / Adversarial Examples
    • (1) Start from an arbitrary image
    • (2) Pick an arbitrary class
    • (3) Modify the image to maximize the class score
    • (4) Repeat until the network is fooled
  • DeepDream: Amplify existing features

Mordvintsev, Olah, and Tyka, “Inceptionism: Going Deeper into Neural Networks”, Google Research Blog.
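The four fooling-image steps listed above can be sketched with gradient ascent on the input. To stay short and dependency-free, a tiny linear classifier stands in for the network here, which is an assumption of this example; with a real ConvNet the gradient with respect to the image would come from backpropagation.

```python
def class_scores(W, b, x):
    """Linear classifier scores: one dot product per class."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def predict(W, b, x):
    s = class_scores(W, b, x)
    return max(range(len(s)), key=lambda k: s[k])

def fool(W, b, x, target, step=0.1, max_steps=1000):
    """Nudge x until the classifier predicts `target`."""
    x = list(x)                       # (1) start from an arbitrary image
    for n in range(max_steps):        # (4) repeat until the model is fooled
        if predict(W, b, x) == target:
            return x, n
        # (3) gradient ascent on the target score; for a linear model the
        # gradient of score[target] with respect to x is just W[target].
        x = [xi + step * gi for xi, gi in zip(x, W[target])]
    return x, max_steps

# Toy 3-class model on 2-D "images" (hypothetical numbers).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x0 = [2.0, 0.5]                       # originally classified as class 0
fooled_x, steps = fool(W, b, x0, target=1)   # (2) pick an arbitrary class
```

Raising only the target score is not guaranteed to succeed for every model, which is why the loop has a `max_steps` cap.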

  • Feature Inversion

Mahendran and Vedaldi, “Understanding Deep Image Representations by Inverting Them”, CVPR 2015

Johnson, Alahi, and Fei-Fei, “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, ECCV 2016.

  • Neural Texture Synthesis

Gatys, Ecker, and Bethge, “Texture Synthesis Using Convolutional Neural Networks”, NIPS 2015

Johnson, Alahi, and Fei-Fei, “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, ECCV 2016.

  • Neural Style Transfer

Johnson, Alahi, and Fei-Fei, “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, ECCV 2016.

Gatys, Ecker, and Bethge, “Texture Synthesis Using Convolutional Neural Networks”, NIPS 2015

Gatys, Ecker, and Bethge, “Image style transfer using convolutional neural networks”, CVPR 2016

Figure adapted from Johnson, Alahi, and Fei-Fei, “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, ECCV 2016.

Ulyanov et al, “Texture Networks: Feed-forward Synthesis of Textures and Stylized Images”, ICML 2016

Dumoulin, Shlens, and Kudlur, “A Learned Representation for Artistic Style”, ICLR 2017

video [done!!!]

Lecture 13: Generative Models [done!!! papers still to read]

slides [done!!!]

Overview

  • Unsupervised Learning

  • Generative Models

    ○ PixelRNN and PixelCNN

    ○ Variational Autoencoders (VAE)

    ○ Generative Adversarial Networks (GAN)

Supervised vs Unsupervised Learning


PixelRNN and PixelCNN


  • PixelRNN

Pixel Recurrent Neural Networks


  • PixelCNN

Conditional Image Generation with PixelCNN Decoders


Variational Autoencoders (VAE)


Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014


Generative Adversarial Networks

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014


  • Generative Adversarial Nets: Convolutional Architectures

Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016


Recap


video [done!!!]

Lecture 14: Deep Reinforcement Learning [done!!!]

slides [done!!!]

video [done!!!]

Guest Lecture, Song Han: Efficient Methods and Hardware for Deep Learning [done!!!]

slides [done!!!]

video [done!!!]

Assignment #3

Guest Lecture, Ian Goodfellow: Adversarial Examples and Adversarial Training [done!!!]

slides [done!!!]

video [done!!!]

Final course project