The Computer Vision Course at IITB (CS763), Spring 2019

Course Information

  • Instructor: Arjun Jain
  • Office: 216, CSE New Building
  • Email: ajain@cse DOT iitb DOT ac DOT in
  • Teaching Assistants: Rishabh Dabral, Safeer Afaque
  • Class Room: SIC201
  • Instructor Office Hours (in room 216 CSE New Building): TBD

Please note that CS663 is a hard prerequisite for this course.

News and Announcements

  • [8/01/19] Monday class to be moved to 7pm slot to accommodate 3rd year students
  • [14/01/19] The classroom has been moved to SIC201 (slots 13A and 15A) due to overflow in CC105
  • [17/01/19] Assignment 1 has been released and is due by 27th Jan.
  • [30/01/19] Assignment 2 has been released and is due by 8th Feb.
  • [10/02/19] Assignment 3 has been released and is due by 20th Feb.
  • [13/03/19] Assignment 4 has been released and is due by 23rd March.
  • [10/04/19] Assignment 5 has been released and is due by 21st April.
  • [10/04/19] End-term project evaluation will be held on 6th May.

Topics to be covered (tentative)

  • Deep Learning in computer vision: the data-driven paradigm, feed-forward networks, back-propagation and the chain rule; CNNs and their building blocks, generative adversarial networks (GANs), Variational Autoencoders (VAEs) and Conditional Variational Autoencoders (CVAEs)
  • Deep Learning applications including face detection, CNN compression, siamese and triplet networks and applications to face recognition
  • Camera geometry, camera calibration, vanishing points, important transformations, homographies
  • Image registration: RANSAC for point-matching, SIFT overview
  • Algorithms for: shape from shading, optical flow, Kanade-Lucas-Tomasi algorithm, applications of optical flow
  • Photometric stereo - deriving shape from multiple images of an object taken under different lighting conditions; applications to illumination invariant face recognition, face relighting
  • Stereo (geometric binocular): epipolar geometry and fundamental matrix, the correspondence problem and shape from stereo; structure from motion

Learning materials and textbooks

Grading Policy

  • Mid-sem exam: 20%
  • Final exam (cumulative): 20%
  • Assignments (five or six): 35% (all to be done in groups of 2-3 students)
  • Course project: 20% (to be done in the same group of 2-3 students)
  • Class participation: 5%
  • Course project work will be presented by the student group during a viva at the end of the course. During this viva, each student in the group will be separately questioned, not only on the project work, but also the assignments. Each student is expected to contribute to each and every assignment and the course project.
  • Audit requirements: You must write both exams, submit all assignments and the project, and score at least 40% to get an AU.

Other Policies

  • Assignments will be given out (typically) once every two or three weeks. They must be submitted on or before the deadline. No late assignments will be accepted. The programming components of the assignments will typically involve MATLAB and Lua, so you must be willing to learn them quickly.
  • We will adopt a zero-tolerance policy against any form of plagiarism or any other form of cheating. Just don't do it! In cases of plagiarism, givers and takers will both be considered equally responsible.
  • This course is (inherently) cumulative. The syllabus for the final exam will include everything taught during the semester.

Course Projects

As mentioned in the grading policy, this course has a project requirement worth 20% of your grade. The project needs to be done in a group of 2-3 students. We will send out a form in which you need to fill in your project proposal. For a list of projects, please check this link.

Assignments

There will be 5-6 assignments in this course. They will be a mix of theoretical and programming questions.
  • Assignment 1 on Camera Geometry has been released and is due by 27th Jan.
  • Assignment 2 on Camera Calibration, Image Alignment and Robust Methods has been released and is due by 8th Feb.
  • Assignment 3 on Neural Network and Backpropagation has been released and is due by 20th Feb. Please use this Kaggle link to test your predictions and class standing.
  • Assignment 4 on Recurrent Neural Network has been released and is due by 23rd March. Please use this Kaggle link to test your predictions and class standing.
  • Assignment 5 on Lucas-Kanade Tracker and Video Stabilization has been released and is due by 21st April.

Lecture Schedule:

Date | Topics | Slides | iTorch Notebooks | Extra Reading
7th Jan, 2019
  • Introduction to computer vision, applications and course overview
    Slides -- --
    8th Jan, 2019 Camera Geometry
    • Homogeneous coordinates and projective geometry
    • Vanishing points, ideal line, point-line duality in P2
    • Introduction to the pin-hole camera model
    Slides -- Homogeneous Representations of Points, Lines and Planes
    14th Jan, 2019
    • Important 2D and 3D transformations using homogeneous coordinates
    • Modeling the pinhole camera analytically, intrinsic and extrinsic parameters
    • World, camera, image plane and sensor plane coordinate systems and transformations between them
    Slides -- --
    15th Jan, 2019
    • Linear and non-linear (lens distortion) errors
    • Homography, planar world and pure rotation of the camera
    • Iterative solutions for dealing with non-linear (lens distortion) errors
    • Normalized, ideal, Euclidean, affine and general camera models
    • Orthographic and weak-perspective camera models
    Slides -- --
    21st Jan, 2019
    • Cross ratios and their applications
    • Camera calibration using DLT (known 3D control points)
    • Introduction to Zhang's camera calibration method
    Slides -- Resource on SVD
    Additional slides and notes on solving the homogeneous least squares problem
    22nd Jan, 2019
    • Zhang's camera calibration method, mention of a few DL based calibration methods
    Image Alignment
    • Image alignment: problem statement, physically and digitally corresponding points
    • Motion models and degrees of freedom; non-rigid/deformable/non-parametric image alignment
    • Control point based image alignment using least squares - derivation for pseudo-inverse
    • Introduction to the SIFT algorithm
    • Forward and reverse image warping - bilinear and nearest-neighbor interpolation
    • Mention of DL based image patch descriptors
    Slides -- --
    28th Jan, 2019
    • Image alignment using image similarity measures: mean squared error, normalized cross-correlation
    • Concept of field of view in image alignment using image similarity measures
    • Monomodal and multimodal image alignment
    • Concept of joint histograms and behaviour of joint histograms in multi-modal image alignment
    • Concept of entropy and joint entropy, algorithm for multimodal registration by minimizing joint entropy
    • Aspects of image registration: 2D/3D, motion model, monomodal or multimodal
    • Application scenarios for image alignment: template matching, video stabilization, panorama generation, face recognition, 3D to 2D alignment
    Slides -- --
    29th Jan, 2019 Robust Methods in Computer Vision
    • Least squares problems and their relation to the Gaussian distribution on the noise
    • Examples of outliers in computer vision
    • Explanation of why the Gaussian distribution is unsuited to handling outliers
    • Introduction to the Laplacian distribution
    • The importance of heavy-tailed distributions in robust statistics
    • RANSAC (random sample consensus) algorithm
    Slides -- --
    4th Feb, 2019 Deep Learning for Computer Vision
    • History, introduction
    • Data driven paradigm
    • K-NN on CIFAR-10
    • Hyperparameters, choice of loss function, cross-validation
    • Softmax classifier, cross-entropy loss function, regularization
    • Optimization: vanilla gradient descent, stochastic gradient descent
    Slides KNN Matrix calculus reminder
    5th Feb, 2019
    • Vanilla momentum, Nesterov momentum, AdaGrad, RMSProp, ADAM
    • Second-order optimization methods and their issues with deep learning
    • Good learning rate, learning rate decay
    • Feed forward, back-propagation
    • Fully connected layer
    Slides Gradient Check, Linear Layer ADAM, Nesterov
    DL optimization algorithms overview
    11th Feb, 2019
    • Activation functions: sigmoid, tanh, ReLU, LeakyReLU, ELU, etc.
    • Convolutional layer, dilated convolutions.
    Slides Convolution Convolution arithmetic for deep learning
    12th Feb, 2019
    • Convolutions: transposed, dilated, fully-connected as convolution, sliding window as convolution
    • Max-pooling, Dropout
    • SoftMax, Cross Entropy
    Slides Transposed convolution, MaxPool, Cross Entropy --
    18th Feb, 2019
    • Data Augmentation, hyperparameter selection
    • Weight initialization
    • Babysitting the learning process
    Slides Weight Initialization --
    19th Feb, 2019
    • ConvNet applications
    • ConvNet case studies: AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet, SE-Net
    • Transfer Learning
    Slides -- --
    4th March, 2019
    • Object Detection: RCNN, Fast-RCNN, Faster-RCNN, YOLO, SSD
    Slides -- --
    5th March, 2019
    • Object Detection evaluation metrics: IoU, mAP
    • Object Detection details: RoIAlign, Feature Pyramid Network, Mask-RCNN, Focal Loss
    Slides -- --
    11th March, 2019
    • RNNs, LSTMs
    Slides -- --
    12th March, 2019
    • Visualizing and understanding ConvNets
    • Images that maximize ConvNet class scores, reconstructing images from ConvNet codes
    • Deep Dream, Neural Art, Adversarial Examples
    • Dimensionality reduction: siamese and triplet networks
    Slides -- --
    18th March, 2019
    • Neural Style Transfer
    • Autoencoders
    • Generative modeling: VAEs, GANs
    • Case studies: pix2pix, CycleGAN, UNIT
    Slides 1 Slides 2 -- --
    26th March, 2019 Orthographic Structure from Motion
    • Factorization Method
    • Rank Theorem
    Slides -- --
    1st April, 2019 Optical Flow
    • Dealing with the aperture problem: regularization
    • Horn and Schunck method: algorithm using discrete formulation, steps of Jacobi's method for matrix inversion, and comments about limitations
    Slides -- --
    2nd April, 2019
    • Lucas-Kanade method for Optical Flow
    • Multi-Scale Lucas-Kanade method
    • Comparison of Horn-Schunck and Lucas-Kanade algorithms
    • Applications of Optical Flow
    Slides -- --
    8th April, 2019 Kanade-Lucas-Tomasi (KLT) Feature Point Tracker
    • Tracking feature points from a template by estimating motion parameters.
    • Finding good features to track.
    Slides -- Lucas-Kanade 20 Years On: A Unifying Framework
    9th April, 2019 Geometric Stereo
    • Orientation parameters for the camera pair and relative orientation.
    • Coplanarity constraint for corresponding points
    • Derivation and key properties of the Fundamental matrix
    • 8-Point Algorithm
    Slides -- --
    15th April, 2019
    • Introduction to epipolar geometry
    • Essential matrix
    Slides -- Epipolar Geometry
    16th April, 2019
    • Generating the normalized stereo case from arbitrary views
    • Triangulation
    • Popular parameterizations for the relative orientation
    Slides -- --