Ge's note: A super nice set of vision projects used in course 504.
EECS 504 - Winter 2020
Pet Edge Detection
- Application of horizontal and vertical edge gradients and computation of total edge strength.
- Comparison of edge maps computed without blurring, with a Gaussian filter, and with a box filter.
- Computation of oriented edges along a given direction using the horizontal and vertical gradients (see the sketch after this list).
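A minimal sketch of these computations in NumPy/SciPy; the Sobel filters and the `edge_maps` helper are illustrative choices, not necessarily the assignment's:

```python
import numpy as np
from scipy import ndimage

def edge_maps(image, theta=0.0, sigma=None):
    """Compute gradients, total edge strength, and an oriented edge map."""
    img = image.astype(np.float64)
    if sigma is not None:
        img = ndimage.gaussian_filter(img, sigma)       # optional pre-blur
    gx = ndimage.sobel(img, axis=1)                     # horizontal gradient
    gy = ndimage.sobel(img, axis=0)                     # vertical gradient
    strength = np.hypot(gx, gy)                         # total edge strength
    oriented = np.cos(theta) * gx + np.sin(theta) * gy  # edges along direction theta
    return gx, gy, strength, oriented
```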
Image Blending
- Comparison of the output of a Gaussian filter computed through i) direct convolution in the spatial domain and ii) multiplication in the frequency domain (a sketch of this equivalence follows the list).
- Constructing a Laplacian pyramid with 4 levels and using it to reconstruct the original image.
- Blending two images: given two input images and a binary mask, produce Laplacian pyramids with `num_levels` levels and blend the two images (see the blending sketch below).
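A sketch of the spatial- vs. frequency-domain comparison via the convolution theorem; circular boundary handling is chosen here so the two results match exactly, which may differ from the assignment's padding:

```python
import numpy as np
from scipy import signal

def gaussian_kernel(size=15, sigma=2.0):
    ax = np.arange(size) - size // 2
    g = np.exp(-ax**2 / (2 * sigma**2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def filter_both_ways(image, kernel):
    # i) direct convolution in the spatial domain
    spatial = signal.convolve2d(image, kernel, mode="same", boundary="wrap")
    # ii) pointwise multiplication in the frequency domain:
    # pad the kernel to the image size with its center moved to the origin,
    # then apply the convolution theorem
    kpad = np.zeros_like(image, dtype=np.float64)
    kh, kw = kernel.shape
    kpad[:kh, :kw] = kernel
    kpad = np.roll(kpad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    freq = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kpad)))
    return spatial, freq  # these agree up to numerical error
```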
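And a compact sketch of the pyramid-blending step with OpenCV, assuming single-channel float inputs; the helper names and the soft mask pyramid are my own simplifications:

```python
import cv2
import numpy as np

def laplacian_pyramid(img, num_levels):
    """Gaussian pyramid -> Laplacian pyramid; the last entry is the coarsest Gaussian level."""
    gp = [img.astype(np.float32)]
    for _ in range(num_levels - 1):
        gp.append(cv2.pyrDown(gp[-1]))
    lp = []
    for i in range(num_levels - 1):
        up = cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
        lp.append(gp[i] - up)
    lp.append(gp[-1])
    return lp

def blend(img1, img2, mask, num_levels=4):
    lp1 = laplacian_pyramid(img1, num_levels)
    lp2 = laplacian_pyramid(img2, num_levels)
    # Gaussian pyramid of the (float) mask gives soft seams at each scale
    gm = [mask.astype(np.float32)]
    for _ in range(num_levels - 1):
        gm.append(cv2.pyrDown(gm[-1]))
    blended = [m * a + (1 - m) * b for a, b, m in zip(lp1, lp2, gm)]
    # Collapse the blended pyramid to reconstruct the output image
    out = blended[-1]
    for level in reversed(blended[:-1]):
        out = cv2.pyrUp(out, dstsize=(level.shape[1], level.shape[0])) + level
    return out
```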
Motion Magnification
- Motion magnification in videos.
Texture Synthesis
- Method for generating new textures from an initial sample texture.
Multi-layer perceptron
- Train a two-layer neural network to classify images from CIFAR-10. The network has two fully connected layers, with a ReLU nonlinearity after the first, followed by a softmax layer to perform classification. Train the network to minimize a cross-entropy loss function (also known as softmax loss). A sketch follows this list.
- Tuning model hyperparameters (hidden dimension, learning rate, learning-rate decay, batch size) to reach an accuracy above 45% on the test data.
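A minimal PyTorch sketch of this architecture; the hidden dimension, learning rate, and the choice of PyTorch itself are assumptions (the course may implement it from scratch in NumPy):

```python
import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    """fc -> ReLU -> fc; CrossEntropyLoss applies log-softmax internally."""
    def __init__(self, input_dim=3 * 32 * 32, hidden_dim=256, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        x = x.flatten(start_dim=1)  # (N, 3, 32, 32) -> (N, 3072)
        return self.fc2(torch.relu(self.fc1(x)))

model = TwoLayerNet()
criterion = nn.CrossEntropyLoss()  # softmax + cross-entropy loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
```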
- Train a CNN to solve the scene recognition problem, i.e., the problem of determining which scene category a picture depicts.
- Train two neural networks, MiniVGG and MiniVGG-BN. MiniVGG is a smaller, simplified version of the VGG architecture, while MiniVGG-BN is identical to MiniVGG except that a batch normalization layer follows each convolution layer (see the sketch below). Dataset used: MiniPlaces
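A sketch of the kind of convolution block that distinguishes the two models; the exact MiniVGG channel configuration is not reproduced here:

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, batch_norm=False):
    """3x3 conv block; MiniVGG-BN inserts BatchNorm after each convolution."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)]
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)
```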
Implement a single-stage object detector, based on YOLO v1 and v2. Unlike the (better-performing) R-CNN models, single-stage detectors predict bounding boxes and classes without explicitly cropping region proposals out of the image or feature map, which makes them significantly faster to run. Dataset used: PASCAL VOC
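A sketch of the kind of per-cell prediction head a single-stage detector uses; the channel counts and grid layout here are illustrative, not the assignment's exact architecture:

```python
import torch.nn as nn

class DetectionHead(nn.Module):
    """Maps a backbone feature map to per-cell box/confidence/class predictions.

    For an S x S grid with B boxes per cell and C classes, each cell predicts
    B * 5 box values (x, y, w, h, confidence) plus C class scores.
    """
    def __init__(self, in_channels=512, num_boxes=2, num_classes=20):
        super().__init__()
        out_channels = num_boxes * 5 + num_classes
        self.pred = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, features):   # (N, in_channels, S, S)
        return self.pred(features)  # (N, B*5 + C, S, S)
```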
Implement two representation learning methods: an autoencoder and a recent contrastive learning method. Test the features learned by these models on a "downstream" recognition task, using the STL-10 dataset.
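As an illustration of the contrastive side, a sketch of a simplified, one-directional InfoNCE-style loss over two augmented views; the assignment's specific method and temperature are assumptions here:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """Contrastive loss where z1[i] and z2[i] embed two views of the same image.

    z1, z2: (N, D) embeddings; diagonal entries of the similarity matrix
    are the positive pairs, all other entries serve as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature  # (N, N) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```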
Given two input images, construct an image panorama using keypoint detection, local invariant descriptors, RANSAC, and perspective warping. The panoramic stitching algorithm consists of four main steps (a sketch of the full pipeline follows the list):
- Detect keypoints and extract local invariant descriptors (using ORB) from two input images.
- Match the descriptors between the two images.
- Apply RANSAC to estimate a homography matrix between the extracted features.
- Apply a perspective transformation using the homography matrix to merge the images into a panorama.
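A sketch of the four steps in OpenCV; the output canvas size and the naive overlay compositing are simplifications of my own:

```python
import cv2
import numpy as np

def stitch(img1, img2, max_features=2000):
    """Stitch img2 onto img1's plane; assumes enough overlapping keypoints."""
    # 1. Detect keypoints and extract ORB descriptors
    orb = cv2.ORB_create(max_features)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # 2. Match the binary ORB descriptors with Hamming distance
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

    src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)

    # 3. RANSAC rejects outlier matches while estimating the homography
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # 4. Warp img2 into img1's frame and overlay img1 on the left
    h1, w1 = img1.shape[:2]
    pano = cv2.warpPerspective(img2, H, (w1 + img2.shape[1], h1))
    pano[:h1, :w1] = img1
    return pano
```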
Implement the Lucas-Kanade (LK) optical flow algorithm for estimating dense motion between a pair of images.
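A minimal dense LK sketch in NumPy/SciPy, solving the 2x2 normal equations at every pixel over a local window; the window size and Sobel gradients are illustrative choices:

```python
import numpy as np
from scipy import ndimage

def lucas_kanade(img1, img2, window=15):
    """Estimate per-pixel flow (u, v) from img1 to img2.

    Solves [sum(IxIx) sum(IxIy); sum(IxIy) sum(IyIy)] [u; v] = -[sum(IxIt); sum(IyIt)],
    where the sums run over a local window around each pixel.
    """
    I1 = img1.astype(np.float64)
    I2 = img2.astype(np.float64)
    Ix = ndimage.sobel(I1, axis=1, mode="nearest")  # spatial gradients
    Iy = ndimage.sobel(I1, axis=0, mode="nearest")
    It = I2 - I1                                    # temporal gradient

    # Windowed sums via a uniform (box) filter
    s = lambda a: ndimage.uniform_filter(a, size=window, mode="nearest")
    Ixx, Iyy, Ixy = s(Ix * Ix), s(Iy * Iy), s(Ix * Iy)
    Ixt, Iyt = s(Ix * It), s(Iy * It)

    # Closed-form 2x2 solve; near-singular systems get zero flow
    det = Ixx * Iyy - Ixy**2
    det[np.abs(det) < 1e-6] = np.inf
    u = (-Iyy * Ixt + Ixy * Iyt) / det
    v = (Ixy * Ixt - Ixx * Iyt) / det
    return u, v
```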