/computer-vision-ND

Projects and exercises for the Udacity Computer Vision Nanodegree

Primary LanguageJupyter Notebook

Computer Vision Nanodegree

This repository contains my exercises and projects for the Computer Vision Nanodegree at Udacity.

Project 1: Facial Keypoint Detection

Facial Keypoint Detection Project

In this project, I build a facial keypoint detection system. The system consists of a face detector that uses Haar Cascades and a Convolutional Neural Network (CNN) that predict the facial keypoints in the detected faces. The facial keypoint detection system takes in any image with faces and predicts the location of 68 distinguishing keypoints on each face.

Some results from my facial keypoint detection system:


The Udacity repository for this project: P1_Facial_Keypoints

Project 2: Image Captioning

Image Captioning Project

In this project, I design and train a CNN-RNN (Convolutional Neural Network - Recurrent Neural Network) model for automatically generating image captions. The network is trained on the Microsoft Common Objects in COntext (MS COCO) dataset. The image captioning model is displayed below.

Image Captioning Model Image source

One good and one not so good sample made by my model:

sample_171
sample_193

The Udacity repository for this project: CVND---Image-Captioning-Project

Project 3: Landmark Detection

Landmark Detection Project

In this project, I implement SLAM (Simultaneous Localization and Mapping) for a 2-dimensional world. Sensor and motion data gathered by a simulated robot is used to create a map of an environment. SLAM gives us a way to track the location of a robot in the world in real-time and identify the locations of landmarks such as buildings, trees, rocks, etc.

The Udacity repository for this project: Project_Landmark Detection

Exercises

  • Image Representation & Classification - In this exercise, I learn how images are represented numerically and implement image processing techniques, such as color masking and binary classification.
  • Convolutional Filters and Edge Detection - In this exercise, I learn about frequency in images and implement my own image filters for detecting edges and shapes in an image. Use Haar cascade classifiers from the OpenCV library to perform face detection.
  • Types of Features & Image Segmentation - In this exercise, I program a corner detector and learn techniques, like k-means clustering, for segmenting an image into unique parts.
  • Feature Vectors - In this exercise, I learn how to describe objects and images using feature vectors (ORB, FAST, BRIEF, HOG).
  • CNN Layers and Feature Visualization - In this exercise, I define and train my own convolution neural network for clothing recognition. Learn to use feature visualization techniques to see what the network had learned.
  • YOLO - In this exercise, I learn about the YOLO (You Only Look Once) multi-object detection model and work with a YOLO implementation. Implement YOLO to work with my webcam.
  • LSTMs - In this exercise, I learn about Long Short-Term Memory Networks (LSTM), and similar architectures which have the benefits of preserving long-term memory. Implement a Character-Level LSTM model.
  • Attention Mechanisms - Todo.

The Udacity repository for the exercises: CVND_Exercises

Localization Exercises

  • Optical Flow - In this exercise, I learn about and implement Optical Flow.
  • Robot Localization - In this exercise, I learn how to implement a Bayesian filter to locate a robot in space and represent uncertainty in robot motion.
  • Mini-project: 2D Histogram Filter - In this exercise, I write sense and move functions for a (and debug) 2D histogram filter.
  • Introduction to Kalman Filters - In this exercise, I learn the intuition behind the Kalman Filter, a vehicle tracking algorithm, and implement a one-dimensional tracker.
  • Representing State and Motion - In this exercise, I learn to represent the state of a car in a vector that can be modified using linear algebra.
  • Matrices and Transformation of State - In this exercise, I learn about the matrix operations that underly multidimensional Kalman Filters.
  • Simultaneous Localization and Mapping (SLAM) - In this exercise, I learn how to implement SLAM: simultaneously localize an autonomous vehicle and create a map of landmarks in an environment.
  • Vehicle Motion and Calculus - In this exercise, I review some basic calculus and learn how to derive the x and y components of a self-driving car's motion from sensor measurements and other data.

The Udacity repository for the exercises: CVND_Localization_Exercises