The goal of this project is to learn how to build a Neural Network that has:
- Input: a monocular RGB Image
- Output: a Depth Map, and a Segmentation Map
A single model produces two different outputs. To do this, the model relies on a principle called Multi-Task Learning. I define the model from the paper Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations: the input RGB image passes through an encoder (MobileNetV2), then through a Light-Weight RefineNet decoder, and finally through two heads, one for each task.
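To make the "shared backbone, two heads" idea concrete, here is a minimal PyTorch sketch. It is not the model from the paper or from this repository: the encoder and decoder are tiny stand-ins for MobileNetV2 and Light-Weight RefineNet, and the class name `TinyMultiTaskNet`, the layer sizes, and `num_classes=6` are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyMultiTaskNet(nn.Module):
    """Shared encoder/decoder with one head per task (illustrative only)."""

    def __init__(self, num_classes: int = 6):
        super().__init__()
        # Stand-in for the MobileNetV2 encoder: two strided convs (1/4 resolution).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU6(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU6(inplace=True),
        )
        # Stand-in for the Light-Weight RefineNet decoder.
        self.decoder = nn.Sequential(
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU6(inplace=True),
        )
        # Two task-specific heads that read the same shared features.
        self.seg_head = nn.Conv2d(32, num_classes, 3, padding=1)
        self.depth_head = nn.Conv2d(32, 1, 3, padding=1)

    def forward(self, x):
        features = self.decoder(self.encoder(x))
        size = x.shape[-2:]  # upsample both outputs back to the input resolution
        seg = F.interpolate(self.seg_head(features), size=size,
                            mode="bilinear", align_corners=False)
        depth = F.interpolate(self.depth_head(features), size=size,
                              mode="bilinear", align_corners=False)
        return seg, depth


model = TinyMultiTaskNet(num_classes=6).eval()
with torch.no_grad():
    # One fake RGB image: batch of 1, 3 channels, 64 x 96 pixels.
    seg_logits, depth = model(torch.randn(1, 3, 64, 96))

# Per-pixel class index for the segmentation map.
seg_map = seg_logits.argmax(dim=1)
```

Both heads read the same decoder features, so a single forward pass yields the segmentation logits and the depth map together; only the last layer differs per task.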
Here is the result on video1:
Here is the result on video2:
See the training folder