The aim of this project is to train a network to classify each pixel of an image into one of two classes: road or not road.
An encoder-decoder model is used for this classification.
We use transfer learning to reuse the weights of the existing VGG-16 model.
We convert the classic VGG-16 into a fully convolutional network and transfer what it has learned to our task. We then combine the information from the deep layers with that of the shallow layers, which carry the appearance information, to get a proper segmentation, using the skip layer architecture from this paper. The VGG-16 model acts as the encoder, and a decoder is built by upsampling the encoder's layers and fusing them through the skip layer architecture to produce proper results.
The complete model is shown below.
This is similar to the one used in this paper.
Training is done on the KITTI Road dataset.
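To make the skip layer idea concrete, here is a minimal sketch that only traces spatial shapes through the decoder. It assumes an FCN-8 style layout (VGG-16 features taken at 1/8, 1/16 and 1/32 of the input resolution) and an input of 160x576 pixels; both are assumptions for illustration, and the function names are hypothetical, not taken from this project's code.

```python
def downsample(size, factor):
    """Spatial size of an encoder feature map at 1/factor resolution."""
    h, w = size
    if h % factor or w % factor:
        raise ValueError("input size must be divisible by %d" % factor)
    return (h // factor, w // factor)


def upsample(size, stride):
    """Spatial size after a transposed convolution with the given stride
    and 'same' padding, which multiplies height and width by the stride."""
    h, w = size
    return (h * stride, w * stride)


def trace_skip_decoder(input_size):
    """Trace feature-map sizes through an assumed FCN-8 style decoder."""
    # Encoder (VGG-16): shallow layers keep more spatial detail.
    pool3 = downsample(input_size, 8)
    pool4 = downsample(input_size, 16)
    layer7 = downsample(input_size, 32)

    # Decoder: upsample the deepest layer, then fuse with a shallower one.
    up_to_pool4 = upsample(layer7, 2)
    assert up_to_pool4 == pool4  # shapes must match for element-wise add
    up_to_pool3 = upsample(up_to_pool4, 2)
    assert up_to_pool3 == pool3  # second skip connection
    output = upsample(up_to_pool3, 8)  # back to input resolution

    return {"pool3": pool3, "pool4": pool4, "layer7": layer7, "output": output}


# Assumed input size, chosen only because it is divisible by 32.
shapes = trace_skip_decoder((160, 576))
print(shapes)
```

The two assertions are the point of the sketch: a skip connection is an element-wise addition, so each upsampling step has to land exactly on the resolution of the shallower encoder layer it is fused with.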
I trained the network on FloydHub. I ran it for 20 epochs with a batch size of 10. It took around 29 minutes to train and generate the test run images. I reached a validation loss of around 0.03.
As seen, this gives reasonable output, though it may not be the best yet. Some additional epochs could be run, and the IoU (intersection over union) should then be calculated to determine how well the network performs.
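The IoU evaluation has not been done in this project yet. As a rough sketch of what it could look like for the two classes, the snippet below computes per-class IoU on flattened label lists. The function names and the toy labels are hypothetical, with 1 standing for road and 0 for not road.

```python
def class_iou(pred, target, cls):
    """IoU for one class: |pred AND target| / |pred OR target|.

    Returns None when the class appears in neither list, since the
    ratio is undefined in that case.
    """
    intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return intersection / union if union else None


def mean_iou(pred, target, classes=(0, 1)):
    """Average the per-class IoU over the classes that are present."""
    scores = [class_iou(pred, target, c) for c in classes]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


# Toy example: six pixels, flattened (1 = road, 0 = not road).
predicted = [1, 1, 0, 0, 1, 0]
ground_truth = [1, 0, 0, 0, 1, 1]

road_iou = class_iou(predicted, ground_truth, cls=1)
print("road IoU:", road_iou)
print("mean IoU:", mean_iou(predicted, ground_truth))
```

In practice the predicted labels would come from taking the most likely class per pixel in the network output, and the score would be accumulated over the whole validation set rather than computed on a single image.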