Objective: Implement a Fully Convolutional Network (FCN) in TensorFlow to segment road pixels in images.
Approach: Adapted a pre-trained FCN to a simplified version of FCN8. Added 1x1 convolutions to the pretrained VGG16 feature encoder to reduce its output to a 2-class pixel classifier (road/no-road). Also added 2 transposed convolutions that scale the output back up to the original image size, shape = (160, 576). Connected 2 skip layers from layers 3 and 4 to improve resolution.
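The shape bookkeeping behind the skip connections can be checked with a few lines of plain Python. This is a minimal sketch, not the code in main.py: it assumes the usual VGG16 strides (layer 3 at 1/8, layer 4 at 1/16, layer 7 at 1/32 of the input) and the standard FCN8 upsampling split of 2x, 2x and 8x. The simplified network described above may split the total factor of 32 differently. The helper names `encoder_shapes` and `upsample` are hypothetical.

```python
# Assumed downsampling factor of each VGG16 output used by the decoder.
ENCODER_STRIDES = {"layer3": 8, "layer4": 16, "layer7": 32}


def encoder_shapes(height, width):
    """Spatial size (height, width) of each encoder feature map."""
    return {name: (height // s, width // s) for name, s in ENCODER_STRIDES.items()}


def upsample(shape, factor):
    """Output size of a transposed convolution with stride `factor` and 'same' padding."""
    return (shape[0] * factor, shape[1] * factor)


shapes = encoder_shapes(160, 576)

# A skip connection is an element-wise addition, so the shapes must match.
x = upsample(shapes["layer7"], 2)
assert x == shapes["layer4"]      # (10, 36)
x = upsample(x, 2)
assert x == shapes["layer3"]      # (20, 72)
x = upsample(x, 8)
print(x)                          # (160, 576), the original image size
```

Each skip connection only works if the upsampled tensor and the encoder tensor have the same spatial size, which is why the image dimensions are chosen to be divisible by 32.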
After training, all test images are processed to highlight the detected road pixels with a green overlay. Please see complete results of the fully trained and adapted FCN. The project solution must satisfy the rubric conditions.
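The overlay step can be illustrated as follows. This is a sketch in plain Python for readability, not the code in helper.py: the function name `overlay_road`, the 0.5 probability threshold and the 50% blend factor are assumptions, and a real pipeline would vectorize this with NumPy.

```python
def overlay_road(image, road_prob, threshold=0.5, alpha=0.5, color=(0, 255, 0)):
    """Blend `color` into every pixel whose road probability exceeds `threshold`.

    image:     rows of (r, g, b) tuples with values in 0..255
    road_prob: rows of floats in [0, 1], same layout as `image`
    """
    result = []
    for image_row, prob_row in zip(image, road_prob):
        out_row = []
        for pixel, prob in zip(image_row, prob_row):
            if prob > threshold:
                # Alpha-blend the overlay color with the original pixel.
                pixel = tuple(
                    int((1 - alpha) * channel + alpha * tint)
                    for channel, tint in zip(pixel, color)
                )
            out_row.append(pixel)
        result.append(out_row)
    return result


image = [[(100, 100, 100), (200, 0, 0)]]
road_prob = [[0.9, 0.1]]
print(overlay_road(image, road_prob))
# [[(50, 177, 50), (200, 0, 0)]]
```

Only the first pixel is tinted, because its road probability is above the threshold; the second pixel is returned unchanged.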
The network is trained on the KITTI Road dataset (289 images) for 50 epochs with a batch size of 5 and a learning rate of 0.0001. Other batch sizes showed significant differences in accuracy; 5 seems a good balance between preventing early over-fitting, avoiding excessive memory usage, and achieving high accuracy. Higher learning rates gave lower accuracy.
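To compare the runs, it helps to know how many weight updates each configuration performs. The sketch below does that arithmetic, assuming the last, smaller batch of each epoch is kept rather than dropped; `training_steps` is a hypothetical helper, not part of main.py.

```python
import math

NUM_IMAGES = 289  # size of the training set stated above


def training_steps(batch_size, epochs, num_images=NUM_IMAGES):
    """Return (batches per epoch, total optimizer steps)."""
    batches_per_epoch = math.ceil(num_images / batch_size)
    return batches_per_epoch, batches_per_epoch * epochs


print(training_steps(5, 50))    # (58, 2900)  full training run
print(training_steps(10, 15))   # (29, 435)
print(training_steps(15, 15))   # (20, 300)
```

With the epoch count fixed, a larger batch size means fewer weight updates, which is one possible reason the batch-size runs differ in accuracy after 15 epochs.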
Accuracy and total loss running BATCH_SIZE=5, LEARNING_RATE=0.0001 and EPOCHS=50:
Accuracy and total loss running BATCH_SIZE=(5, 10, 15), LEARNING_RATE=0.0001 and EPOCHS=15:
Accuracy and total loss running BATCH_SIZE=5, LEARNING_RATE=(0.0001, 0.0005, 0.001) and EPOCHS=15:
Good road pixel detection:
Poor road pixel detection:
To run the project, several technical preconditions need to be met.
main.py will check to make sure you are using a GPU. If you don't have a GPU on your system, you can use AWS or another cloud computing platform.
If you run on a GPU, double-check the compatibility of package versions, e.g. TensorFlow 1, CUDA 9, cuDNN 7.
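A GPU check of this kind can be sketched as below. This is an illustration rather than the exact code in main.py; the wrapper name `gpu_device_name` is hypothetical. It relies on `tf.test.gpu_device_name()`, which returns an empty string when TensorFlow cannot see a GPU.

```python
import warnings


def gpu_device_name():
    """Return the GPU device name, '' if no GPU is visible, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        warnings.warn("TensorFlow is not installed.")
        return None

    name = tf.test.gpu_device_name()
    if not name:
        warnings.warn("No GPU found. Training on a CPU will be slow.")
    return name


if __name__ == "__main__":
    print("GPU device:", gpu_device_name())
```

If the check reports no GPU on a machine that has one, a version mismatch between TensorFlow, CUDA and cuDNN is a likely cause.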
Make sure the following is installed:
git clone https://github.com/tochalid/SS_Cman.git
Download and extract the dataset in the data folder. This will create the folder data_road with all the training and test images.
python main.py
To start the TensorBoard service, run the command:
tensorboard --logdir ./tb
Open http://TB_HOSTNAME:6006 in your browser.
- The link for the frozen VGG16 model is hardcoded into helper.py. The model can be found here
- The model is not vanilla VGG16, but a fully convolutional version, which already contains the 1x1 convolutions to replace the fully connected layers. Please see this forum post for more information.
- The Udacity project repository is here
- How to use TensorBoard see here