/RockTheBoar

Convolutional Neural Network pipeline for pixel-wise recognition of cars in an image.

RockTheBoar

Code written as part of the Kaggle Carvana Image Masking Challenge. The overall goal of this competition was to design an algorithm that accurately masks cars in images.

This network was used for our final submission. The architecture is a 12-layer, autoencoder-style fully convolutional neural network: 6 convolutional layers followed by 6 deconvolutional layers. The Dice coefficient is used as the accuracy metric, and Adaptive Moment Estimation (Adam) is used for gradient descent optimization.
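For readers unfamiliar with the metric, the following is a minimal, framework-free sketch of a Dice coefficient, not the code used in this repository. The function name `dice_coefficient`, the `smooth` parameter, and the flattened-list input format are all assumptions made for illustration; the actual TensorFlow implementation may differ.

```python
def dice_coefficient(pred, target, smooth=1.0):
    """Soft Dice coefficient between two flat sequences of values in [0, 1].

    Dice = 2 * |A intersect B| / (|A| + |B|). The `smooth` term is an
    illustrative assumption: it avoids division by zero when both masks
    are empty and is not taken from the repository.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # For binary masks this sum counts the pixels set in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + smooth) / (total + smooth)


# Toy example: a 4-pixel predicted mask and its ground-truth mask.
predicted = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
score = dice_coefficient(predicted, truth, smooth=0.0)  # 2 * 1 / (2 + 1)
```

A score of 1 means the predicted and target masks overlap perfectly; a score near 0 means they barely overlap.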

The network is trained on images of cars, each photographed from 16 different angles, with the corresponding masks as targets. The fully trained network provides a pixel-wise "probability of car" estimate.
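As an illustration of how a pixel-wise probability map might be turned into a binary car mask, here is a small sketch. The helper name `probabilities_to_mask`, the nested-list representation, and the 0.5 threshold are assumptions for this example only; the repository's own post-processing may use different values and steps.

```python
def probabilities_to_mask(prob_map, threshold=0.5):
    """Convert a 2D grid of per-pixel car probabilities into a 0/1 mask.

    The 0.5 default threshold is an illustrative assumption, not a value
    taken from the repository.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Toy 2x3 "image" of car probabilities.
prob_map = [
    [0.05, 0.40, 0.92],
    [0.10, 0.75, 0.98],
]
mask = probabilities_to_mask(prob_map)
```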

Functions of some of the key scripts:

  • convnet_architecture_12l.py: The network architecture, written in TensorFlow.
  • train_network.py: The training framework.
  • input_pipeline.py: Image loading functions.
  • infer_function.py: Loads the trained model for inference on test images.
  • filter_csv.py: Post-processing pipeline with tuned filtering and smoothing.

An initial prototype, a simple 3-layer convolutional network, is also implemented. The input/output pipelines were forked from Bruno G. do Amaral's Naive Keras model. The network architecture can be edited in simple_model.py.

Authors

Donald Lee-Brown (@dleebrown), Sinan Deger (@sinandeger), and Nesar Ramachandra (@nesar)