TensorFlow implementation of "Striving for Simplicity: The All Convolutional Net" on MNIST

TensorFlow implementation of all convolutional neural networks [1] on MNIST

Introduction

Springenberg et al. [1] proposed all convolutional neural networks and showed that their All-CNN-C model achieved state-of-the-art classification error on CIFAR-10.

However, I was not sure whether the All-CNN-C model would also work well on MNIST data.

This repository contains a TensorFlow implementation of all convolutional nets on MNIST. The code is heavily based on the TensorFlow tutorial code for MNIST.

Model

I tried two all convolutional nets:

  1. All-CNN-C model (ALL_CNN_C.py): the same network structure as in [1].

  2. Smaller model (smaller_all_conv_net.py): has fewer convolutional layers than All-CNN-C.
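To make the All-CNN-C structure concrete, the sketch below traces the feature-map size and parameter count of each layer for a 28x28 grayscale MNIST input. The layer list follows the All-CNN-C description in [1]; the padding choices, the single input channel, and the names `ALL_CNN_C`, `conv_output_size`, and `trace_shapes` are assumptions made for illustration and are not taken from ALL_CNN_C.py.

```python
import math

# Layer spec: (name, kernel_size, stride, padding, out_channels).
# Layout follows the All-CNN-C description in [1]; the padding choices are
# assumptions and may differ from what ALL_CNN_C.py actually uses.
ALL_CNN_C = [
    ("conv1", 3, 1, "SAME", 96),
    ("conv2", 3, 1, "SAME", 96),
    ("conv3", 3, 2, "SAME", 96),    # strided conv in place of pooling
    ("conv4", 3, 1, "SAME", 192),
    ("conv5", 3, 1, "SAME", 192),
    ("conv6", 3, 2, "SAME", 192),   # strided conv in place of pooling
    ("conv7", 3, 1, "VALID", 192),
    ("conv8", 1, 1, "VALID", 192),
    ("conv9", 1, 1, "VALID", 10),   # one feature map per class
]


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution under TensorFlow's padding rules."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    raise ValueError(f"unknown padding: {padding}")


def trace_shapes(layers, size=28, channels=1):
    """Return (name, height/width, channels, n_params) for each layer."""
    rows = []
    for name, kernel, stride, padding, out_channels in layers:
        size = conv_output_size(size, kernel, stride, padding)
        # Weights (kernel * kernel * in * out) plus one bias per output map.
        n_params = kernel * kernel * channels * out_channels + out_channels
        rows.append((name, size, out_channels, n_params))
        channels = out_channels
    return rows


if __name__ == "__main__":
    for name, size, channels, n_params in trace_shapes(ALL_CNN_C):
        print(f"{name}: {size}x{size}x{channels}, {n_params} parameters")
    # Global average pooling then reduces the final maps to 10 class scores.
```

Under these assumptions the spatial size shrinks from 28 to 14 to 7 through the two strided convolutions and ends at 5x5x10 before global average pooling. A smaller variant can be explored by deleting entries from the list.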

Results

The smaller model achieves 1% test error on MNIST after 10 epochs. Note that the original TensorFlow tutorial code achieves 0.8% test error on MNIST.

In contrast, the All-CNN-C model still has a validation error rate of 26.8% after 5 epochs, compared with 1.4% for the smaller model. This suggests that, as described in [2], deeper models are harder to optimize, and that this holds on MNIST as well.
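The error rates quoted here are percentages of misclassified images. A minimal sketch of that computation, assuming predictions and labels are plain lists of class indices (the function name `error_rate` and the toy data are illustrative, and the repository's own evaluation code may differ):

```python
def error_rate(predictions, labels):
    """Percentage of examples whose predicted class differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute an error rate on an empty set")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return 100.0 * wrong / len(labels)


# Toy data (hypothetical, not MNIST output): 1 of 8 predictions is wrong.
labels = [7, 2, 1, 0, 4, 1, 4, 9]
predictions = [7, 2, 1, 0, 4, 1, 4, 4]
print(f"error rate: {error_rate(predictions, labels):.1f}%")  # error rate: 12.5%
```

On the 10,000-image MNIST test set, a 1% error rate corresponds to 100 misclassified images.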

References:

[1] Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, Martin Riedmiller. Striving for Simplicity: The All Convolutional Net. arXiv

[2] Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber. Highway Networks. arXiv

[3] All-Convnet-TensorFlow-MNIST-Tutorial