LeNet-5-MNIST

with Keras


LeNet-5 & MNIST

MNIST

Download mnist.pkl.gz from link

Load dataset using load_data.py
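For reference, a minimal loading sketch, assuming the standard mnist.pkl.gz layout (three (images, labels) tuples for the training, validation, and test splits); the actual load_data.py may differ:

# Minimal sketch of loading mnist.pkl.gz (the actual load_data.py may differ).
# The standard pickle holds three (images, labels) tuples: train, validation, test.
import gzip
import pickle

def load_mnist(path="mnist.pkl.gz"):
    with gzip.open(path, "rb") as f:
        train_set, valid_set, test_set = pickle.load(f, encoding="latin1")
    return train_set, valid_set, test_set

(x_train, y_train), (x_valid, y_valid), (x_test, y_test) = load_mnist()
print(x_train.shape)  # (50000, 784) in the standard split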

Dataset Visualization

See tile_view_util.py & visualization.ipynb

See also link1 and link2 for dimension-reduced visualizations of the dataset


LeNet-5 (LeCun et al., 1998) Architecture


LeNet-5 Keras Implementation


Baseline

Implementation:

Baseline ipynb & training log

Details (see the sketch after this list):

  • kernel initializer: Xavier (Glorot et al., 2010) Normal
  • optimizer: SGD
  • learning rate: $\alpha = 1$
  • batch size: 128
  • epochs: 20
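A minimal sketch of this baseline configuration, assuming the classic LeNet-5 layer sizes with ReLU activations; the notebook's exact architecture may differ:

# Sketch of the baseline LeNet-5 setup (layer sizes follow the classic
# LeNet-5 layout; the notebook's exact configuration may differ).
from keras.models import Sequential
from keras.layers import Conv2D, AveragePooling2D, Flatten, Dense
from keras.optimizers import SGD

init = "glorot_normal"  # Xavier normal initializer

model = Sequential([
    Conv2D(6, (5, 5), activation="relu", kernel_initializer=init,
           padding="same", input_shape=(28, 28, 1)),
    AveragePooling2D((2, 2)),
    Conv2D(16, (5, 5), activation="relu", kernel_initializer=init),
    AveragePooling2D((2, 2)),
    Flatten(),
    Dense(120, activation="relu", kernel_initializer=init),
    Dense(84, activation="relu", kernel_initializer=init),
    Dense(10, activation="softmax", kernel_initializer=init),
])

model.compile(optimizer=SGD(lr=1.0),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# x_train = x_train.reshape(-1, 28, 28, 1)  # reshape flat vectors to images first
# model.fit(x_train, y_train, batch_size=128, epochs=20,
#           validation_data=(x_valid, y_valid))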

Experiment 1: Effect of Different Kernel Initializers on Performance

Since He initialization (He et al., 2015) is well suited to ReLU activations, I tried both he_uniform and he_normal. I also experimented with random_normal as a control.
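A rough sketch of the comparison loop; make_model is a hypothetical helper standing in for the notebook's model-building code:

# Sketch: train the same architecture with different kernel initializers and
# compare test accuracy. `make_model` is a hypothetical helper standing in for
# the notebook's model-building code (architecture only, not compiled).
from keras.optimizers import SGD

for init in ["he_normal", "he_uniform", "random_normal"]:
    model = make_model(kernel_initializer=init)
    model.compile(optimizer=SGD(lr=1.0), loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print("%s: test accuracy = %.4f" % (init, acc))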

Implementation:

Result:

He Normal and He Uniform achieve similar accuracy, with He Normal slightly outperforming He Uniform.


Experiment 2: Effect of Different Optimizers on Performance

I tried SGD with Momentum ($\beta = 0.9$), RMSprop ($\beta = 0.9$), and Adam ($\beta_1 = 0.9,\ \beta_2 = 0.999$), and compared their accuracy.
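A sketch of the three optimizer configurations; the learning rates shown are illustrative assumptions, and make_model is again a hypothetical helper:

# Sketch of the three optimizer configurations compared in this experiment.
# Learning rates here are illustrative assumptions, not the notebook's values.
from keras.optimizers import SGD, RMSprop, Adam

optimizers = {
    "momentum": SGD(lr=0.01, momentum=0.9),
    "rmsprop": RMSprop(lr=0.001, rho=0.9),
    "adam": Adam(lr=0.001, beta_1=0.9, beta_2=0.999),
}

for name, opt in optimizers.items():
    model = make_model(kernel_initializer="glorot_normal")  # hypothetical helper
    model.compile(optimizer=opt, loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print("%s: test accuracy = %.4f" % (name, acc))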

Implementation:

Result:

RMSprop slightly outperforms Adam.


Final Stage: Build an Optimal Model Based on the Results of Experiments 1 and 2

  • optimizer: RMSprop
  • initializer: He Normal
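
A minimal sketch of this final configuration (make_model is the same hypothetical helper as above):

# Sketch of the final configuration: He normal initialization + RMSprop.
# The learning rate is an illustrative assumption.
from keras.optimizers import RMSprop

model = make_model(kernel_initializer="he_normal")
model.compile(optimizer=RMSprop(lr=0.001, rho=0.9),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=20,
          validation_data=(x_valid, y_valid))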

Implementation:

Optimal

Final Result:

accuracy on test set: 98.93%


Extra: Attempts to Mitigate Overfitting

Applied Dropout to the last 2 fully connected layers of the optimal model, with keep_prob = 0.7.
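A sketch of the modified classifier head, assuming Keras' Dropout layer (whose rate is the drop probability, so keep_prob = 0.7 corresponds to rate = 0.3):

# Sketch: the last two fully connected layers with Dropout added.
# Keras' Dropout rate is the drop probability, so keep_prob = 0.7 -> rate = 0.3.
from keras.layers import Dense, Dropout

def add_dropout_head(model):
    # convolutional / pooling / flatten layers are assumed to be in `model` already
    model.add(Dense(120, activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.3))
    model.add(Dense(84, activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.3))
    model.add(Dense(10, activation="softmax"))
    return model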

In addition, applied L2 regularization ($\lambda = 0.01$) to the last 2 fully connected layers of the optimal model. (reg)
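A corresponding sketch using Keras' l2 regularizer on the kernel weights:

# Sketch: L2 regularization (lambda = 0.01) on the last two fully connected layers.
from keras.layers import Dense
from keras.regularizers import l2

def add_regularized_head(model):
    # convolutional / pooling / flatten layers are assumed to be in `model` already
    model.add(Dense(120, activation="relu", kernel_initializer="he_normal",
                    kernel_regularizer=l2(0.01)))
    model.add(Dense(84, activation="relu", kernel_initializer="he_normal",
                    kernel_regularizer=l2(0.01)))
    model.add(Dense(10, activation="softmax"))
    return model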

Data Augmentation: (datagen)

# Keras ImageDataGenerator for on-the-fly augmentation of the training images
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20,        # randomly rotate images by up to 20 degrees
    width_shift_range=0.2,    # randomly shift images horizontally by up to 20%
    height_shift_range=0.2    # randomly shift images vertically by up to 20%
)
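
The generator can then be fed to training, e.g. via fit_generator in older Keras (newer versions accept the generator directly in model.fit); a minimal usage sketch:

# Usage sketch: train on batches drawn from the augmentation generator.
# Assumes x_train has shape (N, 28, 28, 1) and y_train is one-hot encoded.
model.fit_generator(datagen.flow(x_train, y_train, batch_size=128),
                    steps_per_epoch=len(x_train) // 128,
                    epochs=20,
                    validation_data=(x_valid, y_valid))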

Implementation:

Dropout

L2 Regularization

Data Augmentation

Result:


Appendix

plot data: plot_data.ipynb