LeNet-5-MNIST

with Keras


LeNet-5 & MNIST

MNIST

Download mnist.pkl.gz from link

Load dataset using load_data.py
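For reference, a minimal loading sketch, assuming the standard mnist.pkl.gz layout (three (images, labels) tuples for the training, validation, and test splits); the actual load_data.py may differ:

# Minimal sketch of loading mnist.pkl.gz (the actual load_data.py may differ).
# The standard pickle holds three (images, labels) tuples: train, validation, test.
import gzip
import pickle

def load_mnist(path="mnist.pkl.gz"):
    with gzip.open(path, "rb") as f:
        train_set, valid_set, test_set = pickle.load(f, encoding="latin1")
    return train_set, valid_set, test_set

(x_train, y_train), (x_valid, y_valid), (x_test, y_test) = load_mnist()
print(x_train.shape)  # (50000, 784) in the standard split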

Dataset Visualization

See tile_view_util.py & visualization.ipynb

See also link1 and link2 for dimension-reduced visualizations of the dataset


LeNet-5 (LeCun et al., 1998) Architecture


LeNet-5 Keras Implementation


Baseline

Implementation:

Baseline ipynb & training log

Details (see the sketch after this list):

  • kernel initializer: Xavier (Glorot et al., 2010) Normal
  • optimizer: SGD
  • learning rate: $\alpha = 1$
  • batch size: 128
  • epochs: 20
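A minimal sketch of this baseline configuration, assuming the classic LeNet-5 layer sizes with ReLU activations; the notebook's exact architecture may differ:

# Sketch of the baseline LeNet-5 setup (layer sizes follow the classic
# LeNet-5 layout; the notebook's exact configuration may differ).
from keras.models import Sequential
from keras.layers import Conv2D, AveragePooling2D, Flatten, Dense
from keras.optimizers import SGD

init = "glorot_normal"  # Xavier normal initializer

model = Sequential([
    Conv2D(6, (5, 5), activation="relu", kernel_initializer=init,
           padding="same", input_shape=(28, 28, 1)),
    AveragePooling2D((2, 2)),
    Conv2D(16, (5, 5), activation="relu", kernel_initializer=init),
    AveragePooling2D((2, 2)),
    Flatten(),
    Dense(120, activation="relu", kernel_initializer=init),
    Dense(84, activation="relu", kernel_initializer=init),
    Dense(10, activation="softmax", kernel_initializer=init),
])

model.compile(optimizer=SGD(lr=1.0),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# x_train = x_train.reshape(-1, 28, 28, 1)  # reshape flat vectors to images first
# model.fit(x_train, y_train, batch_size=128, epochs=20,
#           validation_data=(x_valid, y_valid))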

Experiment 1: Effect of Different Kernel Initializers on Performance

Since He initialization (He et al., 2015) is well suited to ReLU activations, I tried both he_uniform and he_normal. I also experimented with random_normal as a control.
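A rough sketch of the comparison loop; make_model is a hypothetical helper standing in for the notebook's model-building code:

# Sketch: train the same architecture with different kernel initializers and
# compare test accuracy. `make_model` is a hypothetical helper standing in for
# the notebook's model-building code (architecture only, not compiled).
from keras.optimizers import SGD

for init in ["he_normal", "he_uniform", "random_normal"]:
    model = make_model(kernel_initializer=init)
    model.compile(optimizer=SGD(lr=1.0), loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print("%s: test accuracy = %.4f" % (init, acc))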

Implementation:

Result:

He Normal and He Uniform achieve similar accuracy, with He Normal slightly outperforming He Uniform.


Experiment 2: Effect of Different Optimizers on Performance

I tried SGD with Momentum ($\beta = 0.9$), RMSprop ($\beta = 0.9$), and Adam ($\beta_1 = 0.9,\ \beta_2 = 0.999$), and compared their accuracy.
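A sketch of the three optimizer configurations; the learning rates shown are illustrative assumptions, and make_model is again a hypothetical helper:

# Sketch of the three optimizer configurations compared in this experiment.
# Learning rates here are illustrative assumptions, not the notebook's values.
from keras.optimizers import SGD, RMSprop, Adam

optimizers = {
    "momentum": SGD(lr=0.01, momentum=0.9),
    "rmsprop": RMSprop(lr=0.001, rho=0.9),
    "adam": Adam(lr=0.001, beta_1=0.9, beta_2=0.999),
}

for name, opt in optimizers.items():
    model = make_model(kernel_initializer="glorot_normal")  # hypothetical helper
    model.compile(optimizer=opt, loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=128, epochs=20, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print("%s: test accuracy = %.4f" % (name, acc))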

Implementation:

Result:

RMSprop slightly outperforms Adam.


Final Stage: Build an Optimal Model Based on the Results of Experiments 1 and 2

  • optimizer: RMSprop
  • initializer: He Normal
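
A minimal sketch of this final configuration (make_model is the same hypothetical helper as above):

# Sketch of the final configuration: He normal initialization + RMSprop.
# The learning rate is an illustrative assumption.
from keras.optimizers import RMSprop

model = make_model(kernel_initializer="he_normal")
model.compile(optimizer=RMSprop(lr=0.001, rho=0.9),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=20,
          validation_data=(x_valid, y_valid))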

Implementation:

Optimal

Final Result:

accuracy on test set: 98.93%


Extra: Attempts to Mitigate Overfitting

Applied Dropout to the last 2 fully connected layers of the optimal model, with keep_prob = 0.7.
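A sketch of the modified classifier head, assuming Keras' Dropout layer (whose rate is the drop probability, so keep_prob = 0.7 corresponds to rate = 0.3):

# Sketch: the last two fully connected layers with Dropout added.
# Keras' Dropout rate is the drop probability, so keep_prob = 0.7 -> rate = 0.3.
from keras.layers import Dense, Dropout

def add_dropout_head(model):
    # convolutional / pooling / flatten layers are assumed to be in `model` already
    model.add(Dense(120, activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.3))
    model.add(Dense(84, activation="relu", kernel_initializer="he_normal"))
    model.add(Dropout(0.3))
    model.add(Dense(10, activation="softmax"))
    return model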

In addition, applied L2 regularization ($\lambda = 0.01$) to the last 2 fully connected layers of the optimal model. (reg)
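A corresponding sketch using Keras' l2 regularizer on the kernel weights:

# Sketch: L2 regularization (lambda = 0.01) on the last two fully connected layers.
from keras.layers import Dense
from keras.regularizers import l2

def add_regularized_head(model):
    # convolutional / pooling / flatten layers are assumed to be in `model` already
    model.add(Dense(120, activation="relu", kernel_initializer="he_normal",
                    kernel_regularizer=l2(0.01)))
    model.add(Dense(84, activation="relu", kernel_initializer="he_normal",
                    kernel_regularizer=l2(0.01)))
    model.add(Dense(10, activation="softmax"))
    return model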

Data Augmentation: (datagen)

# Keras ImageDataGenerator for on-the-fly augmentation of the training images
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20,        # randomly rotate images by up to 20 degrees
    width_shift_range=0.2,    # randomly shift images horizontally by up to 20%
    height_shift_range=0.2    # randomly shift images vertically by up to 20%
)
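
The generator can then be fed to training, e.g. via fit_generator in older Keras (newer versions accept the generator directly in model.fit); a minimal usage sketch:

# Usage sketch: train on batches drawn from the augmentation generator.
# Assumes x_train has shape (N, 28, 28, 1) and y_train is one-hot encoded.
model.fit_generator(datagen.flow(x_train, y_train, batch_size=128),
                    steps_per_epoch=len(x_train) // 128,
                    epochs=20,
                    validation_data=(x_valid, y_valid))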

Implementation:

Dropout

L2 Regularization

Data Augmentation

Result:


Appendix

plot data: plot_data.ipynb