This project aims to recognize gray-scale images of hand-drawn digits from zero through nine using a feedforward neural network (FFNN). In addition, the effects of the number of layers, layer width, learning rate, number of epochs, batch size, and different regularization techniques (dropout, L1 regularization, L2 regularization) on prediction accuracy have been investigated.
The data can be accessed here: Data
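Since the comparison code itself is not shown here, the following is a minimal sketch of how the three regularization techniques from the study can be attached to a hidden layer, assuming TensorFlow/Keras (suggested by the TensorBoard graphs below); the penalty strengths and dropout rate are illustrative placeholders, not the tuned values from the project.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Dropout: randomly zero a fraction of activations during training.
dropout_variant = tf.keras.Sequential([
    layers.Dense(300, activation="relu"),
    layers.Dropout(0.5),  # 0.5 is a placeholder rate
])

# L1 regularization: add a penalty proportional to |w| to the loss.
l1_variant = layers.Dense(
    300, activation="relu",
    kernel_regularizer=regularizers.l1(1e-4))  # placeholder strength

# L2 regularization: add a penalty proportional to w^2 to the loss.
l2_variant = layers.Dense(
    300, activation="relu",
    kernel_regularizer=regularizers.l2(1e-4))  # placeholder strength
```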
The best FFNN has four hidden layers with ReLU activation functions; the first, second, third, and fourth hidden layers have 300, 300, 200, and 100 neurons, respectively. The output layer produces logits, which are then passed through a softmax activation function. We used cross entropy as the loss function and a gradient descent optimizer to train the model. The network takes 28×28 = 784 features as input for each instance and outputs a probability for each of the 10 classes; the predicted class is the one with the highest probability.
Fig 1. TensorBoard Computational Graph (left) and Simplified Flowchart (right)
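As a concrete reference, here is a minimal Keras sketch of the architecture described above (hidden widths 300/300/200/100 with ReLU, a 10-unit logit output, cross entropy with softmax applied inside the loss, and plain gradient descent); the learning rate is a placeholder, as the tuned value is not stated in this section.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),  # 28*28 = 784 input features
    tf.keras.layers.Dense(300, activation="relu"),  # hidden layer 1
    tf.keras.layers.Dense(300, activation="relu"),  # hidden layer 2
    tf.keras.layers.Dense(200, activation="relu"),  # hidden layer 3
    tf.keras.layers.Dense(100, activation="relu"),  # hidden layer 4
    tf.keras.layers.Dense(10),                      # raw logits for 10 classes
])

# Softmax is folded into the cross-entropy loss via from_logits=True;
# SGD corresponds to the plain gradient descent optimizer described above.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # placeholder lr
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```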
With this optimized architecture, the training set reached 100% accuracy with a loss of 0.004, and the test set reached 97.3% accuracy with a loss of 0.1, as shown in the following graphs.
Fig 2. Accuracy (left) and Loss (right) for Training and Testing Data over Epochs (graphs from TensorBoard)
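Continuing from the model sketch above, curves like those in Fig 2 can be produced by logging each epoch with Keras's TensorBoard callback; the data loading here assumes the Keras copy of MNIST for convenience (the project's own copy is linked above), and the epoch count and batch size are placeholders rather than the tuned values.

```python
import tensorflow as tf

# Load the 28x28 gray-scale digit images and scale pixels to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Write per-epoch accuracy and loss for both splits to TensorBoard.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs")

model.fit(
    x_train, y_train,
    epochs=20,       # placeholder; the study tuned this
    batch_size=64,   # placeholder; the study tuned this
    validation_data=(x_test, y_test),
    callbacks=[tensorboard_cb],
)
```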