We all use TensorFlow and Keras to build convolution layers, dense layers, activations, and so on. Often people are not sure what is going on behind the scenes: what the mathematics is and how it is implemented. So I have tried to implement a neural network from scratch and run it on the MNIST dataset.
The following libraries were used to build the NN from scratch:
- numpy : The neural network is implemented using NumPy only.
- matplotlib : Used to plot the accuracy on the MNIST dataset.
- google colab : Trained using Google Colab. One may train it on a local machine as well.
The whole concept can be divided into:
1. Layer
2. ReLU
3. Dense layer
4. Loss function
5. Gradient of the loss function
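As a rough sketch, each of these components can be expressed as a class with a `forward` and a `backward` method. The class and method names below are illustrative assumptions, not taken from the actual repository code:

```python
import numpy as np

class Layer:
    """Base building block: identity forward pass, pass-through gradient."""
    def forward(self, x):
        return x

    def backward(self, x, grad_output):
        return grad_output

class ReLU(Layer):
    def forward(self, x):
        # Elementwise max(x, 0)
        return np.maximum(x, 0)

    def backward(self, x, grad_output):
        # The gradient passes through only where the input was positive.
        return grad_output * (x > 0)
```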
- First, we initialize random weights in each layer (see the `Dense` sketch after this list).
- Forward propagation of the input takes place. Suppose the first dense layer is f(X) = WX + B and the second layer is a ReLU; then after the ReLU we get max(f(X), 0) as the result. Similarly, here we have built three dense layers and two ReLUs.
- An activations array stores the result after every layer, so activations[0] corresponds to the result after dense layer 1.
- As the input travels to the end we get y'. Now we calculate the error; here we have used cross-entropy as the error function (see the loss sketch after this list).
- Now we backpropagate. Suppose L is the loss; then dL/dW1 and dL/dW2 are the gradients at the last nodes (if there are two), and one may obtain the other dL/dW terms using the chain rule (see the backpropagation sketch after this list).
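A minimal sketch of a dense layer and of the forward pass that fills the activations array, building on the `Layer` base class above. The learning rate, the names, and the row-vector convention f(X) = XW + b are assumptions for illustration:

```python
class Dense(Layer):
    def __init__(self, n_in, n_out, lr=0.1):
        # Initialize small random weights and zero biases.
        self.W = np.random.randn(n_in, n_out) * 0.01
        self.b = np.zeros(n_out)
        self.lr = lr

    def forward(self, x):
        # f(X) = XW + b, with x holding a batch of row vectors
        return x @ self.W + self.b

    def backward(self, x, grad_output):
        # dL/dX = dL/df . W^T, computed before W is updated
        grad_input = grad_output @ self.W.T
        # Gradient-descent update: dL/dW = X^T . dL/df
        self.W -= self.lr * (x.T @ grad_output)
        self.b -= self.lr * grad_output.sum(axis=0)
        return grad_input

def forward_pass(network, X):
    # Store the output of every layer; activations[0] is the
    # result after the first dense layer, as described above.
    activations = []
    for layer in network:
        X = layer.forward(X)
        activations.append(X)
    return activations
```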
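The cross-entropy error and its gradient with respect to the last layer's outputs can be written as a softmax cross-entropy. This is one common formulation; the exact one used in the repository may differ:

```python
def softmax_crossentropy(logits, y):
    # Numerically stable mean cross-entropy with integer labels y.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

def grad_softmax_crossentropy(logits, y):
    # dL/dlogits = softmax(logits) - one_hot(y), averaged over the batch.
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    probs[np.arange(len(y)), y] -= 1
    return probs / len(y)
```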
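Putting it together, backpropagation walks the layers in reverse, turning dL/d(output) into dL/d(input) at each step via the chain rule. Again a sketch under the naming assumptions above, with assumed hidden-layer sizes:

```python
def train_step(network, X, y):
    # Forward pass, keeping every intermediate activation.
    activations = forward_pass(network, X)
    layer_inputs = [X] + activations[:-1]  # the input seen by each layer
    logits = activations[-1]

    loss = softmax_crossentropy(logits, y)
    grad = grad_softmax_crossentropy(logits, y)

    # Backward pass: Dense layers also update their W and b on the way.
    for layer, inp in zip(reversed(network), reversed(layer_inputs)):
        grad = layer.backward(inp, grad)
    return loss

# Three dense layers and two ReLUs, as in the text (sizes are assumptions):
network = [Dense(784, 100), ReLU(),
           Dense(100, 200), ReLU(),
           Dense(200, 10)]
```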
Accuracy plot: blue = training accuracy, orange = validation accuracy.
Possible future improvements:
- Add better weight initializations
- Regularisation
- An image scraper for extracting digit images from Google