
handwritten-digit-classifier

Deep Learning project that uses the MNIST dataset to teach machines to recognize handwritten digits. With this network, a maximum test accuracy of 88.6 ± 0.2 percent was observed.


Neural Network Structure

[figure: diagram of the neural network structure]

Gradient descent to find a minimum of the cost function

[figure: gradient descent descending a cost surface]

Sigmoid function

Squashes any real number into a value between 0 and 1.

[figure: sigmoid curve]
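
A minimal sketch of the sigmoid in NumPy:

import numpy as np

def sigmoid(z):
    # squash any real input into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ~[0.0067, 0.5, 0.9933]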


Introduction

The goal is to classify handwritten digits. I've used a [784-16-16-10] layer structure: 784 input neurons, two hidden layers of 16 neurons each, and 10 output neurons. One can experiment with the network structure, keeping the first and last layers fixed.

STRUCTURE

  • Dataset:

    • Get the training and test datasets from the MNIST database
    • Each input is a (784x1) vector, i.e. a flattened 28x28 grayscale image
    • Each label is a (10x1) one-hot vector
    • A total of 60000 training and 10000 test examples
    • The mini-batch size is 16, i.e. each mini_batch contains 16 training examples
  • Network (a code sketch of this training loop is given below):

    • Initialize random weights and biases for all layers
    • For each epoch
      • For each mini-batch
        • For each data/label pair in the mini-batch
          • Calculate the activations (forward pass)
          • Calculate the loss
          • Calculate the delta values (backpropagation of the error)
          • Calculate the nabla (gradient) values
          • Update the weights and biases with learning rate 0.1
      • Evaluate the network's accuracy on the test dataset

NB: The output layer consists of 10 neurons; the activation of each neuron denotes the network's confidence in the corresponding digit, and the neuron with the highest activation gives the prediction.
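
A minimal sketch of this training loop, assuming a Network class with update_mini_batch and evaluate methods (hypothetical names; the actual implementation lives in Network.py):

import random

def sgd(net, training_data, test_data, epochs=30,
        mini_batch_size=16, learning_rate=0.1):
    # training_data is a list of (x, y) pairs: (784x1) inputs, (10x1) labels;
    # epochs=30 is illustrative, not a value taken from this repository
    n = len(training_data)
    for epoch in range(epochs):
        random.shuffle(training_data)
        mini_batches = [training_data[k:k + mini_batch_size]
                        for k in range(0, n, mini_batch_size)]
        for mini_batch in mini_batches:
            # forward pass, loss, deltas, nablas, weight/bias update
            net.update_mini_batch(mini_batch, learning_rate)
        # evaluate accuracy on the test dataset after every epoch
        print("Epoch {}: {} / {}".format(epoch, net.evaluate(test_data), len(test_data)))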


Activation calculation

z = np.dot(w, a)+b     # weighted input of the current layer
a = self.sigmoid(z)    # activation passed to the next layer
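
Applied layer by layer, this gives the full forward pass. A minimal sketch, assuming weights and biases are lists holding one array per layer:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(weights, biases, a):
    # a starts as the (784x1) input vector
    activations = [a]
    for w, b in zip(weights, biases):
        z = np.dot(w, a) + b   # weighted input of the current layer
        a = sigmoid(z)         # activation fed to the next layer
        activations.append(a)
    return activations         # activations[-1] is the (10x1) output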

Loss calculation

# quadratic (MSE) loss averaged over the mini-batch; the factor of 2 simplifies the derivative
loss = np.sum(np.power(np.subtract(self.a[-1],y),2))/(2.0*self.dataset.mini_batch_size)
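
A worked example of this loss for a single training pair, assuming the target digit is 3 and mini_batch_size is 16 (the output activations below are made up):

import numpy as np

y = np.zeros((10, 1))
y[3] = 1.0                    # one-hot label for the digit 3
a_L = np.full((10, 1), 0.1)
a_L[3] = 0.9                  # hypothetical output activations
loss = np.sum(np.power(a_L - y, 2)) / (2.0 * 16)
print(loss)                   # sum of squares is 0.10, so 0.10 / 32 = 0.003125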

Cost derivative

# derivative of the quadratic cost with respect to the output activations
cost_derivative = np.subtract(self.a[-1], y)

Derivative of activation

def sigmoid_derivative(a):
    # expects the activation a = sigmoid(z), so this equals sigmoid'(z)
    return a*(1-a)


activation_derivative = sigmoid_derivative(activation)
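
Because a = sigmoid(z), the chain rule gives sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) = a * (1 - a), so the derivative can be computed from the activation alone. A quick numerical check of that identity:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(a):
    # expects the activation a = sigmoid(z), not z itself
    return a * (1 - a)

z = 0.5
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)        # central difference
print(np.isclose(sigmoid_derivative(sigmoid(z)), numeric))   # True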

Delta and nabla calculations

For output layer

delta = cost_derivative*self.sigmoid_derivative(self.a[-1])   # output-layer error
self.nabla_w[-1] = np.dot(delta, self.a[-2].T)                # gradient w.r.t. the last weights
self.nabla_b[-1] = delta                                      # gradient w.r.t. the last biases

For hidden layers (the code below handles the layer just before the output layer, L):

# For the layer just before the output layer, L.
# `delta` on the right-hand side of the first line is the output layer's delta.
delta = np.dot(self.weights[L].transpose(), delta) * self.sigmoid_derivative(self.a[L-1])
self.nabla_b[L-1] = delta
self.nabla_w[L-1] = np.dot(delta, self.a[L-2].transpose())

Weight and bias update

# step against the gradient, averaging over the mini-batch
# (divide by mini_batch_size rather than multiply by it)
weight = weight - (learning_rate/mini_batch_size)*nabla_w
bias = bias - (learning_rate/mini_batch_size)*nabla_b
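
Putting the output-layer and hidden-layer steps together, one backward pass for a single (x, y) pair could look like the sketch below. It assumes activations comes from a forward pass, as in the feedforward sketch above; the function name backprop is illustrative:

import numpy as np

def sigmoid_derivative(a):
    return a * (1 - a)

def backprop(weights, activations, y):
    nabla_w = [None] * len(weights)
    nabla_b = [None] * len(weights)
    # output layer: delta = (a_L - y) * sigmoid'(z_L)
    delta = (activations[-1] - y) * sigmoid_derivative(activations[-1])
    nabla_w[-1] = np.dot(delta, activations[-2].T)
    nabla_b[-1] = delta
    # hidden layers, walking backwards through the network
    for l in range(2, len(weights) + 1):
        delta = np.dot(weights[-l + 1].T, delta) * sigmoid_derivative(activations[-l])
        nabla_w[-l] = np.dot(delta, activations[-l - 1].T)
        nabla_b[-l] = delta
    return nabla_w, nabla_b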

Running the code

Install dependencies

pip3 install -r requirements.txt

The code was written for Python 3.6.6.

Run Network.py

python3 Network.py