Assignment 1 : Developing Backpropagation from scratch

In this assignment, we developed a feedforward neural network from scratch. We used gradient descent and its variants as optimization algorithms, together with backpropagation, to classify images from the Fashion-MNIST dataset. We used "wandb.ai" to run the hyperparameter-tuning experiments.

Libraries and their application :

Numpy: Mathematical operations are performed by this library
Keras: This library is used to obtain the dataset.
Matplotlib and Seaborn: Sample images from each class and the confusion matrix are plotted using these libraries, respectively.
sklearn: The dataset is split into Train-Test-Validation by this library
wandb: This library is used to log the metrics to wandb.ai.

Installations:

The above-mentioned libraries can be installed on a local machine by running the following commands in the command prompt:

pip install numpy
pip install keras
pip install matplotlib
pip install seaborn
pip install scikit-learn
pip install wandb

If you are running the code on Google Colab, all the above-mentioned libraries are already installed except "wandb". Add the following code in a cell:

!pip install wandb

Training the Neural Network:

To train the neural network use the following function:

fit(X_train, 
    y_train,
    layer_sizes,
    wandb_log, 
    learning_rate = 0.0001, 
    initialization_type = "random", 
    activation_function = "sigmoid", 
    loss_function = "cross_entropy", 
    mini_batch_size = 32,
    max_epochs = 5, 
    lambd = 0,
    optimization_function = mini_batch_gd):

X_train stores the list of flattened images of the training dataset.
y_train stores the list of labels for the images of the training dataset, in one-hot encoded format.
layer_sizes stores the number of neurons in each layer, including the input and output layers.
wandb_log stores a boolean that determines whether the data is logged to wandb.ai.
learning_rate stores the learning rate of the gradient descent (and its variants) optimization functions.
initialization_type stores the weight initialization type; you can choose Xavier or random.
activation_function stores the activation function that is applied to all the hidden layers.
loss_function stores the type of loss function; you can choose cross_entropy or squared error.
mini_batch_size stores the number of data points per batch.
max_epochs stores the maximum number of epochs.
lambd stores the regularization constant for weight decay.
optimization_function stores the gradient descent algorithm to use.
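
The input format described above can be illustrated with a short sketch. This is only an illustration, not the repository's own preprocessing: `prepare_inputs` is a hypothetical helper, the data is synthetic so that the snippet runs without downloading Fashion-MNIST, and it assumes one image per row and pixel values scaled to [0, 1].

```python
import numpy as np

def prepare_inputs(images, labels, num_classes=10):
    """Flatten images and one-hot encode labels (hypothetical helper)."""
    images = np.asarray(images, dtype=np.float64)
    # Flatten each (28, 28) image into a 784-vector and scale pixels to [0, 1].
    X = images.reshape(images.shape[0], -1) / 255.0
    # One-hot encode: row i has a 1 in column labels[i] and 0 elsewhere.
    y = np.eye(num_classes)[np.asarray(labels)]
    return X, y

# Synthetic stand-in for Fashion-MNIST (assumption: 28x28 grayscale, 10 classes).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28))
fake_labels = np.array([0, 3, 9, 3])

X_train, y_train = prepare_inputs(fake_images, fake_labels)

# Input layer, two hidden layers (sizes chosen arbitrarily), output layer.
layer_sizes = [X_train.shape[1], 64, 32, y_train.shape[1]]

# With the repository's fit() in scope, a call might then look like:
# fit(X_train, y_train, layer_sizes, wandb_log=False, max_epochs=5)
```

If the repository expects one image per column instead of per row, the arrays would need to be transposed before calling fit.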

Addition of a new Optimization Function:

We have provided a template for adding an optimization function along the same lines as the existing ones. The user needs to add the following code snippets to form a new optimization function:

Declare and Initialize dictionaries and other data structures as per the requirement of optimization function.
New parameter update rule for the network parameters.

The new optimization function looks like this :

new_optimization_function_name(X_train, y_train, eta, max_epochs, layers, mini_batch_size, lambd, loss_function, activation, parameters, wandb_log=False)

X_train stores the list of flattened images of training dataset.
y_train stores the list of labels for the images of training dataset in one-hot encoded format.
eta stores the learning rate.
max_epochs stores the maximum number of epochs.
layers stores the number of neurons in each layer.
mini_batch_size stores the number of data points per batch.
lambd stores the regularization constant for weight decay.
loss_function stores the type of loss function, you can choose: cross_entropy or squared error.
activation stores the activation function that is applied to all the hidden layers.
parameters stores the initial parameters (weights and biases).
wandb_log stores a boolean that determines whether the data is logged to wandb.ai.
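
The two pieces a new optimizer has to supply (extra data structures and a parameter update rule) can be sketched as follows, using Adagrad as an example of an optimizer that is not in the list of existing ones. This is a sketch under assumptions: `init_history` and `adagrad_step` are hypothetical names, and `parameters` and `grads` are assumed to be dictionaries keyed "W1", "b1", "W2", ... holding NumPy arrays; the repository's template may organise them differently.

```python
import numpy as np

def init_history(parameters):
    """Step 1: one accumulator per parameter, same shape, initialised to zero."""
    return {name: np.zeros_like(value) for name, value in parameters.items()}

def adagrad_step(parameters, grads, history, eta, lambd=0.0, eps=1e-8):
    """Step 2: the Adagrad update rule, applied in place to weights and biases."""
    for name in parameters:
        g = grads[name]
        if name.startswith("W"):
            # L2 weight decay on weights only (assumed key naming: "W1", "W2", ...).
            g = g + lambd * parameters[name]
        # Accumulate squared gradients, then scale the step per coordinate.
        history[name] += g ** 2
        parameters[name] -= eta * g / (np.sqrt(history[name]) + eps)
    return parameters, history

# Tiny usage example with made-up gradients.
parameters = {"W1": np.ones((2, 2)), "b1": np.zeros(2)}
grads = {"W1": np.full((2, 2), 0.5), "b1": np.ones(2)}
history = init_history(parameters)
parameters, history = adagrad_step(parameters, grads, history, eta=0.1)
```

Inside a full optimization function, `init_history` would be called once before the epoch loop and `adagrad_step` once per mini-batch, after backpropagation has produced the gradients.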

Wandb Functionality:

To use wandb mode, find the API key in your wandb account and paste it into the output box after executing this code snippet:

!wandb login --relogin
# enter the entity and project name in these variables
entity_name="_entity_name_"
project_name="_project_name_"

You can perform experiments by running the sweeps, using this function:

sweeper(entity_name,project_name)
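
For orientation, a wandb sweep is driven by a configuration dictionary. The sketch below shows what such a configuration could look like for the hyperparameters of fit; it is not the configuration used inside sweeper, and the metric name, search method and candidate values are all assumptions chosen for illustration.

```python
# Hypothetical sweep configuration; the actual one used by sweeper() may differ.
sweep_config = {
    "method": "bayes",  # wandb also supports "grid" and "random"
    "metric": {"name": "val_accuracy", "goal": "maximize"},  # assumed metric name
    "parameters": {
        "learning_rate": {"values": [1e-2, 1e-3, 1e-4]},
        "mini_batch_size": {"values": [16, 32, 64]},
        "max_epochs": {"values": [5, 10]},
        "lambd": {"values": [0, 0.0005, 0.5]},
        "initialization_type": {"values": ["random", "Xavier"]},
        "activation_function": {"values": ["sigmoid", "tanh", "relu"]},
    },
}

# Registering and running the sweep would then look roughly like this
# (requires `import wandb`, a logged-in session, and a train() function):
# sweep_id = wandb.sweep(sweep_config, entity=entity_name, project=project_name)
# wandb.agent(sweep_id, function=train, count=20)
```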

You can compare the performance of two loss functions by using this function:

loss_compare_sweeper(entity_name,project_name)

You can plot the confusion matrix for the test dataset by using this function, which returns the predicted labels and the true labels:

y_pred,y_t=plot_confmat_wandb(entity_name,project_name)
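
The returned labels can also be analysed without plotting. The following sketch computes a confusion matrix and per-class accuracy with NumPy; it assumes that y_pred and y_t are sequences of integer class indices (not one-hot vectors), and both helper names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Entry [i, j] counts samples of true class i predicted as class j."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at handles repeated (i, j) pairs correctly, unlike cm[i, j] += 1.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def per_class_accuracy(cm):
    """Fraction of each true class that was predicted correctly."""
    totals = cm.sum(axis=1)
    # Classes with no samples get accuracy 0 instead of a division error.
    return np.divide(np.diag(cm), totals, out=np.zeros(len(cm)), where=totals > 0)

# Small made-up example with three classes.
example_true = [0, 1, 2, 2, 1]
example_pred = [0, 2, 2, 2, 1]
cm = confusion_matrix(example_true, example_pred, num_classes=3)
class_acc = per_class_accuracy(cm)
overall_acc = np.trace(cm) / cm.sum()
```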

Available options to customize the Neural Network:

1) Loss functions

MSE()
CrossEntropy()

2) Optimization functions

mini_batch_gd()
momentum_gd()
nesterov_gd()
rmsprop()
adam()
nadam()

3) Weight Initializations

Xavier()
Random()

4) Activation Functions

sigmoid()
tanh()
relu()
softmax()
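
As a reference for what some of the options listed above compute, here is a minimal NumPy sketch of two activation functions and the two loss functions. These are textbook definitions written for illustration, not the repository's implementations; the function names and the row-per-sample layout are assumptions.

```python
import numpy as np

def sigmoid_sketch(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax_sketch(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = z - z.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_sketch(y_true, y_prob, eps=1e-12):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    return -np.mean(np.sum(y_true * np.log(y_prob + eps), axis=-1))

def squared_error_sketch(y_true, y_prob):
    """Mean over samples of the summed squared error."""
    return np.mean(np.sum((y_true - y_prob) ** 2, axis=-1))

# Two samples, two classes: the network is undecided on both.
logits = np.array([[0.0, 0.0], [1000.0, 1000.0]])
targets = np.array([[1.0, 0.0], [0.0, 1.0]])
probs = softmax_sketch(logits)
ce = cross_entropy_sketch(targets, probs)
se = squared_error_sketch(targets, probs)
```

The second row of logits shows why the maximum is subtracted: exponentiating 1000 directly would overflow, while the shifted version yields the same probabilities as the first row.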

vamsimalineni96/Fundamentals-of-Deep-Learning-CS6910-Assignment1
