In this assignment, we developed a feed-forward neural network from scratch. We used gradient descent and its variants as the optimization algorithms, together with backpropagation, to classify images from the Fashion-MNIST dataset. We used "wandb.ai" to run the hyperparameter tuning experiments.
- NumPy: Mathematical operations are performed by this library.
- Keras: This library is used to obtain the dataset.
- Matplotlib and Seaborn: Sample images from each class and the confusion matrix are plotted using these libraries, respectively.
- sklearn: The dataset is split into Train-Test-Validation by this library
- wandb: This library is used to log the metrics to wandb.ai.
The libraries listed above can be installed on a local machine by running the following commands in the command prompt:
pip install numpy
pip install keras
pip install matplotlib
pip install seaborn
pip install scikit-learn
pip install wandb
If you are running the code on Google Colab, all of the libraries listed above are already installed except "wandb". Add the following code in a cell:
!pip install wandb
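Once the libraries are available, the training inputs can be prepared along the following lines. This is a minimal sketch, not the preprocessing used in this repository: the helper name `prepare_data`, the scaling of pixels to [0, 1] and the 10% validation fraction are assumptions, and a plain NumPy shuffle stands in for the sklearn split so that the snippet depends only on NumPy.

```python
import numpy as np

def prepare_data(images, labels, num_classes=10, val_fraction=0.1, seed=0):
    """Flatten images, one-hot encode labels, and hold out a validation set.

    Hypothetical helper for illustration; not part of the original code.
    """
    images = np.asarray(images, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)

    # Flatten each image (e.g. 28x28 -> 784) and scale pixels to [0, 1].
    X = images.reshape(images.shape[0], -1) / 255.0

    # One-hot encode: row i has a 1 in column labels[i].
    Y = np.eye(num_classes)[labels]

    # Shuffle the indices, then split off the validation portion.
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_val = int(round(val_fraction * X.shape[0]))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return X[train_idx], Y[train_idx], X[val_idx], Y[val_idx]


# Usage with random stand-in data shaped like Fashion-MNIST images.
# With Keras installed, the real arrays would come from
# keras.datasets.fashion_mnist.load_data().
fake_images = np.random.default_rng(1).integers(0, 256, size=(100, 28, 28))
fake_labels = np.random.default_rng(2).integers(0, 10, size=100)
X_train, y_train, X_val, y_val = prepare_data(fake_images, fake_labels)
print(X_train.shape, y_train.shape, X_val.shape, y_val.shape)
```

The returned `X_train` and `y_train` have one row per image, which is the layout assumed here for the flattened images and one-hot labels passed to the training function.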
To train the neural network use the following function:
fit(X_train,
y_train,
layer_sizes,
wandb_log,
learning_rate = 0.0001,
initialization_type = "random",
activation_function = "sigmoid",
loss_function = "cross_entropy",
mini_batch_Size = 32,
max_epochs = 5,
lambd = 0,
optimization_function = mini_batch_gd):
- X_train: stores the list of flattened images of the training dataset.
- y_train: stores the list of labels for the images of the training dataset in one-hot encoded format.
- layer_sizes: stores the number of neurons present in each layer, including the input and the output layers.
- wandb_log: stores the boolean variable which determines whether or not the data is logged to wandb.ai.
- learning_rate: stores the learning rate of the gradient descent (and its variants) optimization functions.
- initialization_type: stores the weight initialization type; you can choose Xavier or random.
- activation_function: stores the activation function that is applied to all the hidden layers.
- loss_function: stores the type of loss function; you can choose cross_entropy or squared error.
- mini_batch_size: stores the number of data points per batch.
- max_epochs: stores the maximum number of epochs.
- lambd: stores the regularization constant for weight decay.
- optimization_function: stores the name of the gradient descent algorithm.
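To make the layer_sizes and initialization_type arguments more concrete, here is a rough sketch of how a list of layer sizes could be turned into initial weights and biases. The function name `initialize_parameters`, the dictionary keys ("W1", "b1", ...), the (outputs, inputs) weight shape and the 0.01 scale for random initialization are all assumptions made for illustration; the code in this repository may organize its parameters differently.

```python
import numpy as np

def initialize_parameters(layer_sizes, initialization_type="random", seed=0):
    """Hypothetical initializer; key names and shapes are assumptions."""
    rng = np.random.default_rng(seed)
    parameters = {}
    for l in range(1, len(layer_sizes)):
        fan_in, fan_out = layer_sizes[l - 1], layer_sizes[l]
        if initialization_type == "Xavier":
            # Glorot/Xavier uniform: bound = sqrt(6 / (fan_in + fan_out)).
            bound = np.sqrt(6.0 / (fan_in + fan_out))
            W = rng.uniform(-bound, bound, size=(fan_out, fan_in))
        elif initialization_type == "random":
            # Small Gaussian weights; the 0.01 scale is an assumed choice.
            W = 0.01 * rng.standard_normal((fan_out, fan_in))
        else:
            raise ValueError(f"unknown initialization_type: {initialization_type}")
        parameters[f"W{l}"] = W
        parameters[f"b{l}"] = np.zeros((fan_out, 1))
    return parameters


# 784 inputs, two hidden layers of 64 neurons, 10 output classes.
layer_sizes = [784, 64, 64, 10]
params = initialize_parameters(layer_sizes, "Xavier")
print({name: value.shape for name, value in params.items()})
```

Note that layer_sizes includes the input and output layers, so a list of four sizes yields three weight matrices.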
We have provided a template for adding an optimization function along the same lines as the existing ones. The user needs to add the following code snippets to form a new optimization function:
- Declare and Initialize dictionaries and other data structures as per the requirement of optimization function.
- New parameter update rule for the network parameters.
The new optimization function looks like this:
new_optimization_function_name(X_train, y_train, eta, max_epochs, layers, mini_batch_size, lambd, loss_function, activation, parameters, wandb_log=False)
- X_train: stores the list of flattened images of the training dataset.
- y_train: stores the list of labels for the images of the training dataset in one-hot encoded format.
- eta: stores the learning rate.
- max_epochs: stores the maximum number of epochs.
- layers: stores the number of neurons in each layer.
- mini_batch_size: stores the number of data points per batch.
- lambd: stores the regularization constant for weight decay.
- loss_function: stores the type of loss function; you can choose cross_entropy or squared error.
- activation: stores the activation function that is applied to all the hidden layers.
- parameters: stores the initial parameters (weights and biases).
- wandb_log: stores the boolean variable which determines whether or not the data is logged to wandb.ai.
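The two additions a new optimization function needs (its own state and its own parameter update rule) can be illustrated with momentum-based gradient descent. This is only a sketch under stated assumptions: the helper names `init_velocity` and `momentum_update`, the `gamma` argument, the dictionary layout with keys such as "W1" and "b1", and the choice to apply weight decay to weights only are hypothetical. It also leaves out the forward pass, backpropagation and the mini-batch loop that a full optimization function would contain.

```python
import numpy as np

def init_velocity(parameters):
    """Step 1: declare the state the optimizer needs (one buffer per parameter)."""
    return {name: np.zeros_like(value) for name, value in parameters.items()}

def momentum_update(parameters, grads, velocity, eta, lambd=0.0, gamma=0.9):
    """Step 2: the parameter update rule (momentum with L2 weight decay).

    Assumes `grads` uses the same keys as `parameters`, e.g. "W1", "b1".
    """
    for name in parameters:
        grad = grads[name]
        if name.startswith("W"):
            # Weight decay is applied to weights only, not biases (assumed).
            grad = grad + lambd * parameters[name]
        velocity[name] = gamma * velocity[name] + eta * grad
        parameters[name] = parameters[name] - velocity[name]
    return parameters, velocity


# One update step on a tiny single-layer example.
parameters = {"W1": np.ones((2, 3)), "b1": np.zeros((2, 1))}
grads = {"W1": np.full((2, 3), 0.5), "b1": np.full((2, 1), 0.1)}
velocity = init_velocity(parameters)
parameters, velocity = momentum_update(parameters, grads, velocity, eta=0.1)
print(parameters["W1"][0, 0], parameters["b1"][0, 0])
```

Keeping the state in a dictionary that mirrors the parameter dictionary means the update loop does not need to know how many layers the network has.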
- To use wandb mode, find your API key in your wandb account and paste it in the output box after you execute this code snippet:
!wandb login --relogin
# enter the entity and project name in these variables
entity_name="_entity_name_"
project_name="_project_name_"
- You can perform experiments by running the sweeps, using this function:
sweeper(entity_name,project_name)
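For orientation, a wandb sweep is driven by a configuration dictionary like the one sketched here. How `sweeper` builds its configuration is not shown in this document, so everything in the sketch is an assumption: the hyperparameter names simply mirror the arguments of the training function, and the value lists, the search method and the metric name `val_accuracy` are made up for illustration.

```python
from math import prod

# Hypothetical sweep configuration; values and metric name are assumptions.
sweep_config = {
    "method": "random",  # wandb also supports "grid" and "bayes"
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"values": [1e-2, 1e-3, 1e-4]},
        "initialization_type": {"values": ["Xavier", "random"]},
        "mini_batch_size": {"values": [16, 32, 64]},
        "max_epochs": {"values": [5, 10]},
        "lambd": {"values": [0, 0.0005, 0.5]},
    },
}

# Size of the full search space, useful when deciding how many runs to launch.
n_combinations = prod(
    len(spec["values"]) for spec in sweep_config["parameters"].values()
)
print(n_combinations)

# With wandb installed and logged in, a sweep could be started with:
#   import wandb
#   sweep_id = wandb.sweep(sweep_config, entity=entity_name, project=project_name)
#   wandb.agent(sweep_id, function=train, count=20)
# where `train` is a hypothetical function that runs one training job.
```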
- You can compare the performance of two loss functions by using this function:
loss_compare_sweeper(entity_name,project_name)
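As a reminder of what is being compared, the two losses can be written in NumPy roughly as follows. This is a sketch, not the code used by `loss_compare_sweeper`: the function names, the one-row-per-sample layout, the clipping constant and the 0.5 factor in the squared error are assumptions.

```python
import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid log(0) when a predicted probability is exactly zero.
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

def squared_error(y_true, y_prob):
    """Mean over samples of half the summed squared difference (0.5 factor assumed)."""
    return float(0.5 * np.mean(np.sum((y_true - y_prob) ** 2, axis=1)))


# Two samples, three classes; rows are samples.
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(cross_entropy(y_true, y_prob), squared_error(y_true, y_prob))
```

Cross-entropy only looks at the probability assigned to the correct class, while squared error also penalizes how the remaining probability is spread over the wrong classes.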
- You can plot the confusion matrix for the test dataset by using this function, which returns the predicted labels and the true labels:
y_pred,y_t=plot_confmat_wandb(entity_name,project_name)
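The returned labels can also be inspected without wandb. The sketch below counts a confusion matrix in NumPy, assuming that the labels are integer class indices (not one-hot vectors); the helper name `confusion_matrix` and the toy labels are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix with rows = true class and columns = predicted class."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Toy labels standing in for the values returned by the plotting function.
y_t = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 2, 2, 2, 1, 1])
cm = confusion_matrix(y_t, y_pred, num_classes=3)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(accuracy)
```

The diagonal holds the correctly classified samples, so the accuracy is the trace divided by the total count.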