This is my attempt at creating a neural network that learns the MNIST data set.
The data can be found at this link:
http://yann.lecun.com/exdb/mnist/
(it can also be found in this repository in the BalaMnist/Data folder)
Usage for the MNIST data is the following:
printf("Loading data..\n");
// You may need to customize this function and the other functions called from here
// if you intend to use it for some other data
loadData();
printf("Loading data finished.\n");
printf("Normalizing data\n");
normalizeData();
printf("Adding bias\n");
// This is important: the NN assumes the data already has the bias as the first column
addBiasToData();
printf("Creating NN\n");
NeuralNetwork nn{ Xtrain.M() };
// Set custom parameters here
// Add a layer using addSimpleLayer function with the number of the nodes specified
// To use tanh as activation, use addSimpleLayer(nodeSize, tanhM, tanhGradientM)
nn.addSimpleLayer(200);
nn.addSimpleLayer(50);
// Last layer is the output layer - number of nodes should equal the number of labels in the classification data
// - don't use tanhM as activation on the last layer
// (tanh can return negative values, and the cost function takes their log, which yields NaN)
nn.addSimpleLayer(10);
printf("NN train \n");
// The list of lambdas (regularization parameters) to try. After training the neural network with each lambda,
// the NN chooses the one with the lowest cross-validation error
std::vector<float> lambdas = { 0.0f, 0.01f, 0.03f, 0.1f, 0.3f, 1.0f, 3.0f, 10.0f };
int epochs = 1;
int batchSize = 1000;
float learningRate = 0.3f;
int gradIter = 200;
nn.trainComplete(Xtrain, ytrain, Xval, yval, epochs, batchSize, lambdas, learningRate, gradIter);
std::ofstream myfile;
myfile.open("weights.txt");
// To save layer weights, you can call saveThetas function with an ofstream file object as parameter
// The weights will be written out as matrices separated by comma and space, row by row
nn.saveThetas(myfile);
myfile.close();
std::ofstream file;
file.open("firstlayer.txt");
// To save the first layer as an n rows by m columns arrangement, with each neuron's weights as a 2D picture
// that can be copied into Excel to visualize, use the function saveFirstLayerVisualization
nn.saveFirstLayerVisualization(file, 10, 20, 28, 28);
file.close();
printf("NN predict\n");
// p contains the neural network's predicted labels
Matrix p = nn.predict(Xtest);
// Test accuracy: comparing the predicted labels to the test labels gives
// a column matrix of 0's and 1's (1 where the prediction matched the test label);
// meanAllM then takes the average of those values
printf("Test accuracy: %f\n", meanAllM(p == ytest));
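The accuracy computation in the last line of the listing can be sketched without the project's Matrix class. This is a minimal, standalone illustration, assuming labels are stored as plain integers; the function name `accuracy` is hypothetical and not part of the project, which uses `p == ytest` and `meanAllM` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the project: the fraction of predictions
// that match the true labels, i.e. the mean of the 0/1 match indicators.
double accuracy(const std::vector<int>& predicted, const std::vector<int>& expected) {
    assert(predicted.size() == expected.size());
    assert(!predicted.empty());
    std::size_t matches = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        if (predicted[i] == expected[i]) {
            ++matches;  // a match contributes 1 to the mean, a mismatch contributes 0
        }
    }
    return static_cast<double>(matches) / static_cast<double>(predicted.size());
}
```

For example, `accuracy({1, 2, 3, 4}, {1, 2, 0, 4})` counts three matches out of four labels.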
Running the MNIST usage code above yielded a test accuracy of 96.9% with the regularization parameter lambda = 0.03.
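The lambda selection described in the usage comments (train once per lambda, then keep the one with the lowest cross-validation error) can be sketched as follows. This is only an illustration of the selection step, assuming the validation errors have already been computed and stored in the same order as the lambdas. The names `indexOfLowestError` and `chooseLambda` are hypothetical and do not come from the project; how `trainComplete` does this internally may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the smallest validation error.
// On ties, the first (smallest index) entry wins.
std::size_t indexOfLowestError(const std::vector<float>& validationErrors) {
    assert(!validationErrors.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < validationErrors.size(); ++i) {
        if (validationErrors[i] < validationErrors[best]) {
            best = i;
        }
    }
    return best;
}

// Hypothetical helper: the lambda whose model had the lowest validation error.
// Assumes validationErrors[i] was measured for the model trained with lambdas[i].
float chooseLambda(const std::vector<float>& lambdas,
                   const std::vector<float>& validationErrors) {
    assert(lambdas.size() == validationErrors.size());
    return lambdas[indexOfLowestError(validationErrors)];
}
```

The choice is made on the validation set rather than the test set, so the test accuracy reported afterwards stays an unbiased estimate.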
I also visualized the first layer (200 pictures in a 10 by 20 arrangement):
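The tiling behind such a picture can be sketched as follows. This is a minimal sketch, not the project's `saveFirstLayerVisualization`: it assumes each neuron's weights are stored row-major without the bias term, and that neurons fill the grid row by row. The `Grid` alias and the function `tileNeuronWeights` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<float>>;

// Hypothetical helper: lay out each neuron's weights as a picHeight x picWidth
// tile inside one large image of gridRows x gridCols tiles.
// Assumption: weights[n] holds neuron n's pixels in row-major order, bias excluded.
Grid tileNeuronWeights(const Grid& weights, std::size_t gridRows, std::size_t gridCols,
                       std::size_t picHeight, std::size_t picWidth) {
    assert(weights.size() == gridRows * gridCols);
    Grid image(gridRows * picHeight, std::vector<float>(gridCols * picWidth, 0.0f));
    for (std::size_t n = 0; n < weights.size(); ++n) {
        assert(weights[n].size() == picHeight * picWidth);
        const std::size_t tileRow = n / gridCols;  // which row of tiles
        const std::size_t tileCol = n % gridCols;  // which column of tiles
        for (std::size_t r = 0; r < picHeight; ++r) {
            for (std::size_t c = 0; c < picWidth; ++c) {
                image[tileRow * picHeight + r][tileCol * picWidth + c] =
                    weights[n][r * picWidth + c];
            }
        }
    }
    return image;
}
```

With 200 neurons in a 10 by 20 grid of 28 by 28 tiles, the resulting image has 280 rows and 560 columns.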
Usage for other data should be similar. However, this project is not intended to be used as a deep learning library: it lacks the optimizations and parallelization of the big frameworks (Caffe, Theano, TensorFlow, etc.), so it will run much slower than they do. It is still a nice learning project for neural networks.
This project is based on the material in the Stanford Coursera Machine Learning course: https://www.coursera.org/learn/machine-learning
I would like to express my thanks to my online teacher Andrew Ng, who made all this possible.