multilayerXOR

A Neural Network that emulates logical gates such as the XOR or the CSWAP


Multilayer Perceptron

A couple of years ago I had to come up with a project for a course and decided to code a Neural Network in C++.

This repository is a slightly brushed-up version of that project. It makes use of the original Neural Network class written for the course (header/NeuralNetwork.h), and I have added Python scripts for conveniently generating data and plotting errors.

The pipeline

  • creates training and testing samples of the XOR gate or the CSWAP gate (a 3-bit gate)
  • fits the model to the training data and evaluates the performance on the test set
  • creates a plot of the error measure

Logical Gates | Neural Network | Pipeline

Logical gates

The Neural Network (header/NeuralNetwork.h) can be initialized with different numbers of neurons in each layer. This invites experimentation with more exotic types of gates - but first:

XOR

Logical gates such as AND and OR can typically be modeled by a single perceptron. The exclusive OR (XOR) is an exception, as it is not linearly separable; a Neural Network requires a hidden layer to emulate it.

The XOR returns True if the inputs are distinct and False otherwise:

| Input 1 | Input 2 | Output |
|---------|---------|--------|
| 0       | 0       | 0      |
| 0       | 1       | 1      |
| 1       | 0       | 1      |
| 1       | 1       | 0      |
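
To make the non-separability claim concrete: a single perceptron thresholds a weighted sum, and the four rows of the table impose contradictory constraints on any weights w_1, w_2 and bias b:

    \begin{aligned}
    (0,0) \mapsto 0: &\quad b < 0 \\
    (0,1) \mapsto 1: &\quad w_2 + b > 0 \\
    (1,0) \mapsto 1: &\quad w_1 + b > 0 \\
    (1,1) \mapsto 0: &\quad w_1 + w_2 + b < 0
    \end{aligned}

Adding the two middle inequalities gives w_1 + w_2 + 2b > 0, i.e. w_1 + w_2 + b > -b > 0, which contradicts the last line. Hence the hidden layer is necessary.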

CSWAP

The Controlled SWAP (CSWAP) gate, or Fredkin gate (wikipedia), has 3 input and 3 output bits and can be used in Quantum Computing. It works as follows:

If the C-bit is non-zero, Input 1 and Input 2 are swapped. This is conveyed in the following table:

| C | Input 1 | Input 2 | C | Output 1 | Output 2 |
|---|---------|---------|---|----------|----------|
| 0 | 0       | 0       | 0 | 0        | 0        |
| 0 | 0       | 1       | 0 | 0        | 1        |
| 0 | 1       | 0       | 0 | 1        | 0        |
| 0 | 1       | 1       | 0 | 1        | 1        |
| 1 | 0       | 0       | 1 | 0        | 0        |
| 1 | 0       | 1       | 1 | 1        | 0        |
| 1 | 1       | 0       | 1 | 0        | 1        |
| 1 | 1       | 1       | 1 | 1        | 1        |
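
The gate itself reduces to a single conditional swap. A minimal C++ sketch (the function name is mine, not taken from the repository):

#include <utility>  // for std::swap

// CSWAP / Fredkin gate: if the control bit c is set,
// exchange the two target bits a and b.
void cswap(int c, int &a, int &b) {
    if (c != 0)
        std::swap(a, b);
}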

Neural Network

The Neural Network has 3 layers: an input, an output, and a hidden layer. The number of neurons per layer is variable, and the number of input/output neurons will be inferred from the data provided.

For the XOR gate, the network will have 2 inputs and 1 output. In the case of the CSWAP gate the network will have 3 inputs and 3 outputs. The number of hidden neurons is 2 by default and should be adjusted upwards when training the model on CSWAP data.
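
Assuming the class is constructed from a list of layer sizes (a common pattern for such classes; the actual constructor in header/NeuralNetwork.h may differ), the two configurations could be set up like this:

#include <vector>
#include "header/NeuralNetwork.h"

int main() {
    // Hypothetical constructor taking {inputs, hidden, outputs};
    // the real signature may differ.
    std::vector<unsigned> xorTopology   = {2, 2, 1};  // XOR: 2 in, 1 out
    std::vector<unsigned> cswapTopology = {3, 6, 3};  // CSWAP: more hidden neurons

    NeuralNetwork xorNet(xorTopology);
    NeuralNetwork cswapNet(cswapTopology);
    return 0;
}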

The hyperbolic tangent, tanh(x), is used as the activation function.
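
In code, the activation and the derivative needed for backpropagation might look as follows; expressing the derivative through the neuron's output y = tanh(x) is a common shortcut (whether the repository does exactly this is an assumption):

#include <cmath>

// Activation: squashes the weighted input sum into (-1, 1).
double activate(double x) { return std::tanh(x); }

// Derivative in terms of the output y = tanh(x):
// d/dx tanh(x) = 1 - tanh(x)^2 = 1 - y*y.
double activateDerivative(double y) { return 1.0 - y * y; }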

Loss: the Root Mean Square (RMS) error is used:

    E = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (t_i - o_i)^2}

where the index i runs over all N neurons of the output layer (for the XOR that is only one), t_i denotes the target value and o_i the actual output.
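
A direct transcription of this loss (variable names are illustrative):

#include <cmath>
#include <vector>

// RMS error between the network's outputs and the target values.
double rmsError(const std::vector<double> &outputs,
                const std::vector<double> &targets) {
    double sum = 0.0;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        const double delta = targets[i] - outputs[i];
        sum += delta * delta;
    }
    return std::sqrt(sum / outputs.size());
}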

At every step a smoothed error measurement, the recent average error (RAE), is printed. Recursively, it can be defined like this:

    RAE_n = \frac{\xi \cdot RAE_{n-1} + E_n}{\xi + 1}

with the smoothing factor \xi and the current loss E_n. The intuition here is that the smoothing factor adjusts how much a single loss value contributes to the RAE; unrolling the recursion shows every past loss entering with a geometrically decaying weight:

    RAE_n = \left(\frac{\xi}{\xi + 1}\right)^{n} RAE_0 + \frac{1}{\xi + 1} \sum_{k=1}^{n} \left(\frac{\xi}{\xi + 1}\right)^{n-k} E_k

By default the training runs for 30 epochs, but the loop is broken as soon as the RAE drops below a certain threshold (0.0001).
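
The RAE update and the early-stopping check together take only a few lines. A self-contained sketch (the smoothing factor value and the stand-in loss sequence are assumptions for illustration):

#include <cstdio>
#include <vector>

int main() {
    const double xi = 100.0;         // smoothing factor (assumed value)
    const double threshold = 0.0001; // early-stopping threshold
    double rae = 1.0;                // recent average error, initialized high

    // Stand-in for the per-sample RMS losses seen during training.
    std::vector<double> losses(250, 0.00005);

    bool converged = false;
    for (int epoch = 0; epoch < 30 && !converged; ++epoch) {
        for (double e : losses) {
            rae = (xi * rae + e) / (xi + 1.0);  // recursive RAE update
            if (rae < threshold) { converged = true; break; }
        }
    }
    std::printf("final RAE: %g, converged: %s\n", rae, converged ? "yes" : "no");
    return 0;
}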

Pipeline

Generating data

g++ -o data_gen data_gen_XOR.cpp
./data_gen

or

g++ -o data_gen data_gen_CSWAP.cpp
./data_gen

By default this creates 250 training samples and 100 test samples.
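
For orientation, a self-contained sketch of what such a generator could look like; the output file names and line format are illustrative, not read from data_gen_XOR.cpp:

#include <cstdlib>
#include <ctime>
#include <fstream>

// Write n random XOR samples, one input line and one target line each.
// The file format here is an assumption; the real generator may differ.
static void writeSamples(const char *path, int n) {
    std::ofstream out(path);
    for (int i = 0; i < n; ++i) {
        const int a = std::rand() % 2;
        const int b = std::rand() % 2;
        out << "in: "  << a << " " << b << "\n"
            << "out: " << (a ^ b) << "\n";
    }
}

int main() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    writeSamples("training_data.txt", 250);  // default training set size
    writeSamples("test_data.txt", 100);      // default test set size
    return 0;
}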

Training

g++ -o training training.cpp
./training

The default settings are

  • learning rate (eta) = 0.08
  • momentum (alpha) = 0.1

Note: When training on the CSWAP data, the Neural Network will automatically be initialized with 3 input and 3 output neurons; however, 2 hidden neurons appear to be insufficient to train the network!
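
For context, eta and alpha enter the standard backpropagation weight update with momentum in the textbook form below (the exact notation used inside header/NeuralNetwork.h is an assumption):

    \Delta w_{ij}^{(n)} = \eta \, \delta_j \, o_i + \alpha \, \Delta w_{ij}^{(n-1)}

where o_i is the output of the sending neuron, \delta_j the backpropagated error gradient at the receiving neuron, and the \alpha-term re-applies a fraction of the previous update to damp oscillations.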

Plotting the error curve

The plotting script utilizes numpy and matplotlib:

pip install numpy
pip install matplotlib

then run

python plot.py 

to obtain a plot like this:

[Image: plot of the RAE over the training run]